Hey guys,
I have a node running on an old, slow disk, so slow that the process sometimes seems to freeze.
The container is still running, but ports 14002 and 28967 no longer respond.
I remembered that Kubernetes has a TCP health check (TCP liveness probe), so I tried to do the same by adding these parameters to the docker run command:
--health-cmd="bash -c 'exec 3<>/dev/tcp/localhost/14002 || exit 1'" \
--health-interval=1m \
--health-retries=5 \
--health-timeout=10s \
--health-start-period=30m \
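In case it helps to reproduce, here is the same probe as a small standalone function. It is only a sketch: the name tcp_check is made up, the 5-second limit is an arbitrary choice, and it assumes bash and coreutils timeout are available wherever it runs. The usage line uses 127.0.0.1 instead of localhost only to take name resolution out of the picture, not because I know it matters here.

```shell
# Hypothetical helper (not part of Docker or the storagenode image).
# Exits 0 if a TCP connection to host:port can be opened, non-zero otherwise.
tcp_check() {
  # $1 = host, $2 = port
  # /dev/tcp is a bash feature, so the probe is run through bash explicitly.
  # `timeout` bounds the attempt so a frozen process cannot hang the check.
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example usage (commented out so that sourcing this has no side effects):
# tcp_check 127.0.0.1 14002 && echo "port open" || echo "port closed"
```

The point of wrapping it is that the exact same function can be run by hand inside the container and then compared with what Docker records.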
For some reason I don’t understand, this is what I get when I run the check by hand inside the container:
/home/user# docker exec -it storagenode bash
root@fc4a9ca0761e:/app# bash -c 'exec 3<>/dev/tcp/localhost/14002 || exit 1'
bash: connect: Connection refused
bash: /dev/tcp/localhost/14002: Connection refused
root@fc4a9ca0761e:/app# echo $?
1
So the command exits with an error code, BUT!
/home/user# docker inspect --format='{{json .State.Health}}' storagenode
{
"Status": "healthy",
"FailingStreak": 0,
"Log": [
{
"Start": "2025-10-01T18:11:54.154565154Z",
"End": "2025-10-01T18:11:55.158906829Z",
"ExitCode": 0,
"Output": ""
},
{
"Start": "2025-10-01T18:12:55.160430966Z",
"End": "2025-10-01T18:12:55.574776418Z",
"ExitCode": 0,
"Output": ""
},
{
"Start": "2025-10-01T18:13:55.577041455Z",
"End": "2025-10-01T18:13:55.639423661Z",
"ExitCode": 0,
"Output": ""
},
{
"Start": "2025-10-01T18:14:55.640117067Z",
"End": "2025-10-01T18:14:55.698302942Z",
"ExitCode": 0,
"Output": ""
},
{
"Start": "2025-10-01T18:15:55.699620444Z",
"End": "2025-10-01T18:15:55.749854431Z",
"ExitCode": 0,
"Output": ""
}
]
}
Docker is like “yeah, everything is alright”
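One thing that can be compared against the log above is the health check Docker actually stored for the container, since that is the command whose exit code ends up in .State.Health. A minimal sketch, assuming the container is named storagenode as above. show_healthcheck is just a made-up wrapper name; .Config.Healthcheck is the field docker inspect reports the configured test under.

```shell
# Hypothetical wrapper: print the health check configuration stored for a container.
# $1 = container name or ID
show_healthcheck() {
  # .Config.Healthcheck holds the Test command, Interval, Timeout, StartPeriod and Retries
  docker inspect --format '{{json .Config.Healthcheck}}' "$1"
}

# Example usage (commented out so that sourcing this has no side effects):
# show_healthcheck storagenode
```

If the Test entry printed there differs from the command run by hand, the two results are not measuring the same thing.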
Do you have a better idea?
Or a better health check?
I already have a script that restarts unhealthy nodes; I just need the node to actually go into that state.
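For context, a restart script of this kind could look roughly like the sketch below. It is not necessarily the exact script I run: restart_unhealthy is a made-up name, and it simply relies on docker ps being able to filter on health=unhealthy.

```shell
# Hypothetical sketch: restart every container Docker currently reports as unhealthy.
restart_unhealthy() {
  # List the names of unhealthy containers, one per line, and restart each in turn.
  docker ps --filter health=unhealthy --format '{{.Names}}' |
    while IFS= read -r name; do
      if [ -n "$name" ]; then
        docker restart "$name"
      fi
    done
}

# Example usage, e.g. from cron (commented out so that sourcing this has no side effects):
# restart_unhealthy
```

This only does anything once the health check really flips the container to unhealthy, which is the part that is not happening.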
Edit: I tried the same health check with an Apache container and it works properly there. I don’t get what’s wrong here.