Nodes did not restart automatically after probable power outage

Hi all! :smiley:

This morning, my uptimerobot was complaining a lot, as most of my nodes (but not all of them) stopped working during the night on my Raspberry Pi 4B (Raspbian 64).

Here is what I checked on one of them:

Latest log entries before the incident (all seems in order):

2022-10-17T01:46:44.155Z	INFO	piecestore	download started	{"Process": "storagenode", "Piece ID": "GGEEMEBDN2PD2OQFAPD3PZHB6OOIAVNVZJPRJ3BMEJCEMSN5FPBQ", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "GET"}
2022-10-17T01:46:44.352Z	INFO	piecestore	downloaded	{"Process": "storagenode", "Piece ID": "GGEEMEBDN2PD2OQFAPD3PZHB6OOIAVNVZJPRJ3BMEJCEMSN5FPBQ", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "GET"}
2022-10-17T01:46:45.571Z	INFO	piecestore	download started	{"Process": "storagenode", "Piece ID": "GGEEMEBDN2PD2OQFAPD3PZHB6OOIAVNVZJPRJ3BMEJCEMSN5FPBQ", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "GET"}
2022-10-17T01:46:45.711Z	INFO	piecestore	downloaded	{"Process": "storagenode", "Piece ID": "GGEEMEBDN2PD2OQFAPD3PZHB6OOIAVNVZJPRJ3BMEJCEMSN5FPBQ", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "GET"}
2022-10-17T01:46:46.269Z	INFO	piecestore	download started	{"Process": "storagenode", "Piece ID": "4CG44XNPVQ4JPAFC7JBS2F2UZXGU3GWKYSV2UU3RNO44OWFJJWKA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET"}
2022-10-17T01:46:47.002Z	INFO	piecestore	downloaded	{"Process": "storagenode", "Piece ID": "4CG44XNPVQ4JPAFC7JBS2F2UZXGU3GWKYSV2UU3RNO44OWFJJWKA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET"}
2022-10-17T01:46:47.595Z	INFO	piecestore	upload started	{"Process": "storagenode", "Piece ID": "I2QS2I4GMIHUA6JHPES6NDYRFWO62T6UMIFGPJ7MRPPGGRPNE35A", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "PUT", "Available Space": 1094477440256}
2022-10-17T01:46:49.811Z	INFO	piecestore	download started	{"Process": "storagenode", "Piece ID": "4CG44XNPVQ4JPAFC7JBS2F2UZXGU3GWKYSV2UU3RNO44OWFJJWKA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET"}
2022-10-17T01:46:50.644Z	INFO	piecestore	uploaded	{"Process": "storagenode", "Piece ID": "FG2ME4KBNHQR7GIMQVNBAO2F46CHYGUB6Q3BE7PEII5L7FAHJT6Q", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "PUT", "Size": 73728}
2022-10-17T01:46:50.708Z	INFO	piecestore	uploaded	{"Process": "storagenode", "Piece ID": "I2QS2I4GMIHUA6JHPES6NDYRFWO62T6UMIFGPJ7MRPPGGRPNE35A", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "PUT", "Size": 2174464}

Latest docker log entries before the incident (all seems in order):

2022-10-17T01:22:50.389Z	INFO	Downloading versions.	{"Process": "storagenode-updater", "Server Address": "https://version.storj.io"}
2022-10-17T01:22:50.866Z	INFO	Current binary version	{"Process": "storagenode-updater", "Service": "storagenode", "Version": "v1.64.1"}
2022-10-17T01:22:50.866Z	INFO	New version is being rolled out but hasn't made it to this node yet	{"Process": "storagenode-updater", "Service": "storagenode"}
2022-10-17T01:22:50.927Z	INFO	Current binary version	{"Process": "storagenode-updater", "Service": "storagenode-updater", "Version": "v1.64.1"}
2022-10-17T01:22:50.928Z	INFO	New version is being rolled out but hasn't made it to this node yet	{"Process": "storagenode-updater", "Service": "storagenode-updater"}
2022-10-17T01:37:50.388Z	INFO	Downloading versions.	{"Process": "storagenode-updater", "Server Address": "https://version.storj.io"}
2022-10-17T01:37:50.901Z	INFO	Current binary version	{"Process": "storagenode-updater", "Service": "storagenode", "Version": "v1.64.1"}
2022-10-17T01:37:50.901Z	INFO	New version is being rolled out but hasn't made it to this node yet	{"Process": "storagenode-updater", "Service": "storagenode"}
2022-10-17T01:37:50.993Z	INFO	Current binary version	{"Process": "storagenode-updater", "Service": "storagenode-updater", "Version": "v1.64.1"}
2022-10-17T01:37:50.993Z	INFO	New version is being rolled out but hasn't made it to this node yet	{"Process": "storagenode-updater", "Service": "storagenode-updater"}

I see no entries when I run grep -i "killed" /var/log/syslog or grep -i "killed" /var/log/messages
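
(For anyone else troubleshooting something similar: an equivalent check against the kernel journal, in case the OOM killer was involved, would be something along these lines; adjust the date to your own incident.)

journalctl -k --since "2022-10-17 00:00:00" | grep -i -E "out of memory|oom|killed process"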

I also noticed my Raspberry Pi rebooted, as top shows it's been up for 3:46, so that's weird…

So maybe the system rebooted abruptly because of a power cut. I have no way to be sure, but that's more than probable because my ISP's box did reboot around the same time too.


So my best guess right now is a power cut, and that some nodes did not get restarted automatically even though they're all configured with --restart unless-stopped, which is suspicious…
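
In case it's useful, the restart policy of a given container can be double-checked with something along these lines (using one of my container names as an example):

docker inspect -f '{{ .HostConfig.RestartPolicy.Name }}' storj_node_1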

Anyway, I restarted all of them and everything seems back to normal for now. Anything else I could check to better understand why docker did not restart some of my nodes?
Cheers :slight_smile:

PS: I'll take this opportunity to remind newcomers that they should use a service like uptimerobot (or similar) to detect such incidents quickly; that's really a must-have ^^

Please make sure that the docker service is enabled to autostart. I noticed that on Ubuntu the docker service is not enabled by default; the daemon only starts when you invoke a docker command.

sudo systemctl status docker

If it's not enabled, you need to enable it:

sudo systemctl enable docker
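
For a quick yes/no answer you can also run:

systemctl is-enabled docker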
2 Likes

Thanks for the suggestion @Alexey :slight_smile:

Seems like it is running:

pac@raspberrypi:~ $ sudo systemctl status docker
● docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2022-10-17 04:23:33 CEST; 8h ago
TriggeredBy: ● docker.socket
       Docs: https://docs.docker.com
   Main PID: 482 (dockerd)
      Tasks: 200
        CPU: 34.314s
     CGroup: /system.slice/docker.service
             ├─ 482 /usr/sbin/dockerd -H fd://
             ├─4025 /usr/sbin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 1167 -container-ip 172.17.0.4 -container-port 28967
             ├─4038 /usr/sbin/docker-proxy -proto udp -host-ip 0.0.0.0 -host-port 1167 -container-ip 172.17.0.4 -container-port 28967
[...]
             ├─6899 /usr/sbin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 1867 -container-ip 172.17.0.10 -container-port 28967
             └─6912 /usr/sbin/docker-proxy -proto udp -host-ip 0.0.0.0 -host-port 1867 -container-ip 172.17.0.10 -container-port 28967

Oct 17 04:23:33 raspberrypi systemd[1]: Started Docker Application Container Engine.
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.965510605+02:00" level=info msg="API listen on /run/docker.sock"
Oct 17 04:23:46 raspberrypi dockerd[482]: time="2022-10-17T04:23:46.713559431+02:00" level=info msg="ignoring event" container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 module=libcontainerd namespace=moby topic=/t>
Oct 17 04:23:47 raspberrypi dockerd[482]: time="2022-10-17T04:23:47.151514597+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/t>
[...]
Oct 17 04:24:25 raspberrypi dockerd[482]: time="2022-10-17T04:24:25.083825389+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/t>
Oct 17 04:24:25 raspberrypi dockerd[482]: time="2022-10-17T04:24:25.797009907+02:00" level=info msg="ignoring event" container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 module=libcontainerd namespace=moby topic=/t>

Besides, if it wasn't, I guess none of my nodes would have restarted, whereas in my case 2 of them did restart automatically :thinking:

That's what I wanted to see. So your docker service is enabled, and it should start.

You may check docker events

docker events --since 2022-10-17T01:37:50.993Z | head -n 30

or check journals

journalctl --since 2022-10-17T01:37:50.993Z | grep dockerd
1 Like
pac@raspberrypi:~ $ sudo docker events --since 2022-10-17T01:37:50.993Z | head -n 30
2022-10-17T04:23:29.854089533+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=30751f1ad8e2d2d3a25f8761ec10c4dc1d891bec2bb4c689a23f405b8f401dc0, name=bridge, type=bridge)
2022-10-17T04:23:29.872125441+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711, name=bridge, type=bridge)
2022-10-17T04:23:30.306856737+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=bef94a31c99296c4c016b9b4f02a55e6e4b0910b8d95b82e3fbde167131627c9, name=bridge, type=bridge)
2022-10-17T04:23:31.343285051+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=cd38de9c32e424b9cee96440877af8b7297d196232bb0daf3f6ca205889bbaf1, name=bridge, type=bridge)
2022-10-17T04:23:31.425450662+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=186a2e99d7368261a613dd29e886a5039e4920ddb893cf85a1d3cd523e79e5cf, name=bridge, type=bridge)
2022-10-17T04:23:31.577251162+02:00 container start 30751f1ad8e2d2d3a25f8761ec10c4dc1d891bec2bb4c689a23f405b8f401dc0 (image=storjlabs/watchtower, io.storj.watchtower=true, name=storj_watchtower)
2022-10-17T04:23:31.668610069+02:00 container start 2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 (image=storjlabs/storagenode:latest, name=storj_node_2)
2022-10-17T04:23:31.832168069+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=205848e0b86b10944283ad4b266f956b57baca62b626ff6360e85099efb93b80, name=bridge, type=bridge)
2022-10-17T04:23:31.991967032+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=3bee8582c5b4c82c509f662796e55e4e12f028a0f2a9e821dc4d80ce09840a89, name=bridge, type=bridge)
2022-10-17T04:23:32.045416087+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735, name=bridge, type=bridge)
2022-10-17T04:23:32.135109513+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=2e329e981ec6fc911e4eb4f455a7f1cfb98516d3391082c11b87bb5a49b383d2, name=bridge, type=bridge)
2022-10-17T04:23:32.512445161+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=bef94a31c99296c4c016b9b4f02a55e6e4b0910b8d95b82e3fbde167131627c9, name=bridge, type=bridge)
2022-10-17T04:23:33.039568772+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=186a2e99d7368261a613dd29e886a5039e4920ddb893cf85a1d3cd523e79e5cf, name=bridge, type=bridge)
2022-10-17T04:23:33.182106383+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=cd38de9c32e424b9cee96440877af8b7297d196232bb0daf3f6ca205889bbaf1, name=bridge, type=bridge)
2022-10-17T04:23:33.551675957+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=205848e0b86b10944283ad4b266f956b57baca62b626ff6360e85099efb93b80, name=bridge, type=bridge)
2022-10-17T04:23:33.577883105+02:00 container start 53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 (image=storjlabs/storagenode:latest, name=storj_node_1)
2022-10-17T04:23:33.741195068+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=3bee8582c5b4c82c509f662796e55e4e12f028a0f2a9e821dc4d80ce09840a89, name=bridge, type=bridge)
2022-10-17T04:23:33.809243457+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=2e329e981ec6fc911e4eb4f455a7f1cfb98516d3391082c11b87bb5a49b383d2, name=bridge, type=bridge)
2022-10-17T04:23:46.714923875+02:00 container die 2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 (exitCode=0, image=storjlabs/storagenode:latest, name=storj_node_2)
2022-10-17T04:23:46.950395671+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711, name=bridge, type=bridge)
2022-10-17T04:23:47.153467023+02:00 container die 53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 (exitCode=0, image=storjlabs/storagenode:latest, name=storj_node_1)
2022-10-17T04:23:47.231422615+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711, name=bridge, type=bridge)
2022-10-17T04:23:47.493054615+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735, name=bridge, type=bridge)
2022-10-17T04:23:47.800732282+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735, name=bridge, type=bridge)
2022-10-17T04:23:48.009217411+02:00 container start 2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 (image=storjlabs/storagenode:latest, name=storj_node_2)
2022-10-17T04:23:48.539545892+02:00 container start 53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 (image=storjlabs/storagenode:latest, name=storj_node_1)
2022-10-17T04:23:59.505098516+02:00 container die 2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 (exitCode=0, image=storjlabs/storagenode:latest, name=storj_node_2)
2022-10-17T04:23:59.739179219+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711, name=bridge, type=bridge)
2022-10-17T04:23:59.979080145+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711, name=bridge, type=bridge)
2022-10-17T04:24:00.143441441+02:00 container die 53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 (exitCode=0, image=storjlabs/storagenode:latest, name=storj_node_1)

journalctl returns an awful lot of lines when running journalctl --since 2022-10-17 | grep dockerd, so I've shortened similar series of lines with [...]:

pac@raspberrypi:~ $ journalctl --since 2022-10-17 | grep dockerd
Oct 17 00:00:18 raspberrypi dockerd[1653]: time="2022-10-17T00:00:18.739895704+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 17 00:01:11 raspberrypi dockerd[1653]: time="2022-10-17T00:01:11.672142338+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
[...]
Oct 17 03:27:44 raspberrypi dockerd[1653]: time="2022-10-17T03:27:43.997421023+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 17 03:28:14 raspberrypi dockerd[1653]: time="2022-10-17T03:28:14.942628445+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.034172165+02:00" level=info msg="Starting up"
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.055609147+02:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.055739702+02:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.055830998+02:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.055869776+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.072909554+02:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.072987850+02:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.073068610+02:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.073119480+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.126779998+02:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.360718795+02:00" level=warning msg="Unable to find memory controller"
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.361460980+02:00" level=info msg="Loading containers: start."
Oct 17 03:28:22 raspberrypi dockerd[484]: time="2022-10-17T03:28:22.834300404+02:00" level=info msg="Removing stale sandbox 1f3c9d1cc1606d1c8159a29571a968380bb72b58acf9ff191a4e9cf960b067f0 (30751f1ad8e2d2d3a25f8761ec10c4dc1d891bec2bb4c689a23f405b8f401dc0)"
Oct 17 03:28:22 raspberrypi dockerd[484]: time="2022-10-17T03:28:22.861482052+02:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint 47bfd76720eb12a46ab188e7fba66d422221734c517f240dda6c0d8af91f2054 9ad37aac9b055d5a17a19d0e9ca6388c2b96238f5fba3b55ee58d8d42a2d8362], retrying...."
Oct 17 03:28:23 raspberrypi dockerd[484]: time="2022-10-17T03:28:23.376477292+02:00" level=info msg="Removing stale sandbox 3a008e410af3c43f3c364d254301fb28d9aa800b485e67236e3013f23bd74307 (186a2e99d7368261a613dd29e886a5039e4920ddb893cf85a1d3cd523e79e5cf)"
Oct 17 03:28:23 raspberrypi dockerd[484]: time="2022-10-17T03:28:23.526962366+02:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint 47bfd76720eb12a46ab188e7fba66d422221734c517f240dda6c0d8af91f2054 b43d111298b4b212742ef226f9be2cba6065454d00a38137396f023c4fbec38b], retrying...."
[...]
Oct 17 03:28:26 raspberrypi dockerd[484]: time="2022-10-17T03:28:26.964596216+02:00" level=info msg="Removing stale sandbox 71880c7ebc95053c27b438b4ba4d9db1ba0cd04a3aa2773c76c03d4eed7c272f (2e329e981ec6fc911e4eb4f455a7f1cfb98516d3391082c11b87bb5a49b383d2)"
Oct 17 03:28:27 raspberrypi dockerd[484]: time="2022-10-17T03:28:27.086973142+02:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint 47bfd76720eb12a46ab188e7fba66d422221734c517f240dda6c0d8af91f2054 c42973fb4f057698d0c984ac1536caea0b63f919ef75198e0ad56e56975aaffe], retrying...."
Oct 17 03:28:27 raspberrypi dockerd[484]: time="2022-10-17T03:28:27.597579919+02:00" level=info msg="Removing stale sandbox 810b288b9b491fe2a625525c1756dd309d38bd7afd276cb0f1e5b9b76c6b088d (2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711)"
Oct 17 03:28:27 raspberrypi dockerd[484]: time="2022-10-17T03:28:27.727215753+02:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint 47bfd76720eb12a46ab188e7fba66d422221734c517f240dda6c0d8af91f2054 6a83a2a2d8fa332840159ffbe5aeabf386918048950db21d473470592555b0ae], retrying...."
Oct 17 03:28:27 raspberrypi dockerd[484]: time="2022-10-17T03:28:27.879209641+02:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Oct 17 03:28:29 raspberrypi dockerd[484]: time="2022-10-17T03:28:29.226079103+02:00" level=error msg="stream copy error: reading from a closed fifo"
Oct 17 03:28:29 raspberrypi dockerd[484]: time="2022-10-17T03:28:29.236504437+02:00" level=error msg="stream copy error: reading from a closed fifo"
[...]
Oct 17 03:28:31 raspberrypi dockerd[484]: time="2022-10-17T03:28:31.019944214+02:00" level=error msg="stream copy error: reading from a closed fifo"
Oct 17 03:28:31 raspberrypi dockerd[484]: time="2022-10-17T03:28:31.037533028+02:00" level=error msg="stream copy error: reading from a closed fifo"
Oct 17 03:28:31 raspberrypi dockerd[484]: time="2022-10-17T03:28:31.062835380+02:00" level=error msg="186a2e99d7368261a613dd29e886a5039e4920ddb893cf85a1d3cd523e79e5cf cleanup: failed to delete container from containerd: no such container"
Oct 17 03:28:31 raspberrypi dockerd[484]: time="2022-10-17T03:28:31.062919065+02:00" level=error msg="failed to start container" container=186a2e99d7368261a613dd29e886a5039e4920ddb893cf85a1d3cd523e79e5cf error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_3/storj_node_3\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_3/storj_node_3: no such file or directory: unknown"
Oct 17 03:28:31 raspberrypi dockerd[484]: time="2022-10-17T03:28:31.273435510+02:00" level=error msg="2e329e981ec6fc911e4eb4f455a7f1cfb98516d3391082c11b87bb5a49b383d2 cleanup: failed to delete container from containerd: no such container"
Oct 17 03:28:31 raspberrypi dockerd[484]: time="2022-10-17T03:28:31.273525843+02:00" level=error msg="failed to start container" container=2e329e981ec6fc911e4eb4f455a7f1cfb98516d3391082c11b87bb5a49b383d2 error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_3/storj_node_5\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_3/storj_node_5: no such file or directory: unknown"
[...]
Oct 17 03:28:31 raspberrypi dockerd[484]: time="2022-10-17T03:28:31.983878806+02:00" level=error msg="cd38de9c32e424b9cee96440877af8b7297d196232bb0daf3f6ca205889bbaf1 cleanup: failed to delete container from containerd: no such container"
Oct 17 03:28:31 raspberrypi dockerd[484]: time="2022-10-17T03:28:31.983959454+02:00" level=error msg="failed to start container" container=cd38de9c32e424b9cee96440877af8b7297d196232bb0daf3f6ca205889bbaf1 error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_4/storj_node_8\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_4/storj_node_8: no such file or directory: unknown"
Oct 17 03:28:32 raspberrypi dockerd[484]: time="2022-10-17T03:28:32.020281713+02:00" level=error msg="53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 cleanup: failed to delete container from containerd: no such container"
Oct 17 03:28:32 raspberrypi dockerd[484]: time="2022-10-17T03:28:32.020417694+02:00" level=error msg="failed to start container" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_1/storj_node_1\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_1/storj_node_1: no such file or directory: unknown"
Oct 17 03:28:32 raspberrypi dockerd[484]: time="2022-10-17T03:28:32.020530694+02:00" level=info msg="Loading containers: done."
Oct 17 03:28:32 raspberrypi dockerd[484]: time="2022-10-17T03:28:32.093915102+02:00" level=info msg="Docker daemon" commit=363e9a8 graphdriver(s)=overlay2 version=20.10.5+dfsg1
Oct 17 03:28:32 raspberrypi dockerd[484]: time="2022-10-17T03:28:32.096463028+02:00" level=info msg="Daemon has completed initialization"
Oct 17 03:28:32 raspberrypi dockerd[484]: time="2022-10-17T03:28:32.174269620+02:00" level=info msg="API listen on /run/docker.sock"
Oct 17 03:28:44 raspberrypi dockerd[484]: time="2022-10-17T03:28:44.171226872+02:00" level=info msg="ignoring event" container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 17 03:28:45 raspberrypi dockerd[1653]: time="2022-10-17T03:28:45.635255181+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
[...]
Oct 17 03:45:20 raspberrypi dockerd[1653]: time="2022-10-17T03:45:20.820508110+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 17 03:46:11 raspberrypi dockerd[1653]: time="2022-10-17T03:46:11.036274564+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.032939850+02:00" level=info msg="Starting up"
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.046793424+02:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.046891350+02:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.046979202+02:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.047030424+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.057833517+02:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.057962924+02:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.058065480+02:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.058123258+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.138001128+02:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.345418943+02:00" level=warning msg="Unable to find memory controller"
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.346039239+02:00" level=info msg="Loading containers: start."
Oct 17 04:23:28 raspberrypi dockerd[482]: time="2022-10-17T04:23:28.459194423+02:00" level=info msg="Removing stale sandbox 3207d5d8bc853e20dec70c61ce3765e0417616f7cc1840472b70a2f6bd0ad9e4 (30751f1ad8e2d2d3a25f8761ec10c4dc1d891bec2bb4c689a23f405b8f401dc0)"
Oct 17 04:23:28 raspberrypi dockerd[482]: time="2022-10-17T04:23:28.490711201+02:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint a827ee56de00a6e5efe0018147d06a94c7c5550d5069b523cb80420bf5bcedde a6f51e474946570731eaa34744dfd6a7245f51ffe8c83376c8f675954b7c6d54], retrying...."
Oct 17 04:23:29 raspberrypi dockerd[482]: time="2022-10-17T04:23:29.009590274+02:00" level=info msg="Removing stale sandbox fb0e1ca547bbd60ffd3633fda12b19f1da44ea2b9cd4e664fa920772307b9339 (2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711)"
Oct 17 04:23:29 raspberrypi dockerd[482]: time="2022-10-17T04:23:29.121989885+02:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint a827ee56de00a6e5efe0018147d06a94c7c5550d5069b523cb80420bf5bcedde dc3409e5fcebb93d8f84f5c59dcd4fc77d115bacd02be3d0911a4912364c98e8], retrying...."
Oct 17 04:23:29 raspberrypi dockerd[482]: time="2022-10-17T04:23:29.257495311+02:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Oct 17 04:23:30 raspberrypi dockerd[482]: time="2022-10-17T04:23:30.808259625+02:00" level=error msg="stream copy error: reading from a closed fifo"
Oct 17 04:23:30 raspberrypi dockerd[482]: time="2022-10-17T04:23:30.811096959+02:00" level=error msg="stream copy error: reading from a closed fifo"
[...]
Oct 17 04:23:32 raspberrypi dockerd[482]: time="2022-10-17T04:23:32.545451735+02:00" level=error msg="stream copy error: reading from a closed fifo"
Oct 17 04:23:32 raspberrypi dockerd[482]: time="2022-10-17T04:23:32.546036735+02:00" level=error msg="stream copy error: reading from a closed fifo"
Oct 17 04:23:32 raspberrypi dockerd[482]: time="2022-10-17T04:23:32.602075161+02:00" level=error msg="bef94a31c99296c4c016b9b4f02a55e6e4b0910b8d95b82e3fbde167131627c9 cleanup: failed to delete container from containerd: no such container"
Oct 17 04:23:32 raspberrypi dockerd[482]: time="2022-10-17T04:23:32.602301957+02:00" level=error msg="failed to start container" container=bef94a31c99296c4c016b9b4f02a55e6e4b0910b8d95b82e3fbde167131627c9 error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_4/storj_node_4\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_4/storj_node_4: no such file or directory: unknown"
Oct 17 04:23:32 raspberrypi dockerd[482]: time="2022-10-17T04:23:32.619041828+02:00" level=error msg="stream copy error: reading from a closed fifo"
Oct 17 04:23:32 raspberrypi dockerd[482]: time="2022-10-17T04:23:32.621576957+02:00" level=error msg="stream copy error: reading from a closed fifo"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.064925846+02:00" level=error msg="186a2e99d7368261a613dd29e886a5039e4920ddb893cf85a1d3cd523e79e5cf cleanup: failed to delete container from containerd: no such container"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.065011142+02:00" level=error msg="failed to start container" container=186a2e99d7368261a613dd29e886a5039e4920ddb893cf85a1d3cd523e79e5cf error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_3/storj_node_3\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_3/storj_node_3: no such file or directory: unknown"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.231030698+02:00" level=error msg="cd38de9c32e424b9cee96440877af8b7297d196232bb0daf3f6ca205889bbaf1 cleanup: failed to delete container from containerd: no such container"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.231134050+02:00" level=error msg="failed to start container" container=cd38de9c32e424b9cee96440877af8b7297d196232bb0daf3f6ca205889bbaf1 error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_4/storj_node_8\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_4/storj_node_8: no such file or directory: unknown"
[...]
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.770762753+02:00" level=error msg="3bee8582c5b4c82c509f662796e55e4e12f028a0f2a9e821dc4d80ce09840a89 cleanup: failed to delete container from containerd: no such container"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.770890068+02:00" level=error msg="failed to start container" container=3bee8582c5b4c82c509f662796e55e4e12f028a0f2a9e821dc4d80ce09840a89 error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_4/storj_node_6\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_4/storj_node_6: no such file or directory: unknown"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.826761142+02:00" level=error msg="2e329e981ec6fc911e4eb4f455a7f1cfb98516d3391082c11b87bb5a49b383d2 cleanup: failed to delete container from containerd: no such container"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.826861790+02:00" level=error msg="failed to start container" container=2e329e981ec6fc911e4eb4f455a7f1cfb98516d3391082c11b87bb5a49b383d2 error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_3/storj_node_5\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_3/storj_node_5: no such file or directory: unknown"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.826966623+02:00" level=info msg="Loading containers: done."
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.884556660+02:00" level=info msg="Docker daemon" commit=363e9a8 graphdriver(s)=overlay2 version=20.10.5+dfsg1
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.885653105+02:00" level=info msg="Daemon has completed initialization"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.965510605+02:00" level=info msg="API listen on /run/docker.sock"
Oct 17 04:23:46 raspberrypi dockerd[482]: time="2022-10-17T04:23:46.713559431+02:00" level=info msg="ignoring event" container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 17 04:23:47 raspberrypi dockerd[482]: time="2022-10-17T04:23:47.151514597+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
[...]
Oct 17 04:24:25 raspberrypi dockerd[482]: time="2022-10-17T04:24:25.083825389+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 17 04:24:25 raspberrypi dockerd[482]: time="2022-10-17T04:24:25.797009907+02:00" level=info msg="ignoring event" container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"

It seems your disk was not ready when the docker daemon started.
How did you mount your disks?

1 Like

They're all declared in /etc/fstab (4 disks), all configured like so:

# STORJ
# Toshiba 2.5" 2TB (disk 1)
UUID=71e83004-16a0-4fb0-8057-f37fe3a34abc /home/pac/storj/mounts/disk_1 auto defaults,noatime,nofail 0 2
# HGST 2.5" 1TB (disk 2)
UUID=d0431e79-ecc0-48bc-873a-b22e74deba8c /home/pac/storj/mounts/disk_2 auto defaults,noatime,nofail 0 2
[...]

Could it be that after a sudden system restart some kind of disk checks got run on some of the disks, preventing them from being accessible until the check finished?

What's weird is that all nodes (which means all disks) did show some errors like the one you pointed out: "no such file or directory".
This suggests that disks 1 & 2 also took some time to come up, but were quick enough for later retries to succeed, whereas disks 3 & 4 took too long? :thinking:

(Indeed, nodes 1 & 2 [on disks 1 & 2] did restart correctly and automatically in the end, whereas all the other ones failed.)
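
If the root cause really is docker starting before the disks were mounted, one thing I'm considering is a systemd drop-in so that the docker service waits for these mount points before starting (untested sketch, the drop-in file name is mine):

# /etc/systemd/system/docker.service.d/wait-for-storj-mounts.conf
[Unit]
RequiresMountsFor=/home/pac/storj/mounts/disk_1 /home/pac/storj/mounts/disk_2 /home/pac/storj/mounts/disk_3 /home/pac/storj/mounts/disk_4

followed by a sudo systemctl daemon-reload so systemd picks up the change.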

Hmm… so… as it was getting hotter than usual, I opened my RPi's case to check whether its fan was still operating normally, and I'm not sure what I did, but I apparently shouldn't have touched it because that completely froze the poor thing.

It was no longer reachable via SSH even after a forced reboot, so I ended up connecting a screen to it to check what's up, and I don't like what I see:

The boot sequence is now stuck there, looping forever as if the USB controller was unable to find any disk?
Note: This RPi is configured to boot from USB (then SD card as a fallback - but no SD card is installed).

So maybe the strange issue I faced initially in this thread was just the first sign of a hardware failure? I'll have to investigate further this weekend, but I'm not optimistic, I must say ^^'

Please check the cables, USB hub and power supply.
Perhaps powering it off and disconnecting it from the grid for several minutes could be enough to resurrect it.

1 Like

Thanks again @Alexey for your support.
I tried turning it off for the whole night: it did not restart normally this morning. I also tried unplugging everything and plugging it all back in; that did not work either.
[image]


I ended up swapping my whole RPi 4B motherboard with one a friend of mine had lying around unused, and my setup came back from the dead immediately! It's now back online and working normally! (I just had to set up a static address again for this new RPi's MAC address so port forwarding works correctly.)

So I guess some part of my RPi's hardware kinda died! oO
It had been working non-stop for 3 years. I'm not sure whether that's a long time for an RPi, which is certainly not a piece of hardware meant to run 24/7, but computers usually last longer than that…

I'll carry out some more testing this weekend: reboot it with a fresh SD card and see what's going on, maybe update the firmware, who knows… But for now, it really seems like it's a hardware issue with my RPi.

3 Likes

Wow, it just died like that. How do you power your Pi? Adapter, PSU?

I had some issues in the past with USB connections (connectors) that sounded similar to yours. For other people ending up here: zip-tie your USB-SATA connectors. Even the slightest movement can cause connection loss, and it is only restored by a reboot or by physically unplugging and replugging the USB cable. Even better, buy some type of RAID enclosure.

Another time I didn't plug the LAN cable back in after maintenance…

Well, not exactly "just like that": I did open its case to check on the fan, because I was wondering whether it was still running as the Pi was getting unusually hot (near 70°C, where it usually stays below 60°C). So I guess the lesson here is never to open a computer while it's powered on… although I've done that millions of times before without issues… maybe static electricity fried it, or whatnot…
I wouldn't have been surprised if tampering with the Pi had restarted it or powered it off, but I really did not think touching it could break it :frowning:

Or maybe it was already kinda dead (it had been behaving in a weird way for a few days) and touching it just triggered its final breath… I dunno.

I ran a few tests this weekend and tried flashing an SD card (which used to work with an RPi 4B) with a fresh version of Raspberry Pi OS 64-bit, using the official tool from raspberrypi.com, but I could not get past the rainbow screen.

  • With no SD card: shows the never-ending loop searching for an SD card and a USB device to boot from (as shown in one of my previous posts)
  • With the SD card: stuck on the rainbow screen and 7 flashes of the green light, telling me it cannot find the kernel image. I checked by plugging the SD card into my laptop: the kernel image file is on the boot partition…

So… I'm not sure what else to try if the thing cannot read the SD card :confused:


My RPi 4B used the official power adapter; no separate PSU though. And you're right, USB connections can be unstable…

1 Like