Nodes did not restart automatically after probable power outage

Hi all! :smiley:

This morning, UptimeRobot was complaining a lot: most of my nodes (but not all of them) stopped working during the night. They run on a Raspberry Pi 4B (Raspbian 64-bit).

Here is what I checked on one of them:

Latest log entries before the incident (all seem in order):

2022-10-17T01:46:44.155Z	INFO	piecestore	download started	{"Process": "storagenode", "Piece ID": "GGEEMEBDN2PD2OQFAPD3PZHB6OOIAVNVZJPRJ3BMEJCEMSN5FPBQ", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "GET"}
2022-10-17T01:46:44.352Z	INFO	piecestore	downloaded	{"Process": "storagenode", "Piece ID": "GGEEMEBDN2PD2OQFAPD3PZHB6OOIAVNVZJPRJ3BMEJCEMSN5FPBQ", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "GET"}
2022-10-17T01:46:45.571Z	INFO	piecestore	download started	{"Process": "storagenode", "Piece ID": "GGEEMEBDN2PD2OQFAPD3PZHB6OOIAVNVZJPRJ3BMEJCEMSN5FPBQ", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "GET"}
2022-10-17T01:46:45.711Z	INFO	piecestore	downloaded	{"Process": "storagenode", "Piece ID": "GGEEMEBDN2PD2OQFAPD3PZHB6OOIAVNVZJPRJ3BMEJCEMSN5FPBQ", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "GET"}
2022-10-17T01:46:46.269Z	INFO	piecestore	download started	{"Process": "storagenode", "Piece ID": "4CG44XNPVQ4JPAFC7JBS2F2UZXGU3GWKYSV2UU3RNO44OWFJJWKA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET"}
2022-10-17T01:46:47.002Z	INFO	piecestore	downloaded	{"Process": "storagenode", "Piece ID": "4CG44XNPVQ4JPAFC7JBS2F2UZXGU3GWKYSV2UU3RNO44OWFJJWKA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET"}
2022-10-17T01:46:47.595Z	INFO	piecestore	upload started	{"Process": "storagenode", "Piece ID": "I2QS2I4GMIHUA6JHPES6NDYRFWO62T6UMIFGPJ7MRPPGGRPNE35A", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "PUT", "Available Space": 1094477440256}
2022-10-17T01:46:49.811Z	INFO	piecestore	download started	{"Process": "storagenode", "Piece ID": "4CG44XNPVQ4JPAFC7JBS2F2UZXGU3GWKYSV2UU3RNO44OWFJJWKA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET"}
2022-10-17T01:46:50.644Z	INFO	piecestore	uploaded	{"Process": "storagenode", "Piece ID": "FG2ME4KBNHQR7GIMQVNBAO2F46CHYGUB6Q3BE7PEII5L7FAHJT6Q", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "PUT", "Size": 73728}
2022-10-17T01:46:50.708Z	INFO	piecestore	uploaded	{"Process": "storagenode", "Piece ID": "I2QS2I4GMIHUA6JHPES6NDYRFWO62T6UMIFGPJ7MRPPGGRPNE35A", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "PUT", "Size": 2174464}

Latest docker log entries before the incident (all seem in order):

2022-10-17T01:22:50.389Z	INFO	Downloading versions.	{"Process": "storagenode-updater", "Server Address": "https://version.storj.io"}
2022-10-17T01:22:50.866Z	INFO	Current binary version	{"Process": "storagenode-updater", "Service": "storagenode", "Version": "v1.64.1"}
2022-10-17T01:22:50.866Z	INFO	New version is being rolled out but hasn't made it to this node yet	{"Process": "storagenode-updater", "Service": "storagenode"}
2022-10-17T01:22:50.927Z	INFO	Current binary version	{"Process": "storagenode-updater", "Service": "storagenode-updater", "Version": "v1.64.1"}
2022-10-17T01:22:50.928Z	INFO	New version is being rolled out but hasn't made it to this node yet	{"Process": "storagenode-updater", "Service": "storagenode-updater"}
2022-10-17T01:37:50.388Z	INFO	Downloading versions.	{"Process": "storagenode-updater", "Server Address": "https://version.storj.io"}
2022-10-17T01:37:50.901Z	INFO	Current binary version	{"Process": "storagenode-updater", "Service": "storagenode", "Version": "v1.64.1"}
2022-10-17T01:37:50.901Z	INFO	New version is being rolled out but hasn't made it to this node yet	{"Process": "storagenode-updater", "Service": "storagenode"}
2022-10-17T01:37:50.993Z	INFO	Current binary version	{"Process": "storagenode-updater", "Service": "storagenode-updater", "Version": "v1.64.1"}
2022-10-17T01:37:50.993Z	INFO	New version is being rolled out but hasn't made it to this node yet	{"Process": "storagenode-updater", "Service": "storagenode-updater"}

I get no matches when I run grep -i "killed" /var/log/syslog or grep -i "killed" /var/log/messages

I also noticed my Raspberry Pi rebooted: top shows it has only been up for 3:46, which is weird…

So the system may have rebooted abruptly because of a power cut. I have no way to be sure, but it's more than likely, because my ISP's box rebooted around the same time too.
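To pin down the exact reboot time rather than eyeballing top's uptime, a few standard commands can help; this is just a sketch of what could be run on the Pi:

```shell
# When did the current boot start?
uptime -s

# Boot history according to wtmp, most recent first
last -x reboot | head -n 5

# Last messages from the previous boot, if journald persists logs across reboots
journalctl -b -1 -n 20 --no-pager
```

Note that `journalctl -b -1` only works if persistent journaling is enabled (Storage=persistent in journald.conf, or an existing /var/log/journal directory).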


So my best guess right now is a power cut, and that some nodes were not restarted automatically even though they're all configured with --restart unless-stopped, which is suspicious…
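For what it's worth, one way to double-check the restart policy each container actually carries (as recorded by the daemon, not just what was passed on the command line) is to loop over them with docker inspect; a sketch:

```shell
# Print the recorded restart policy for every container, running or not
docker ps -a --format '{{.Names}}' | while read -r name; do
  policy="$(docker inspect -f '{{.HostConfig.RestartPolicy.Name}}' "$name")"
  echo "$name: $policy"
done
```

If any container shows "no" here, it was created without the restart flag and won't come back after a reboot.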

Anyway, I restarted all of them and everything seems back to normal for now. Is there anything else I could check to better understand why Docker did not restart some of my nodes?
Cheers :slight_smile:

PS: I’ll take this opportunity to remind newcomers that they should use a service like UptimeRobot (or similar) to detect such incidents quickly; it’s really a must-have ^^

Please make sure that the docker service is enabled to autostart. I noticed on Ubuntu that the docker service is not always enabled by default; the daemon only starts when you invoke a docker command.

sudo systemctl status docker

If it’s not enabled, you need to enable it:

sudo systemctl enable docker
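A quick way to confirm the boot-time state without reading the full status output, and to enable and start the service in one step (a sketch):

```shell
# Prints "enabled" if docker will start on boot
systemctl is-enabled docker

# Enable and start it in one go if it isn't
sudo systemctl enable --now docker
```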

Thanks for the suggestion @Alexey :slight_smile:

Seems like it is enabled and running:

pac@raspberrypi:~ $ sudo systemctl status docker
● docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2022-10-17 04:23:33 CEST; 8h ago
TriggeredBy: ● docker.socket
       Docs: https://docs.docker.com
   Main PID: 482 (dockerd)
      Tasks: 200
        CPU: 34.314s
     CGroup: /system.slice/docker.service
             ├─ 482 /usr/sbin/dockerd -H fd://
             ├─4025 /usr/sbin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 1167 -container-ip 172.17.0.4 -container-port 28967
             ├─4038 /usr/sbin/docker-proxy -proto udp -host-ip 0.0.0.0 -host-port 1167 -container-ip 172.17.0.4 -container-port 28967
[...]
             ├─6899 /usr/sbin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 1867 -container-ip 172.17.0.10 -container-port 28967
             └─6912 /usr/sbin/docker-proxy -proto udp -host-ip 0.0.0.0 -host-port 1867 -container-ip 172.17.0.10 -container-port 28967

Oct 17 04:23:33 raspberrypi systemd[1]: Started Docker Application Container Engine.
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.965510605+02:00" level=info msg="API listen on /run/docker.sock"
Oct 17 04:23:46 raspberrypi dockerd[482]: time="2022-10-17T04:23:46.713559431+02:00" level=info msg="ignoring event" container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 module=libcontainerd namespace=moby topic=/t>
Oct 17 04:23:47 raspberrypi dockerd[482]: time="2022-10-17T04:23:47.151514597+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/t>
[...]
Oct 17 04:24:25 raspberrypi dockerd[482]: time="2022-10-17T04:24:25.083825389+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/t>
Oct 17 04:24:25 raspberrypi dockerd[482]: time="2022-10-17T04:24:25.797009907+02:00" level=info msg="ignoring event" container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 module=libcontainerd namespace=moby topic=/t>

Besides, if it weren’t, I guess none of my nodes would have restarted, whereas in my case 2 of them did restart automatically :thinking:
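To see at a glance which containers came back and which didn't, docker ps can print a status table; a sketch (the placeholders are standard docker ps format fields):

```shell
# Status shows "Up X hours" vs "Exited (N) ..." for each container
docker ps -a --format 'table {{.Names}}\t{{.Status}}\t{{.RunningFor}}'
```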

That’s what I wanted to see. So your docker service is enabled, and it should start on boot.

You may check docker events

docker events --since 2022-10-17T01:37:50.993Z | head -n 30

or check journals

journalctl --since 2022-10-17T01:37:50.993Z | grep dockerd
pac@raspberrypi:~ $ sudo docker events --since 2022-10-17T01:37:50.993Z | head -n 30
2022-10-17T04:23:29.854089533+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=30751f1ad8e2d2d3a25f8761ec10c4dc1d891bec2bb4c689a23f405b8f401dc0, name=bridge, type=bridge)
2022-10-17T04:23:29.872125441+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711, name=bridge, type=bridge)
2022-10-17T04:23:30.306856737+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=bef94a31c99296c4c016b9b4f02a55e6e4b0910b8d95b82e3fbde167131627c9, name=bridge, type=bridge)
2022-10-17T04:23:31.343285051+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=cd38de9c32e424b9cee96440877af8b7297d196232bb0daf3f6ca205889bbaf1, name=bridge, type=bridge)
2022-10-17T04:23:31.425450662+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=186a2e99d7368261a613dd29e886a5039e4920ddb893cf85a1d3cd523e79e5cf, name=bridge, type=bridge)
2022-10-17T04:23:31.577251162+02:00 container start 30751f1ad8e2d2d3a25f8761ec10c4dc1d891bec2bb4c689a23f405b8f401dc0 (image=storjlabs/watchtower, io.storj.watchtower=true, name=storj_watchtower)
2022-10-17T04:23:31.668610069+02:00 container start 2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 (image=storjlabs/storagenode:latest, name=storj_node_2)
2022-10-17T04:23:31.832168069+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=205848e0b86b10944283ad4b266f956b57baca62b626ff6360e85099efb93b80, name=bridge, type=bridge)
2022-10-17T04:23:31.991967032+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=3bee8582c5b4c82c509f662796e55e4e12f028a0f2a9e821dc4d80ce09840a89, name=bridge, type=bridge)
2022-10-17T04:23:32.045416087+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735, name=bridge, type=bridge)
2022-10-17T04:23:32.135109513+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=2e329e981ec6fc911e4eb4f455a7f1cfb98516d3391082c11b87bb5a49b383d2, name=bridge, type=bridge)
2022-10-17T04:23:32.512445161+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=bef94a31c99296c4c016b9b4f02a55e6e4b0910b8d95b82e3fbde167131627c9, name=bridge, type=bridge)
2022-10-17T04:23:33.039568772+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=186a2e99d7368261a613dd29e886a5039e4920ddb893cf85a1d3cd523e79e5cf, name=bridge, type=bridge)
2022-10-17T04:23:33.182106383+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=cd38de9c32e424b9cee96440877af8b7297d196232bb0daf3f6ca205889bbaf1, name=bridge, type=bridge)
2022-10-17T04:23:33.551675957+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=205848e0b86b10944283ad4b266f956b57baca62b626ff6360e85099efb93b80, name=bridge, type=bridge)
2022-10-17T04:23:33.577883105+02:00 container start 53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 (image=storjlabs/storagenode:latest, name=storj_node_1)
2022-10-17T04:23:33.741195068+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=3bee8582c5b4c82c509f662796e55e4e12f028a0f2a9e821dc4d80ce09840a89, name=bridge, type=bridge)
2022-10-17T04:23:33.809243457+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=2e329e981ec6fc911e4eb4f455a7f1cfb98516d3391082c11b87bb5a49b383d2, name=bridge, type=bridge)
2022-10-17T04:23:46.714923875+02:00 container die 2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 (exitCode=0, image=storjlabs/storagenode:latest, name=storj_node_2)
2022-10-17T04:23:46.950395671+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711, name=bridge, type=bridge)
2022-10-17T04:23:47.153467023+02:00 container die 53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 (exitCode=0, image=storjlabs/storagenode:latest, name=storj_node_1)
2022-10-17T04:23:47.231422615+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711, name=bridge, type=bridge)
2022-10-17T04:23:47.493054615+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735, name=bridge, type=bridge)
2022-10-17T04:23:47.800732282+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735, name=bridge, type=bridge)
2022-10-17T04:23:48.009217411+02:00 container start 2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 (image=storjlabs/storagenode:latest, name=storj_node_2)
2022-10-17T04:23:48.539545892+02:00 container start 53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 (image=storjlabs/storagenode:latest, name=storj_node_1)
2022-10-17T04:23:59.505098516+02:00 container die 2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 (exitCode=0, image=storjlabs/storagenode:latest, name=storj_node_2)
2022-10-17T04:23:59.739179219+02:00 network disconnect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711, name=bridge, type=bridge)
2022-10-17T04:23:59.979080145+02:00 network connect 1d15b0f1b9aecd5a6f0f341d61743507cb2f286fc22e6b10af47df50a813ea75 (container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711, name=bridge, type=bridge)
2022-10-17T04:24:00.143441441+02:00 container die 53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 (exitCode=0, image=storjlabs/storagenode:latest, name=storj_node_1)

journalctl returns an awful lot of lines when running journalctl --since 2022-10-17 | grep dockerd, so I’ve shortened similar series of lines with [...]:

pac@raspberrypi:~ $ journalctl --since 2022-10-17 | grep dockerd
Oct 17 00:00:18 raspberrypi dockerd[1653]: time="2022-10-17T00:00:18.739895704+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 17 00:01:11 raspberrypi dockerd[1653]: time="2022-10-17T00:01:11.672142338+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
[...]
Oct 17 03:27:44 raspberrypi dockerd[1653]: time="2022-10-17T03:27:43.997421023+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 17 03:28:14 raspberrypi dockerd[1653]: time="2022-10-17T03:28:14.942628445+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.034172165+02:00" level=info msg="Starting up"
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.055609147+02:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.055739702+02:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.055830998+02:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.055869776+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.072909554+02:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.072987850+02:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.073068610+02:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.073119480+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.126779998+02:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.360718795+02:00" level=warning msg="Unable to find memory controller"
Oct 17 03:28:19 raspberrypi dockerd[484]: time="2022-10-17T03:28:19.361460980+02:00" level=info msg="Loading containers: start."
Oct 17 03:28:22 raspberrypi dockerd[484]: time="2022-10-17T03:28:22.834300404+02:00" level=info msg="Removing stale sandbox 1f3c9d1cc1606d1c8159a29571a968380bb72b58acf9ff191a4e9cf960b067f0 (30751f1ad8e2d2d3a25f8761ec10c4dc1d891bec2bb4c689a23f405b8f401dc0)"
Oct 17 03:28:22 raspberrypi dockerd[484]: time="2022-10-17T03:28:22.861482052+02:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint 47bfd76720eb12a46ab188e7fba66d422221734c517f240dda6c0d8af91f2054 9ad37aac9b055d5a17a19d0e9ca6388c2b96238f5fba3b55ee58d8d42a2d8362], retrying...."
Oct 17 03:28:23 raspberrypi dockerd[484]: time="2022-10-17T03:28:23.376477292+02:00" level=info msg="Removing stale sandbox 3a008e410af3c43f3c364d254301fb28d9aa800b485e67236e3013f23bd74307 (186a2e99d7368261a613dd29e886a5039e4920ddb893cf85a1d3cd523e79e5cf)"
Oct 17 03:28:23 raspberrypi dockerd[484]: time="2022-10-17T03:28:23.526962366+02:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint 47bfd76720eb12a46ab188e7fba66d422221734c517f240dda6c0d8af91f2054 b43d111298b4b212742ef226f9be2cba6065454d00a38137396f023c4fbec38b], retrying...."
[...]
Oct 17 03:28:26 raspberrypi dockerd[484]: time="2022-10-17T03:28:26.964596216+02:00" level=info msg="Removing stale sandbox 71880c7ebc95053c27b438b4ba4d9db1ba0cd04a3aa2773c76c03d4eed7c272f (2e329e981ec6fc911e4eb4f455a7f1cfb98516d3391082c11b87bb5a49b383d2)"
Oct 17 03:28:27 raspberrypi dockerd[484]: time="2022-10-17T03:28:27.086973142+02:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint 47bfd76720eb12a46ab188e7fba66d422221734c517f240dda6c0d8af91f2054 c42973fb4f057698d0c984ac1536caea0b63f919ef75198e0ad56e56975aaffe], retrying...."
Oct 17 03:28:27 raspberrypi dockerd[484]: time="2022-10-17T03:28:27.597579919+02:00" level=info msg="Removing stale sandbox 810b288b9b491fe2a625525c1756dd309d38bd7afd276cb0f1e5b9b76c6b088d (2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711)"
Oct 17 03:28:27 raspberrypi dockerd[484]: time="2022-10-17T03:28:27.727215753+02:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint 47bfd76720eb12a46ab188e7fba66d422221734c517f240dda6c0d8af91f2054 6a83a2a2d8fa332840159ffbe5aeabf386918048950db21d473470592555b0ae], retrying...."
Oct 17 03:28:27 raspberrypi dockerd[484]: time="2022-10-17T03:28:27.879209641+02:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Oct 17 03:28:29 raspberrypi dockerd[484]: time="2022-10-17T03:28:29.226079103+02:00" level=error msg="stream copy error: reading from a closed fifo"
Oct 17 03:28:29 raspberrypi dockerd[484]: time="2022-10-17T03:28:29.236504437+02:00" level=error msg="stream copy error: reading from a closed fifo"
[...]
Oct 17 03:28:31 raspberrypi dockerd[484]: time="2022-10-17T03:28:31.019944214+02:00" level=error msg="stream copy error: reading from a closed fifo"
Oct 17 03:28:31 raspberrypi dockerd[484]: time="2022-10-17T03:28:31.037533028+02:00" level=error msg="stream copy error: reading from a closed fifo"
Oct 17 03:28:31 raspberrypi dockerd[484]: time="2022-10-17T03:28:31.062835380+02:00" level=error msg="186a2e99d7368261a613dd29e886a5039e4920ddb893cf85a1d3cd523e79e5cf cleanup: failed to delete container from containerd: no such container"
Oct 17 03:28:31 raspberrypi dockerd[484]: time="2022-10-17T03:28:31.062919065+02:00" level=error msg="failed to start container" container=186a2e99d7368261a613dd29e886a5039e4920ddb893cf85a1d3cd523e79e5cf error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_3/storj_node_3\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_3/storj_node_3: no such file or directory: unknown"
Oct 17 03:28:31 raspberrypi dockerd[484]: time="2022-10-17T03:28:31.273435510+02:00" level=error msg="2e329e981ec6fc911e4eb4f455a7f1cfb98516d3391082c11b87bb5a49b383d2 cleanup: failed to delete container from containerd: no such container"
Oct 17 03:28:31 raspberrypi dockerd[484]: time="2022-10-17T03:28:31.273525843+02:00" level=error msg="failed to start container" container=2e329e981ec6fc911e4eb4f455a7f1cfb98516d3391082c11b87bb5a49b383d2 error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_3/storj_node_5\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_3/storj_node_5: no such file or directory: unknown"
[...]
Oct 17 03:28:31 raspberrypi dockerd[484]: time="2022-10-17T03:28:31.983878806+02:00" level=error msg="cd38de9c32e424b9cee96440877af8b7297d196232bb0daf3f6ca205889bbaf1 cleanup: failed to delete container from containerd: no such container"
Oct 17 03:28:31 raspberrypi dockerd[484]: time="2022-10-17T03:28:31.983959454+02:00" level=error msg="failed to start container" container=cd38de9c32e424b9cee96440877af8b7297d196232bb0daf3f6ca205889bbaf1 error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_4/storj_node_8\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_4/storj_node_8: no such file or directory: unknown"
Oct 17 03:28:32 raspberrypi dockerd[484]: time="2022-10-17T03:28:32.020281713+02:00" level=error msg="53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 cleanup: failed to delete container from containerd: no such container"
Oct 17 03:28:32 raspberrypi dockerd[484]: time="2022-10-17T03:28:32.020417694+02:00" level=error msg="failed to start container" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_1/storj_node_1\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_1/storj_node_1: no such file or directory: unknown"
Oct 17 03:28:32 raspberrypi dockerd[484]: time="2022-10-17T03:28:32.020530694+02:00" level=info msg="Loading containers: done."
Oct 17 03:28:32 raspberrypi dockerd[484]: time="2022-10-17T03:28:32.093915102+02:00" level=info msg="Docker daemon" commit=363e9a8 graphdriver(s)=overlay2 version=20.10.5+dfsg1
Oct 17 03:28:32 raspberrypi dockerd[484]: time="2022-10-17T03:28:32.096463028+02:00" level=info msg="Daemon has completed initialization"
Oct 17 03:28:32 raspberrypi dockerd[484]: time="2022-10-17T03:28:32.174269620+02:00" level=info msg="API listen on /run/docker.sock"
Oct 17 03:28:44 raspberrypi dockerd[484]: time="2022-10-17T03:28:44.171226872+02:00" level=info msg="ignoring event" container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 17 03:28:45 raspberrypi dockerd[1653]: time="2022-10-17T03:28:45.635255181+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
[...]
Oct 17 03:45:20 raspberrypi dockerd[1653]: time="2022-10-17T03:45:20.820508110+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 17 03:46:11 raspberrypi dockerd[1653]: time="2022-10-17T03:46:11.036274564+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.032939850+02:00" level=info msg="Starting up"
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.046793424+02:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.046891350+02:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.046979202+02:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.047030424+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.057833517+02:00" level=info msg="parsed scheme: \"unix\"" module=grpc
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.057962924+02:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.058065480+02:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.058123258+02:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.138001128+02:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.345418943+02:00" level=warning msg="Unable to find memory controller"
Oct 17 04:23:26 raspberrypi dockerd[482]: time="2022-10-17T04:23:26.346039239+02:00" level=info msg="Loading containers: start."
Oct 17 04:23:28 raspberrypi dockerd[482]: time="2022-10-17T04:23:28.459194423+02:00" level=info msg="Removing stale sandbox 3207d5d8bc853e20dec70c61ce3765e0417616f7cc1840472b70a2f6bd0ad9e4 (30751f1ad8e2d2d3a25f8761ec10c4dc1d891bec2bb4c689a23f405b8f401dc0)"
Oct 17 04:23:28 raspberrypi dockerd[482]: time="2022-10-17T04:23:28.490711201+02:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint a827ee56de00a6e5efe0018147d06a94c7c5550d5069b523cb80420bf5bcedde a6f51e474946570731eaa34744dfd6a7245f51ffe8c83376c8f675954b7c6d54], retrying...."
Oct 17 04:23:29 raspberrypi dockerd[482]: time="2022-10-17T04:23:29.009590274+02:00" level=info msg="Removing stale sandbox fb0e1ca547bbd60ffd3633fda12b19f1da44ea2b9cd4e664fa920772307b9339 (2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711)"
Oct 17 04:23:29 raspberrypi dockerd[482]: time="2022-10-17T04:23:29.121989885+02:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint a827ee56de00a6e5efe0018147d06a94c7c5550d5069b523cb80420bf5bcedde dc3409e5fcebb93d8f84f5c59dcd4fc77d115bacd02be3d0911a4912364c98e8], retrying...."
Oct 17 04:23:29 raspberrypi dockerd[482]: time="2022-10-17T04:23:29.257495311+02:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Oct 17 04:23:30 raspberrypi dockerd[482]: time="2022-10-17T04:23:30.808259625+02:00" level=error msg="stream copy error: reading from a closed fifo"
Oct 17 04:23:30 raspberrypi dockerd[482]: time="2022-10-17T04:23:30.811096959+02:00" level=error msg="stream copy error: reading from a closed fifo"
[...]
Oct 17 04:23:32 raspberrypi dockerd[482]: time="2022-10-17T04:23:32.545451735+02:00" level=error msg="stream copy error: reading from a closed fifo"
Oct 17 04:23:32 raspberrypi dockerd[482]: time="2022-10-17T04:23:32.546036735+02:00" level=error msg="stream copy error: reading from a closed fifo"
Oct 17 04:23:32 raspberrypi dockerd[482]: time="2022-10-17T04:23:32.602075161+02:00" level=error msg="bef94a31c99296c4c016b9b4f02a55e6e4b0910b8d95b82e3fbde167131627c9 cleanup: failed to delete container from containerd: no such container"
Oct 17 04:23:32 raspberrypi dockerd[482]: time="2022-10-17T04:23:32.602301957+02:00" level=error msg="failed to start container" container=bef94a31c99296c4c016b9b4f02a55e6e4b0910b8d95b82e3fbde167131627c9 error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_4/storj_node_4\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_4/storj_node_4: no such file or directory: unknown"
Oct 17 04:23:32 raspberrypi dockerd[482]: time="2022-10-17T04:23:32.619041828+02:00" level=error msg="stream copy error: reading from a closed fifo"
Oct 17 04:23:32 raspberrypi dockerd[482]: time="2022-10-17T04:23:32.621576957+02:00" level=error msg="stream copy error: reading from a closed fifo"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.064925846+02:00" level=error msg="186a2e99d7368261a613dd29e886a5039e4920ddb893cf85a1d3cd523e79e5cf cleanup: failed to delete container from containerd: no such container"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.065011142+02:00" level=error msg="failed to start container" container=186a2e99d7368261a613dd29e886a5039e4920ddb893cf85a1d3cd523e79e5cf error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_3/storj_node_3\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_3/storj_node_3: no such file or directory: unknown"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.231030698+02:00" level=error msg="cd38de9c32e424b9cee96440877af8b7297d196232bb0daf3f6ca205889bbaf1 cleanup: failed to delete container from containerd: no such container"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.231134050+02:00" level=error msg="failed to start container" container=cd38de9c32e424b9cee96440877af8b7297d196232bb0daf3f6ca205889bbaf1 error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_4/storj_node_8\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_4/storj_node_8: no such file or directory: unknown"
[...]
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.770762753+02:00" level=error msg="3bee8582c5b4c82c509f662796e55e4e12f028a0f2a9e821dc4d80ce09840a89 cleanup: failed to delete container from containerd: no such container"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.770890068+02:00" level=error msg="failed to start container" container=3bee8582c5b4c82c509f662796e55e4e12f028a0f2a9e821dc4d80ce09840a89 error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_4/storj_node_6\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_4/storj_node_6: no such file or directory: unknown"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.826761142+02:00" level=error msg="2e329e981ec6fc911e4eb4f455a7f1cfb98516d3391082c11b87bb5a49b383d2 cleanup: failed to delete container from containerd: no such container"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.826861790+02:00" level=error msg="failed to start container" container=2e329e981ec6fc911e4eb4f455a7f1cfb98516d3391082c11b87bb5a49b383d2 error="OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:76: mounting \"/home/pac/storj/mounts/disk_3/storj_node_5\" to rootfs at \"/app/config\" caused: stat /home/pac/storj/mounts/disk_3/storj_node_5: no such file or directory: unknown"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.826966623+02:00" level=info msg="Loading containers: done."
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.884556660+02:00" level=info msg="Docker daemon" commit=363e9a8 graphdriver(s)=overlay2 version=20.10.5+dfsg1
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.885653105+02:00" level=info msg="Daemon has completed initialization"
Oct 17 04:23:33 raspberrypi dockerd[482]: time="2022-10-17T04:23:33.965510605+02:00" level=info msg="API listen on /run/docker.sock"
Oct 17 04:23:46 raspberrypi dockerd[482]: time="2022-10-17T04:23:46.713559431+02:00" level=info msg="ignoring event" container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 17 04:23:47 raspberrypi dockerd[482]: time="2022-10-17T04:23:47.151514597+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
[...]
Oct 17 04:24:25 raspberrypi dockerd[482]: time="2022-10-17T04:24:25.083825389+02:00" level=info msg="ignoring event" container=53dc64cb118cfb001561e6ece48e24ce92967599bc72acbb978c2b1c8ed38735 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Oct 17 04:24:25 raspberrypi dockerd[482]: time="2022-10-17T04:24:25.797009907+02:00" level=info msg="ignoring event" container=2777f75cf76fbebc22386643ee439efa7fbe2c12be811952b3b396d87ad82711 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"

It seems your disk was not ready when the docker daemon started.
How did you mount your disks?


They’re all declared in /etc/fstab (4 disks), all configured like so:

# STORJ
# Toshiba 2.5" 2TB (disk 1)
UUID=71e83004-16a0-4fb0-8057-f37fe3a34abc /home/pac/storj/mounts/disk_1 auto defaults,noatime,nofail 0 2
# HGST 2.5" 1TB (disk 2)
UUID=d0431e79-ecc0-48bc-873a-b22e74deba8c /home/pac/storj/mounts/disk_2 auto defaults,noatime,nofail 0 2
[...]
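Since `nofail` lets the boot continue before slow USB disks are mounted, one possible mitigation (a sketch I haven't battle-tested, not something from this setup) is a systemd drop-in that makes the docker daemon wait for these mount points before starting:

```shell
# Sketch (untested): create a systemd drop-in so the docker daemon only
# starts once the Storj mount points from /etc/fstab are actually mounted.
# Paths are the ones quoted in this thread; adapt them to your own setup.
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo tee /etc/systemd/system/docker.service.d/wait-for-mounts.conf >/dev/null <<'EOF'
[Unit]
RequiresMountsFor=/home/pac/storj/mounts/disk_1 /home/pac/storj/mounts/disk_2 /home/pac/storj/mounts/disk_3 /home/pac/storj/mounts/disk_4
EOF
sudo systemctl daemon-reload
```

`RequiresMountsFor=` adds both a `Requires=` and an `After=` on the corresponding mount units, so docker (and thus the containers with `--restart` policies) should only come up once the disks are there.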

Could it be that, after the sudden restart, some kind of disk check (fsck, presumably, since these fstab entries have a pass number of 2) ran on some of the disks, making them inaccessible until the check finished?

What’s weird is that all nodes (and therefore all disks) showed errors like the one you pointed out: “no such file or directory”.
That suggests disks 1 & 2 also took some time to come up, but were ready soon enough for later retries to succeed, whereas disks 3 & 4 took too long? :thinking:

(Indeed, nodes 1 & 2 [on disks 1 & 2] did end up restarting correctly and automatically, whereas all the other ones failed.)
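One way to make this failure mode less fragile (a hedged sketch; `start_node` is a hypothetical wrapper, not part of any Storj tooling) is to refuse to start a container unless its storage path really is a mounted filesystem, so a retry loop keeps trying until the disk shows up instead of failing once:

```shell
# Hypothetical wrapper: only start a node container if its storage path
# is really a mounted filesystem (uses util-linux's `mountpoint`).
start_node() {
  local path="$1" name="$2"
  if mountpoint -q "$path"; then
    docker start "$name"
  else
    echo "refusing to start $name: $path is not mounted" >&2
    return 1
  fi
}
```

Called from a cron job or a systemd timer, something like this would retry until the mount is actually present rather than letting docker fail against a missing directory.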

Hmm… so, as the Pi was getting hotter than usual, I opened its box to check whether the fan was still operating normally, and I’m not sure what I did, but apparently I shouldn’t have touched it, because that completely froze the poor thing.

It was no longer reachable via SSH, even after a forced reboot, so I ended up connecting a screen to it to see what was going on, and I don’t like what I see:

The boot sequence is now stuck there, looping forever, as if the USB controller were unable to find any disk?
Note: This RPi is configured to boot from USB (then SD card as a fallback - but no SD card is installed).

So maybe the strange issue I initially described in this thread was just the first sign of a hardware failure? I’ll have to investigate further this weekend, but I’m not optimistic, I must say ^^’

Please check the cables, the USB hub and the power supply.
Perhaps powering it off and disconnecting it from the mains for several minutes could be enough to resurrect it.


Thanks again @Alexey for your support.
I tried leaving it turned off for the whole night; it did not restart normally this morning. I also tried unplugging everything and plugging it all back in, which did not work either.
image


I ended up swapping my whole RPi 4B motherboard for one a friend of mine had lying around unused, and my whole setup came back from the dead immediately! It’s now back online and working normally! (I just had to re-assign the static address to this new RPi’s MAC address so that port forwarding keeps working.)

So I guess some part of my RPi’s hardware just died! oO
It had been working non-stop for 3 years. I’m not sure whether that’s a long time for an RPi, which is certainly not a piece of hardware designed to run 24/7, but computers usually last longer than that…

I’ll carry out some more testing this weekend: boot it from a fresh SD card and see what’s going on, maybe update the firmware, who knows… But for now, it really looks like a hardware issue with my RPi.


Wow, it just died like that. How do you power your Pi? Adapter? PSU?

I had some issues in the past with USB connections (connectors) that sounded similar to yours. For other people ending up here: zip-tie your USB-SATA connectors. Even the slightest movement can cause a connection loss, which is only restored by a reboot or by physically unplugging/replugging the USB cable. Even better, buy some kind of RAID enclosure.

Another time, I forgot to plug the LAN cable back in after maintenance…

Well, not exactly “just like that”: I did open its case to check on the fan, because the Pi was getting unusually hot (near 70°C, where it usually stays below 60°C) and I was wondering whether the fan was still running. So I guess the lesson here is never to open a computer while it’s powered on… although I’ve done that countless times before without any issue. Maybe static electricity fried it, or whatnot…
I wouldn’t have been surprised if tampering with the Pi had restarted it or powered it off, but I really did not think touching it could break it :frowning:

Or maybe it was already kind of dead (it had been behaving weirdly for a few days) and touching it just triggered its final breath… I dunno.

I ran a few tests this weekend and tried flashing an SD card (one that used to work with an RPi 4B) with a fresh Raspberry Pi OS 64-bit image, using the official tool from raspberrypi.com, but I could not get past the rainbow screen.

  • With no SD card: the never-ending loop searching for an SD card and a USB device to boot from (as shown in one of my previous posts).
  • With the SD card: stuck on the rainbow screen with 7 flashes of the green LED, which means it cannot find the kernel image. I checked by plugging the SD card into my laptop: the kernel image file is on the boot partition…

So… I’m not sure what else to try, if the thing cannot read the SD card :confused:
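For anyone double-checking a card the same way, here is a small sketch (`check_boot_files` is a hypothetical helper, assuming a 64-bit Raspberry Pi OS card for a Pi 4) that reports which boot files are missing from the card's mounted boot partition:

```shell
# Hypothetical helper: given the path where the Pi boot partition is
# mounted on another machine, report which files the RPi 4 firmware
# needs but cannot find.
check_boot_files() {
  local boot="$1" missing=0
  # Files the Pi 4 firmware looks for on a 64-bit Raspberry Pi OS card
  for f in start4.elf config.txt kernel8.img bcm2711-rpi-4-b.dtb; do
    if [ ! -e "$boot/$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  return $missing
}
```

Usage would be something like `check_boot_files /media/user/bootfs`; if nothing is reported missing yet the Pi still flashes its "kernel not found" pattern, that points at the board rather than the card.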


My RPi 4B used the official power adapter, no separate PSU. And you’re right, USB connections can be unstable…
