Nodes all die, RPi 4 unresponsive, "panic: invalid page type: 4: 10"

For a couple of weeks now, I’ve been having severe issues with my RPi 4B setup.

It started around the time I switched my nodes to version 1.52+, but I don’t know if it’s related.

Yesterday, I fully reinstalled my RPi 4B, so it’s finally running a 64-bit OS (Raspberry Pi OS).

But I still have issues; if anything, it feels even worse…

Most of the time, when all my nodes die, the RPi is not even reachable via SSH anymore :frowning:

To be honest, I’m really not sure what’s wrong, how to investigate, or even whether it’s related to Storj at all.

Here is an excerpt of my syslog when problems arose again, this afternoon:


Apr 29 17:23:46 raspberrypi containerd[755]: panic: invalid page type: 4: 10

Apr 29 17:23:51 raspberrypi containerd[755]: goroutine 3390 [running]:

Apr 29 17:23:51 raspberrypi containerd[755]: go.etcd.io/bbolt.(*Cursor).search(0x400053d2e8, {0x558de87544, 0x6, 0x6}, 0x4)

Apr 29 17:23:51 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/cursor.go:250 +0x29c

Apr 29 17:23:51 raspberrypi containerd[755]: go.etcd.io/bbolt.(*Cursor).searchPage(0x400053d2e8, {0x558de87544, 0x6, 0x6}, 0x7f840b5000)

Apr 29 17:23:51 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/cursor.go:308 +0x140

Apr 29 17:23:51 raspberrypi containerd[755]: go.etcd.io/bbolt.(*Cursor).search(0x400053d2e8, {0x558de87544, 0x6, 0x6}, 0x8)

Apr 29 17:23:51 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/cursor.go:265 +0x1c0

Apr 29 17:23:52 raspberrypi containerd[755]: go.etcd.io/bbolt.(*Cursor).seek(0x400053d2e8, {0x558de87544, 0x6, 0x6})

[...]

Apr 29 17:23:52 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/metadata/containers.go:94 +0x12c

Apr 29 17:23:52 raspberrypi containerd[755]: github.com/containerd/containerd/metadata.view({0x558d4dcf20, 0x40000cf860}, {0x558d4bf418, 0x400050f260}, 0x40000cf890)

Apr 29 17:23:52 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/metadata/bolt.go:48 +0x70

Apr 29 17:23:52 raspberrypi containerd[755]: github.com/containerd/containerd/metadata.(*containerStore).List(0x4000010110, {0x558d4dcf20, 0x40000cf860}, {0x400048eb60, 0x1, 0x1})

Apr 29 17:23:52 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/metadata/containers.go:88 +0x160

Apr 29 17:23:52 raspberrypi containerd[755]: github.com/containerd/containerd/services/containers.(*local).ListStream.func1({0x558d4dcf20, 0x40000cf860})

Apr 29 17:23:52 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/services/containers/local.go:102 +0x68

Apr 29 17:23:52 raspberrypi containerd[755]: github.com/containerd/containerd/services/containers.(*local).withStore.func1(0x4000294a80)

Apr 29 17:23:52 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/services/containers/local.go:198 +0x80

Apr 29 17:23:52 raspberrypi containerd[755]: go.etcd.io/bbolt.(*DB).View(0x400014cc00, 0x40001628a0)

Apr 29 17:23:52 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/db.go:725 +0x84

Apr 29 17:23:52 raspberrypi containerd[755]: github.com/containerd/containerd/metadata.(*DB).View(...)

Apr 29 17:23:52 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/metadata/db.go:238

Apr 29 17:23:52 raspberrypi containerd[755]: github.com/containerd/containerd/services/containers.(*local).withStoreView(0x40005119b0, {0x558d4dcf20, 0x40000cf800}, 0x4000162840)

Apr 29 17:23:52 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/services/containers/local.go:203 +0x5c

Apr 29 17:23:52 raspberrypi containerd[755]: github.com/containerd/containerd/services/containers.(*local).ListStream(0x40005119b0, {0x558d4dcf20, 0x40000cf800}, 0x40005f5200, {0x0, 0x0, 0x0})

Apr 29 17:23:52 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/services/containers/local.go:101 +0xf4

Apr 29 17:23:52 raspberrypi containerd[755]: github.com/containerd/containerd.(*remoteContainers).stream(0x40001e61d0, {0x558d4dcf20, 0x40000cf800}, {0x400048eb60, 0x1, 0x1})

Apr 29 17:23:52 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/containerstore.go:80 +0xa0

Apr 29 17:23:52 raspberrypi containerd[755]: github.com/containerd/containerd.(*remoteContainers).List(0x40001e61d0, {0x558d4dcf20, 0x40000cf800}, {0x400048eb60, 0x1, 0x1})

Apr 29 17:23:52 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/containerstore.go:57 +0x50

Apr 29 17:23:52 raspberrypi containerd[755]: github.com/containerd/containerd.(*Client).Containers(0x40001e8000, {0x558d4dcf20, 0x40000cf800}, {0x400048eb60, 0x1, 0x1})

Apr 29 17:23:52 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/client.go:257 +0x6c

Apr 29 17:23:52 raspberrypi containerd[755]: github.com/containerd/containerd/runtime/restart/monitor.(*monitor).monitor(0x4000194038, {0x558d4dcf20, 0x40000cf800})

Apr 29 17:23:52 raspberrypi systemd[1]: containerd.service: Main process exited, code=exited, status=2/INVALIDARGUMENT

Apr 29 17:23:53 raspberrypi dockerd[796]: time="2022-04-29T17:23:51.059674569+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\". Reconnecting..." module=grpc

Apr 29 17:23:53 raspberrypi dockerd[796]: time="2022-04-29T17:23:51.059674625+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\". Reconnecting..." module=grpc

Apr 29 17:23:53 raspberrypi dockerd[796]: time="2022-04-29T17:23:51.322674204+02:00" level=error msg="Failed to get event" error="rpc error: code = Unavailable desc = transport is closing" module=libcontainerd namespace=moby

Apr 29 17:23:53 raspberrypi dockerd[796]: time="2022-04-29T17:23:51.533806944+02:00" level=info msg="Waiting for containerd to be ready to restart event processing" module=libcontainerd namespace=moby

Apr 29 17:23:53 raspberrypi dockerd[796]: time="2022-04-29T17:23:51.322687629+02:00" level=error msg="Failed to get event" error="rpc error: code = Unavailable desc = transport is closing" module=libcontainerd namespace=plugins.moby

Apr 29 17:23:53 raspberrypi dockerd[796]: time="2022-04-29T17:23:51.534439452+02:00" level=info msg="Waiting for containerd to be ready to restart event processing" module=libcontainerd namespace=plugins.moby

Apr 29 17:23:53 raspberrypi dockerd[796]: time="2022-04-29T17:23:52.060505961+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\". Reconnecting..." module=grpc

Apr 29 17:23:53 raspberrypi dockerd[796]: time="2022-04-29T17:23:52.060758198+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\". Reconnecting..." module=grpc

Apr 29 17:23:53 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/runtime/restart/monitor/monitor.go:199 +0xe4

Apr 29 17:23:53 raspberrypi containerd[755]: github.com/containerd/containerd/runtime/restart/monitor.(*monitor).reconcile.func1(0x40004fa780, {0x558d4dceb0, 0x400012a000}, {0x40004fa6ec, 0x4}, 0x4000194038)

Apr 29 17:23:53 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/runtime/restart/monitor/monitor.go:175 +0x8c

Apr 29 17:23:53 raspberrypi containerd[755]: created by github.com/containerd/containerd/runtime/restart/monitor.(*monitor).reconcile

Apr 29 17:23:53 raspberrypi containerd[755]: #011/go/src/github.com/containerd/containerd/runtime/restart/monitor/monitor.go:172 +0x14c

Apr 29 17:23:53 raspberrypi systemd[1]: containerd.service: Failed with result 'exit-code'.

Apr 29 17:23:53 raspberrypi dockerd[796]: time="2022-04-29T17:23:52.332437657+02:00" level=info msg="Processing signal 'terminated'"

Apr 29 17:23:53 raspberrypi dockerd[796]: time="2022-04-29T17:23:53.362020369+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\". Reconnecting..." module=grpc

Apr 29 17:23:53 raspberrypi dockerd[796]: time="2022-04-29T17:23:53.776102944+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\". Reconnecting..." module=grpc

Apr 29 17:23:53 raspberrypi systemd[1]: containerd.service: Unit process 2666 (containerd-shim) remains running after unit stopped.

Apr 29 17:23:53 raspberrypi systemd[1]: containerd.service: Unit process 2667 (containerd-shim) remains running after unit stopped.

Apr 29 17:23:53 raspberrypi systemd[1]: containerd.service: Unit process 2791 (containerd-shim) remains running after unit stopped.

Apr 29 17:23:53 raspberrypi systemd[1]: containerd.service: Unit process 2816 (containerd-shim) remains running after unit stopped.

Apr 29 17:23:53 raspberrypi systemd[1]: containerd.service: Unit process 2888 (containerd-shim) remains running after unit stopped.

Apr 29 17:23:53 raspberrypi systemd[1]: containerd.service: Unit process 2910 (containerd-shim) remains running after unit stopped.

Apr 29 17:23:53 raspberrypi systemd[1]: containerd.service: Unit process 2935 (containerd-shim) remains running after unit stopped.

Apr 29 17:23:53 raspberrypi systemd[1]: containerd.service: Unit process 3004 (containerd-shim) remains running after unit stopped.

Apr 29 17:23:53 raspberrypi systemd[1]: containerd.service: Unit process 4491 (containerd-shim) remains running after unit stopped.

Apr 29 17:23:53 raspberrypi systemd[1]: containerd.service: Consumed 1min 1.770s CPU time.

Apr 29 17:23:53 raspberrypi systemd[1]: containerd.service: Scheduled restart job, restart counter is at 1.

Apr 29 17:23:53 raspberrypi systemd[1]: Stopping Docker Application Container Engine...

Apr 29 17:23:56 raspberrypi dockerd[796]: time="2022-04-29T17:23:56.098232973+02:00" level=error msg="Error sending stop (signal 15) to container" container=fc7ae703b40615ceb18a0fa6380114ed03488483ba6727103889379a6dada128 error="Cannot kill container fc7ae703b40615ceb18a0fa6380114ed03488483ba6727103889379a6dada128: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\": unavailable"

Apr 29 17:23:56 raspberrypi dockerd[796]: time="2022-04-29T17:23:56.100560288+02:00" level=info msg="Container failed to exit within 2s of signal 15 - using the force" container=fc7ae703b40615ceb18a0fa6380114ed03488483ba6727103889379a6dada128

Apr 29 17:23:56 raspberrypi dockerd[796]: time="2022-04-29T17:23:56.098310138+02:00" level=error msg="Error sending stop (signal 15) to container" container=aa983aee1ba6ef529bd8e598fc11ace55c59a82f55d1c04537c4df10ecfc3be8 error="Cannot kill container aa983aee1ba6ef529bd8e598fc11ace55c59a82f55d1c04537c4df10ecfc3be8: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\": unavailable"

Apr 29 17:23:56 raspberrypi dockerd[796]: time="2022-04-29T17:23:56.098329360+02:00" level=error msg="Error sending stop (signal 15) to container" container=0c1acf094c544dcdcc0f02ce77699b8f7242c9aab026df1f529ce2dba7c6cc4f error="Cannot kill container 0c1acf094c544dcdcc0f02ce77699b8f7242c9aab026df1f529ce2dba7c6cc4f: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\": unavailable"

Apr 29 17:23:56 raspberrypi dockerd[796]: time="2022-04-29T17:23:56.098339212+02:00" level=error msg="Error sending stop (signal 15) to container" container=706a7626999f5b38c1c7a9a68abda72c6f670a633b749356e49ea8bc68be4b69 error="Cannot kill container 706a7626999f5b38c1c7a9a68abda72c6f670a633b749356e49ea8bc68be4b69: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\": unavailable"

Apr 29 17:23:56 raspberrypi dockerd[796]: time="2022-04-29T17:23:56.101662326+02:00" level=info msg="Container failed to exit within 2s of signal 15 - using the force" container=706a7626999f5b38c1c7a9a68abda72c6f670a633b749356e49ea8bc68be4b69

Apr 29 17:23:56 raspberrypi dockerd[796]: time="2022-04-29T17:23:56.098396545+02:00" level=error msg="Error sending stop (signal 15) to container" container=9cea501c1261e75a954f1011280068b9b633b40108a295f03ba46e712ccde3c4 error="Cannot kill container 9cea501c1261e75a954f1011280068b9b633b40108a295f03ba46e712ccde3c4: connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\": unavailable"

Apr 29 17:23:56 raspberrypi dockerd[796]: time="2022-04-29T17:23:56.102255298+02:00" level=info msg="Container failed to exit within 2s of signal 15 - using the force" container=9cea501c1261e75a954f1011280068b9b633b40108a295f03ba46e712ccde3c4

[...]

Apr 29 17:24:05 raspberrypi dockerd[796]: time="2022-04-29T17:24:05.654711372+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\". Reconnecting..." module=grpc

Apr 29 17:24:05 raspberrypi dockerd[796]: time="2022-04-29T17:24:05.796465464+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused\". Reconnecting..." module=grpc

Apr 29 17:24:06 raspberrypi dockerd[796]: time="2022-04-29T17:24:06.101522544+02:00" level=error msg="Container failed to exit within 10 seconds of kill - trying direct SIGKILL" container=fc7ae703b40615ceb18a0fa6380114ed03488483ba6727103889379a6dada128 error="context deadline exceeded"

Apr 29 17:24:06 raspberrypi dockerd[796]: time="2022-04-29T17:24:06.102453103+02:00" level=error msg="Container failed to exit within 10 seconds of kill - trying direct SIGKILL" container=706a7626999f5b38c1c7a9a68abda72c6f670a633b749356e49ea8bc68be4b69 error="context deadline exceeded"

[...]

Apr 29 17:24:06 raspberrypi dockerd[796]: time="2022-04-29T17:24:06.110522270+02:00" level=error msg="Container failed to exit within 10 seconds of kill - trying direct SIGKILL" container=0c1acf094c544dcdcc0f02ce77699b8f7242c9aab026df1f529ce2dba7c6cc4f error="context deadline exceeded"

Apr 29 17:24:06 raspberrypi dockerd[796]: time="2022-04-29T17:24:06.110438624+02:00" level=error msg="Container failed to exit within 10 seconds of kill - trying direct SIGKILL" container=52f19724fe403bb849c92a2e2b5a4a55a1bdbf73b4091f53ed71642a8a2fb5c4 error="context deadline exceeded"

Apr 29 17:24:06 raspberrypi systemd[1]: docker-9cea501c1261e75a954f1011280068b9b633b40108a295f03ba46e712ccde3c4.scope: Succeeded.

Apr 29 17:24:06 raspberrypi systemd[1]: docker-9cea501c1261e75a954f1011280068b9b633b40108a295f03ba46e712ccde3c4.scope: Consumed 5.603s CPU time.

Apr 29 17:24:06 raspberrypi systemd[1]: docker-52f19724fe403bb849c92a2e2b5a4a55a1bdbf73b4091f53ed71642a8a2fb5c4.scope: Succeeded.

[...]

Apr 29 17:24:06 raspberrypi systemd[1]: docker-aa983aee1ba6ef529bd8e598fc11ace55c59a82f55d1c04537c4df10ecfc3be8.scope: Consumed 2min 59.448s CPU time.

Apr 29 17:24:06 raspberrypi systemd[1]: docker-f0719815a3eb6699e52b7dd88afb9c98ad4219c579f3388d714640b19cda7229.scope: Succeeded.

Apr 29 17:24:06 raspberrypi systemd[1]: docker-f0719815a3eb6699e52b7dd88afb9c98ad4219c579f3388d714640b19cda7229.scope: Consumed 23min 26.852s CPU time.

Apr 29 17:24:08 raspberrypi dockerd[796]: time="2022-04-29T17:24:08.487178710+02:00" level=error msg="Force shutdown daemon"

Apr 29 17:24:08 raspberrypi dockerd[796]: time="2022-04-29T17:24:08.487278616+02:00" level=info msg="Daemon shutdown complete"

Apr 29 17:24:08 raspberrypi systemd[1]: docker.service: Succeeded.

Apr 29 17:24:08 raspberrypi systemd[1]: docker.service: Unit process 1902 (docker-proxy) remains running after unit stopped.

Apr 29 17:24:08 raspberrypi systemd[1]: docker.service: Unit process 1913 (docker-proxy) remains running after unit stopped.

[...]

Apr 29 17:24:08 raspberrypi systemd[1]: docker.service: Unit process 4465 (docker-proxy) remains running after unit stopped.

Apr 29 17:24:08 raspberrypi systemd[1]: docker.service: Unit process 4472 (docker-proxy) remains running after unit stopped.

Apr 29 17:24:08 raspberrypi systemd[1]: Stopped Docker Application Container Engine.

Apr 29 17:24:08 raspberrypi systemd[1]: docker.service: Consumed 33.599s CPU time.

Apr 29 17:24:08 raspberrypi systemd[1]: Stopped containerd container runtime.

Apr 29 17:24:08 raspberrypi systemd[1]: containerd.service: Consumed 1min 1.890s CPU time.

Apr 29 17:24:08 raspberrypi systemd[1]: containerd.service: Found left-over process 2666 (containerd-shim) in control group while starting unit. Ignoring.

Apr 29 17:24:08 raspberrypi systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.

Apr 29 17:24:08 raspberrypi systemd[1]: containerd.service: Found left-over process 2667 (containerd-shim) in control group while starting unit. Ignoring.

[...]

Apr 29 17:24:08 raspberrypi systemd[1]: containerd.service: Found left-over process 4491 (containerd-shim) in control group while starting unit. Ignoring.

Apr 29 17:24:08 raspberrypi systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.

Apr 29 17:24:08 raspberrypi systemd[1]: Starting containerd container runtime...

Apr 29 17:24:09 raspberrypi systemd[1]: containerd.service: Found left-over process 2666 (containerd-shim) in control group while starting unit. Ignoring.

Apr 29 17:24:09 raspberrypi systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.

Apr 29 17:24:09 raspberrypi systemd[1]: containerd.service: Found left-over process 2667 (containerd-shim) in control group while starting unit. Ignoring.

Apr 29 17:24:09 raspberrypi systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.

[...]

Apr 29 17:24:09 raspberrypi systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.

Apr 29 17:24:09 raspberrypi systemd[1]: containerd.service: Found left-over process 4491 (containerd-shim) in control group while starting unit. Ignoring.

Apr 29 17:24:09 raspberrypi systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.

Apr 29 17:24:12 raspberrypi containerd[10055]: time="2022-04-29T17:24:12+02:00" level=warning msg="deprecated version : `1`, please switch to version `2`"

Apr 29 17:24:12 raspberrypi containerd[10055]: time="2022-04-29T17:24:12.585380941+02:00" level=info msg="starting containerd" revision=3df54a852345ae127d1fa3092b95168e4a88e2f8 version=1.5.11

Apr 29 17:24:12 raspberrypi containerd[10055]: time="2022-04-29T17:24:12.705751375+02:00" level=info msg="loading plugin \"io.containerd.content.v1.content\"..." type=io.containerd.content.v1

Apr 29 17:24:12 raspberrypi containerd[10055]: time="2022-04-29T17:24:12.706416827+02:00" level=info msg="loading plugin \"io.containerd.snapshotter.v1.aufs\"..." type=io.containerd.snapshotter.v1

Apr 29 17:24:12 raspberrypi containerd[10055]: time="2022-04-29T17:24:12.731533242+02:00" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.aufs\"..." error="aufs is not supported (modprobe aufs failed: exit status 1 \"modprobe: FATAL: Module aufs not found in directory /lib/modules/5.15.32-v8+\\n\"): skip plugin" type=io.containerd.snapshotter.v1

[...]

Apr 29 17:24:12 raspberrypi containerd[10055]: time="2022-04-29T17:24:12.771193998+02:00" level=info msg="loading plugin \"io.containerd.service.v1.namespaces-service\"..." type=io.containerd.service.v1

Apr 29 17:24:12 raspberrypi containerd[10055]: time="2022-04-29T17:24:12.771401569+02:00" level=info msg="loading plugin \"io.containerd.service.v1.snapshots-service\"..." type=io.containerd.service.v1

Apr 29 17:24:12 raspberrypi containerd[10055]: time="2022-04-29T17:24:12.771634102+02:00" level=info msg="loading plugin \"io.containerd.runtime.v1.linux\"..." type=io.containerd.runtime.v1

Apr 29 17:24:12 raspberrypi containerd[10055]: time="2022-04-29T17:24:12.771957745+02:00" level=info msg="loading plugin \"io.containerd.runtime.v2.task\"..." type=io.containerd.runtime.v2

Apr 29 17:24:12 raspberrypi containerd[10055]: panic: invalid page type: 4: 10

Apr 29 17:24:12 raspberrypi containerd[10055]: goroutine 1 [running]:

Apr 29 17:24:12 raspberrypi containerd[10055]: go.etcd.io/bbolt.(*Cursor).search(0x400059b598, {0x555ed17544, 0x6, 0x6}, 0x4)

Apr 29 17:24:12 raspberrypi containerd[10055]: #011/go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/cursor.go:250 +0x29c

Apr 29 17:24:12 raspberrypi containerd[10055]: go.etcd.io/bbolt.(*Cursor).searchPage(0x400059b598, {0x555ed17544, 0x6, 0x6}, 0x7f8815d000)

Apr 29 17:24:12 raspberrypi containerd[10055]: #011/go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/cursor.go:308 +0x140

Apr 29 17:24:12 raspberrypi containerd[10055]: go.etcd.io/bbolt.(*Cursor).search(0x400059b598, {0x555ed17544, 0x6, 0x6}, 0x8)

Apr 29 17:24:12 raspberrypi containerd[10055]: #011/go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/cursor.go:265 +0x1c0

Apr 29 17:24:12 raspberrypi containerd[10055]: go.etcd.io/bbolt.(*Cursor).seek(0x400059b598, {0x555ed17544, 0x6, 0x6})

Apr 29 17:24:12 raspberrypi containerd[10055]: #011/go/src/github.com/containerd/containerd/vendor/go.etcd.io/bbolt/cursor.go:159 +0x68

Apr 29 17:24:12 raspberrypi containerd[10055]: go.etcd.io/bbolt.(*Bucket).Bucket(0x4000444d80, {0x555ed17544, 0x6, 0x6})

[...]

Apr 29 17:24:12 raspberrypi containerd[10055]: github.com/containerd/containerd/cmd/containerd/command.App.func1(0x40003b89a0)

Apr 29 17:24:12 raspberrypi containerd[10055]: #011/go/src/github.com/containerd/containerd/cmd/containerd/command/main.go:179 +0x768

Apr 29 17:24:12 raspberrypi containerd[10055]: github.com/urfave/cli.HandleAction({0x555e1306a0, 0x555e32b5b0}, 0x40003b89a0)

Apr 29 17:24:12 raspberrypi containerd[10055]: #011/go/src/github.com/containerd/containerd/vendor/github.com/urfave/cli/app.go:523 +0xfc

Apr 29 17:24:12 raspberrypi containerd[10055]: github.com/urfave/cli.(*App).Run(0x400038ba40, {0x400003c1f0, 0x1, 0x1})

Apr 29 17:24:12 raspberrypi containerd[10055]: #011/go/src/github.com/containerd/containerd/vendor/github.com/urfave/cli/app.go:285 +0x62c

Apr 29 17:24:12 raspberrypi containerd[10055]: main.main()

Apr 29 17:24:12 raspberrypi containerd[10055]: #011github.com/containerd/containerd/cmd/containerd/main.go:33 +0x44

Apr 29 17:24:12 raspberrypi systemd[1]: containerd.service: Main process exited, code=exited, status=2/INVALIDARGUMENT

Apr 29 17:24:12 raspberrypi systemd[1]: containerd.service: Failed with result 'exit-code'.

Apr 29 17:24:12 raspberrypi systemd[1]: containerd.service: Unit process 2666 (containerd-shim) remains running after unit stopped.

Apr 29 17:24:12 raspberrypi systemd[1]: containerd.service: Unit process 2667 (containerd-shim) remains running after unit stopped.

Apr 29 17:24:12 raspberrypi systemd[1]: containerd.service: Unit process 2791 (containerd-shim) remains running after unit stopped.

Apr 29 17:24:12 raspberrypi systemd[1]: containerd.service: Unit process 2816 (containerd-shim) remains running after unit stopped.

Apr 29 17:24:12 raspberrypi systemd[1]: containerd.service: Unit process 2888 (containerd-shim) remains running after unit stopped.

Apr 29 17:24:12 raspberrypi systemd[1]: containerd.service: Unit process 2910 (containerd-shim) remains running after unit stopped.

Apr 29 17:24:12 raspberrypi systemd[1]: containerd.service: Unit process 2935 (containerd-shim) remains running after unit stopped.

Apr 29 17:24:12 raspberrypi systemd[1]: containerd.service: Unit process 3004 (containerd-shim) remains running after unit stopped.

Apr 29 17:24:12 raspberrypi systemd[1]: containerd.service: Unit process 4491 (containerd-shim) remains running after unit stopped.

Apr 29 17:24:12 raspberrypi systemd[1]: Failed to start containerd container runtime.

Apr 29 17:24:12 raspberrypi systemd[1]: Dependency failed for Docker Application Container Engine.

Apr 29 17:24:12 raspberrypi systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.

Apr 29 17:24:17 raspberrypi systemd[1]: containerd.service: Scheduled restart job, restart counter is at 2.

Apr 29 17:24:17 raspberrypi systemd[1]: Stopped containerd container runtime
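
For anyone skimming a syslog like the excerpt above: the repeated bbolt frames under containerd/metadata suggest the panic is raised while containerd reads its own metadata database. That would be consistent with on-disk corruption, but it is an inference from the trace, not something the log states outright. Below is a minimal sketch for pulling just the panic lines out of a long syslog. The function name containerd_panics is made up for illustration, and /var/log/syslog is assumed to be where this system logs.

```shell
# Print only the lines where a containerd process logged a Go panic.
# Reads syslog-style text on stdin.
containerd_panics() {
    grep -E 'containerd\[[0-9]+\]: panic:'
}

# Example on two lines taken from the excerpt above:
printf '%s\n' \
    'Apr 29 17:23:46 raspberrypi containerd[755]: panic: invalid page type: 4: 10' \
    'Apr 29 17:23:51 raspberrypi containerd[755]: goroutine 3390 [running]:' |
    containerd_panics
```

On a live system this would be run as containerd_panics < /var/log/syslog (path assumed).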

If any of you fine folks have any hints on how to move forward, that’d be much appreciated :sweat_smile:

I think something is wrong with Docker (or the whole system): I restarted my RPi, and Docker no longer seems to want to start:

pac@raspberrypi:~/storj $ systemctl status docker
● docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
     Active: inactive (dead) (Result: exit-code) since Fri 2022-04-29 18:18:29 CEST; 26s ago
TriggeredBy: ● docker.socket
       Docs: https://docs.docker.com
    Process: 798 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock (code=exited, status=1/FAILURE)
   Main PID: 798 (code=exited, status=1/FAILURE)
        CPU: 373ms

Apr 29 18:18:32 raspberrypi systemd[1]: Dependency failed for Docker Application Container Engine.
Apr 29 18:18:32 raspberrypi systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
Apr 29 18:18:37 raspberrypi systemd[1]: Dependency failed for Docker Application Container Engine.
Apr 29 18:18:37 raspberrypi systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
Apr 29 18:18:42 raspberrypi systemd[1]: Dependency failed for Docker Application Container Engine.
Apr 29 18:18:42 raspberrypi systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
Apr 29 18:18:48 raspberrypi systemd[1]: Dependency failed for Docker Application Container Engine.
Apr 29 18:18:48 raspberrypi systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.
Apr 29 18:18:53 raspberrypi systemd[1]: Dependency failed for Docker Application Container Engine.
Apr 29 18:18:53 raspberrypi systemd[1]: docker.service: Job docker.service/start failed with result 'dependency'.

So yeah, probably not related to STORJ at all. I guess I have a few days to figure out what’s up until my nodes get evicted for good… Wish me luck ^^’


pi@raspberrypi:~ $ systemctl status docker
● docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2022-04-29 18:17:36 BST; 2min 42s ago
TriggeredBy: ● docker.socket
       Docs: https://docs.docker.com
   Main PID: 633 (dockerd)
      Tasks: 9
        CPU: 1.106s
     CGroup: /system.slice/docker.service
             └─633 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock

apr 29 18:17:30 raspberrypi dockerd[633]: time="2022-04-29T18:17:30.653967919+01:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
apr 29 18:17:33 raspberrypi dockerd[633]: time="2022-04-29T18:17:33.559067695+01:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
apr 29 18:17:33 raspberrypi dockerd[633]: time="2022-04-29T18:17:33.780593473+01:00" level=warning msg="Unable to find memory controller"
apr 29 18:17:33 raspberrypi dockerd[633]: time="2022-04-29T18:17:33.781172010+01:00" level=info msg="Loading containers: start."
apr 29 18:17:35 raspberrypi dockerd[633]: time="2022-04-29T18:17:35.094639342+01:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to>
apr 29 18:17:35 raspberrypi dockerd[633]: time="2022-04-29T18:17:35.372008305+01:00" level=info msg="Loading containers: done."
apr 29 18:17:36 raspberrypi dockerd[633]: time="2022-04-29T18:17:36.852834786+01:00" level=info msg="Docker daemon" commit=87a90dc graphdriver(s)=overlay2 version=20.10.14
apr 29 18:17:36 raspberrypi dockerd[633]: time="2022-04-29T18:17:36.854742564+01:00" level=info msg="Daemon has completed initialization"
apr 29 18:17:36 raspberrypi systemd[1]: Started Docker Application Container Engine.
apr 29 18:17:37 raspberrypi dockerd[633]: time="2022-04-29T18:17:37.001955063+01:00" level=info msg="API listen on /run/docker.sock"


I just bought a used RPi 4 with 4GB RAM. What happens when you type "top"?

Hi Pac,

Where did you get Docker from?

What’s the output of uname -a?

What does docker version report?

From those errors, I’m assuming you’re on kernel 5.15.32+ and that you pulled Docker from the official Docker site?

I’ve had a complete nightmare with the official Docker packages, so I reverted to the official Raspbian Docker release at .10+df, and it’s good. You might find that you also need to revert your Docker version.


I haven’t seen anything like this, but I’m also not running Raspberry Pi OS; I’m running Ubuntu Server 64-bit.


Maybe you should replace the SD card. I experienced similarly weird behaviour in the past when an SD card was dying.


Well, right now not much because nothing’s running on the RPi, as even docker is KO:

top - 23:44:44 up  5:27,  1 user,  load average: 0.14, 0.27, 0.26
Tasks: 153 total,   2 running, 151 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.4 us,  1.2 sy,  0.0 ni, 97.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   3844.4 total,   1884.5 free,     81.2 used,   1878.7 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.   3686.8 avail Mem

But lately the load average has usually been around 3 (plus ~20% iowait) because my SMR disks are having a hard time keeping up, and memory usage has been around 600MB.


Got docker like so:

curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

Other info:

pac@raspberrypi:~ $ uname -a
Linux raspberrypi 5.15.32-v8+ #1538 SMP PREEMPT Thu Mar 31 19:40:39 BST 2022 aarch64 GNU/Linux

pac@raspberrypi:~ $ docker version
Client: Docker Engine - Community
 Version:           20.10.14
 API version:       1.41
 Go version:        go1.16.15
 Git commit:        a224086
 Built:             Thu Mar 24 01:47:24 2022
 OS/Arch:           linux/arm64
 Context:           default
 Experimental:      true
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/version": dial unix /var/run/docker.sock: connect: permission denied

Running sudo docker version fails (it hangs forever) because Docker is broken right now.


Maybe I should try out ubuntu: Is your RPi 4B stable with Ubuntu server?


The RPi boots from a 1.8" HDD right now, but it used to be on an SD card. Corrupted files could definitely be the root cause though, I’ll start a full check on all disks right now.
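
To look for hints of a failing OS disk, one option is to count storage-related error lines in the syslog. This is a minimal sketch: the function name count_storage_errors is made up for illustration, and the pattern list is an assumption covering a few common kernel messages, not an exhaustive set.

```shell
#!/bin/sh
# Count log lines that match a few common storage-error patterns.
# The patterns are illustrative assumptions; real kernels emit many more variants.
count_storage_errors() {
    # $1: path to a log file, for example /var/log/syslog
    # "|| true" keeps the exit status at 0 when grep finds no match (it still prints 0).
    grep -c -i -E 'I/O error|EXT4-fs error|mmc[0-9]+: .*error' "$1" || true
}

# Example (reading the syslog usually needs root):
#   sudo sh -c '. ./this_script.sh; count_storage_errors /var/log/syslog'
```

A non-zero count does not prove the disk is dying, but it is a reason to look at the matching lines more closely.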

How did you do that precisely? :slight_smile:

Hi Pac,

So Raspberry Pi OS 64 (from your uname) is currently NOT compatible with the latest docker.com release; the command you ran would have bricked the Pi OS. You could try an apt remove docker.io, then remove the docker.com repos from the /etc/apt/ list, but I think the damage is done.

So, the quickest thing to do would be to reburn the image using the RPi Imager (I pick other OS, then Raspberry Pi 64 Lite). Download the tool from:

https://www.raspberrypi.com/software/

Then boot…

apt update
apt install chrony     # better NTP daemon, but up to you
apt install lvm2
apt upgrade
rpi-eeprom-update -d -a  # this will flash the latest firmware; you would be amazed how many people are still on really old versions :D
reboot

apt install docker.io

optional…

you can manually set the IP address by editing /etc/dhcpcd.conf

you can disable IPv6 by editing /etc/sysctl.conf and adding this to the end:

#Disable IPv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
That should be the basics.

When you run docker version now, you will be on the +dfsg1 version. Reconnect your USB disks, mount the Storj data and identity, and fire up the nodes - should all work fine now ^^
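
A quick way to tell which build is installed is to look for the +dfsg marker in the version string. This is a minimal sketch: the function name docker_flavor is made up for illustration, and it assumes that only Debian-built packages carry that marker, as in the version strings seen in this thread.

```shell
#!/bin/sh
# Classify a Docker version string by its suffix.
# Assumption: distro-built packages contain "+dfsg"; docker.com builds do not.
docker_flavor() {
    case "$1" in
        *+dfsg*) echo "distro" ;;
        *)       echo "upstream" ;;
    esac
}

# Example:
#   docker_flavor "$(docker version --format '{{.Client.Version}}')"
```

With the versions posted here, 20.10.5+dfsg1 would be reported as distro and 20.10.14 as upstream.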

– hope that helps

You can always burn the Ubuntu 20.04 LTS raspi image - it works ok… but the kernel is old, and many fixes from upstream are missing… it's still on 5.4, and even with the hardware enablement stack switched on there is no RPi 5.10 kernel on 20.04 LTS (or there wasn’t last time I checked :smiley: )


Yes, it’s been rock solid; I think around 100 days of uptime now.
image


@CutieePie Thank you for your extensive reply. I shall try that! :slight_smile: :+1:

Why do I want LVM?

Because LVM2, combining an SSD with an SMR drive through an LVMcache set to 1GB, reduces filewalker to ~40 mins per TB instead of 4-5 hours per TB. I’ve been testing for the last 6 months, and with a 1GB LVMcache (I’ve tried various sizes but this is optimal) I see around a 50% hit rate on read / write…

I keep meaning to post a guide…
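
For reference, here is a rough sketch of how such a cache could be attached with standard LVM commands. It is not necessarily the exact setup described above: the volume group, logical volume and device names (storj_vg, node_data, node_cache, /dev/sdX1) are hypothetical placeholders, and it assumes the node data already lives on a logical volume in that volume group. With DRY_RUN=1 (the default here) the script only prints the commands.

```shell
#!/bin/sh
# Sketch: attach a 1 GB SSD-backed cache to an existing logical volume.
# All names below are hypothetical placeholders, not taken from the thread.
VG=storj_vg            # volume group that already holds the SMR-backed LV
DATA_LV=node_data      # logical volume with the node data
SSD_PART=/dev/sdX1     # SSD partition to use for the cache
DRY_RUN=${DRY_RUN:-1}  # 1 = only print the commands, anything else = execute

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

attach_cache() {
    # Register the SSD partition with LVM and add it to the volume group.
    run pvcreate "$SSD_PART"
    run vgextend "$VG" "$SSD_PART"
    # Create a 1 GB cache pool that lives on the SSD only.
    run lvcreate --type cache-pool -L 1G -n node_cache "$VG" "$SSD_PART"
    # Attach the cache pool to the data volume.
    run lvconvert --type cache --cachepool "$VG/node_cache" "$VG/$DATA_LV"
}

attach_cache
```

Review the printed commands against your own device names before running with DRY_RUN=0 as root; pointing pvcreate at the wrong partition destroys its contents.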

That seems like a bit of overkill to run on an RPi 4 though. I've been running 2 SMR drives for over 3 years, and once they're full they really don't have a lot of issues. It 100% hasn't ever taken 4-5 hours per TB for filewalker for me.
image


You can just look at my success rate for my RPi 4 on an SMR drive.

Tried that, but it seems there’s still something terribly wrong with docker on my setup (I did not try to reinstall the whole machine yet, I’m accessing my RPi remotely for now… I’ll have access to it in 1-2 days). The docker daemon is stuck at the “activating” stage:

pac@raspberrypi:~ $ systemctl status docker.service
● docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
     Active: activating (start) since Sat 2022-04-30 23:49:11 CEST; 48s ago
TriggeredBy: ● docker.socket
       Docs: https://docs.docker.com
   Main PID: 1090 (dockerd)
      Tasks: 9
        CPU: 218ms
     CGroup: /system.slice/docker.service
             └─1090 /usr/sbin/dockerd -H fd://

Apr 30 23:49:27 raspberrypi dockerd[1090]: time="2022-04-30T23:49:27.632019550+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error:>
Apr 30 23:49:30 raspberrypi dockerd[1090]: time="2022-04-30T23:49:30.702200193+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error:>
Apr 30 23:49:33 raspberrypi dockerd[1090]: time="2022-04-30T23:49:33.333135009+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error:>
Apr 30 23:49:36 raspberrypi dockerd[1090]: time="2022-04-30T23:49:36.649769825+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error:>
Apr 30 23:49:39 raspberrypi dockerd[1090]: time="2022-04-30T23:49:39.728018387+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error:>
Apr 30 23:49:46 raspberrypi dockerd[1090]: time="2022-04-30T23:49:46.684254644+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error:>
Apr 30 23:49:49 raspberrypi dockerd[1090]: time="2022-04-30T23:49:49.545991129+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error:>
Apr 30 23:49:52 raspberrypi dockerd[1090]: time="2022-04-30T23:49:52.999173128+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error:>
Apr 30 23:49:55 raspberrypi dockerd[1090]: time="2022-04-30T23:49:55.876514475+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error:>
Apr 30 23:49:59 raspberrypi dockerd[1090]: time="2022-04-30T23:49:59.002414666+02:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error:>

All docker commands that I issue just hang forever with no effect (sudo docker ps for instance).
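
The warnings above suggest that dockerd is waiting on containerd rather than failing on its own, so the containerd unit is probably the next place to look. Here is a small sketch that counts these warnings in a log stream; the helper name count_containerd_failures is an assumption for illustration.

```shell
#!/bin/sh
# Count dockerd log lines that report a failed connection to containerd's socket.
# Reads log text on stdin.
count_containerd_failures() {
    # -F matches the text literally; "|| true" keeps the exit status at 0 on no match.
    grep -c -F 'failed to connect to {unix:///run/containerd/containerd.sock' || true
}

# Example:
#   journalctl -u docker.service --no-pager | count_containerd_failures
# If the count keeps growing, inspect containerd itself:
#   systemctl status containerd.service
#   journalctl -u containerd.service -n 50 --no-pager
```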

I’ll probably have to reinstall everything. That’s so weird. Maybe my /boot disk is failing, I dunno :confused:

(All my Storj disks seem OK though, checked all of them with badblocks and fsck)

ok :frowning:

If you don't rebuild, you can remove docker official, then clean apt, which will make it re-attach to the Raspberry Pi OS branch:

apt remove docker docker-engine docker.io containerd runc

** Be careful with this: you need to delete only the docker folders under /var/lib
rm -rf /var/lib/docker/
rm -rf /var/lib/containerd/

mv /etc/apt/sources.list.d/docker.list ~/docker.list.old

apt update

apt clean

apt install docker.io


Did all that. Here is what it says when installing docker.io:

[...]
Setting up runc (1.0.0~rc93+ds1-5+b2) ...
Setting up containerd (1.4.13~ds1-1~deb11u1) ...
Setting up docker.io (20.10.5+dfsg1-1+deb11u1) ...
Job for docker.service failed because the control process exited with error code.
See "systemctl status docker.service" and "journalctl -xe" for details.
Processing triggers for man-db (2.9.4-2) ...
Scanning processes...                                                        Scanning processor microcode...                                              Scanning linux images...
[...]

So, as instructed I type systemctl status docker.service and this time the service does appear as “active”, but shows the same errors.

I tried to start one of my nodes anyway just in case, and here is the result:

docker: Error response from daemon: error creating temporary lease: connection error: desc = "transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused": unavailable.
See 'docker run --help'

It’s late where I am… I’ll keep investigating later, look up errors on the Internet etc. and probably reinstall the whole thing from scratch.
Thanks for your support in any case :slight_smile:

Just came across the following and decided to give it a go before going to sleep:

@CutieePie I followed your steps again, with the additional command from the above link:
sudo rm -fr /var/lib/containerd/

Now docker is finally up and running correctly!
I restarted all my nodes, so far so good… It’s way too soon to tell whether the problem is solved or not, only time will tell ^^’
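
A more cautious variant of that last step is to rename the directory instead of deleting it, so the old state can still be inspected later. This is only a sketch under assumptions: the panic in the first post comes from bbolt, which containerd uses for its metadata, so a corrupted database under /var/lib/containerd is a plausible (not confirmed) cause. The thread also does not confirm whether existing containers survive this, so expect to recreate them. With DRY_RUN=1 (the default here) the script only prints the commands.

```shell
#!/bin/sh
# Sketch: move containerd's state directory aside instead of deleting it.
# Assumption: the directory holds the metadata database that triggered the panic.
STATE_DIR=/var/lib/containerd
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, anything else = execute

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reset_containerd_state() {
    # Timestamped name so repeated runs do not overwrite an earlier backup.
    backup="$STATE_DIR.broken.$(date +%Y%m%d%H%M%S)"
    run systemctl stop docker.service docker.socket containerd.service
    run mv "$STATE_DIR" "$backup"
    run systemctl start containerd.service docker.service
}

reset_containerd_state
```

Run it as root with DRY_RUN=0 once the printed commands look right, and delete the renamed directory after the nodes have been stable for a while.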

I’ll report back later!


This is good to hear :slight_smile: I’ve edited my post to add your update. Very weird that containerd needed deleting as well; such are the mysteries of docker :nerd_face:

Pleased your nodes are back online now.

CP


Follow up: after fixing docker, my nodes were back online, and I took the time to make a clean and fresh install of Raspbian 64 on the previously used SD card.

I re-setup all my nodes, and they have been up and running without an issue for 3+ days now. My gut (+ some suspicious I/O related syslogs and hints on some forums… aaaand @striker43 :wink:) tells me that the issues I faced are likely to have been related to OS disk failures, which can lead to many random and incongruous problems…

So I decided to re-install the OS soon on a more robust and dedicated SSD when I have some spare time, as experience (and The Internet) shows that SD cards are definitely not a good long-term solution and eventually make the whole system fail.

Thank you all for your ideas and support! :smiley: :+1:
