Yep, it seems so. There is only one thing left to find out: the reason for the regular restarts of the first node…
Today there is a new kind of error:
|Time|Level|Component|Message|Details|
|---|---|---|---|---|
|2024-06-30T11:25:33Z|ERROR|services|unexpected shutdown of a runner|{Process: storagenode, name: piecestore:monitor, error: piecestore monitor: timed out after 1m0s while verifying writability of storage directory, errorVerbose: piecestore monitor: timed out after 1m0s while verifying writability of storage directory\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func2.1:175\n\tstorj.io/common/sync2.(*Cycle).Run:160\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func2:164\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78}|
|2024-06-30T11:27:16Z|ERROR|piecestore|upload failed|{Process: storagenode, Piece ID: SNAJ57UF4APURAIB2U4K6I7Q5W3GBKDR4ODWD6SOBX5CYTOLQIMA, Satellite ID: 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs, Action: PUT, Remote Address: 79.127.226.102:43904, Size: 1048576, error: manager closed: unexpected EOF, errorVerbose: manager closed: unexpected EOF\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:225\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:171\n\tstorj.io/drpc/drpcwire.(*Reader).read:68\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:113\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:229}|
It is OK if it is slow, but why does it shut down and restart again?
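To get a feel for how slow the disk really is, one could time a small write to the storage directory by hand. This is only a rough sketch and not the storagenode's own check: the function name `time_write_probe` and the 4 KiB payload are made up for illustration, and the only thing taken from the log is that the node gives up after 1m0s.

```python
import os
import tempfile
import time


def time_write_probe(directory, size=4096):
    """Write, fsync and delete a small temp file; return elapsed seconds.

    Hypothetical helper for manual testing, not the storagenode's actual
    writability check.
    """
    payload = b"\0" * size
    start = time.monotonic()
    fd, path = tempfile.mkstemp(prefix="write-probe-", dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # force the data to the disk
    finally:
        os.remove(path)  # clean up even if the write failed
    return time.monotonic() - start
```

Calling it a few times on the storage path while the node is busy, and comparing the result with the 1m0s limit from the error message, should show whether the disk stalls for that long.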
docker events shows "die":
admin@DATAHUB:/volume1/home/admin $ sudo docker events --since 2024-06-30
2024-06-30T11:03:55.610637494+03:00 network connect 0cccdaa13d5b7ca1e9fc45bc65f73bede06f9935df0b1d1b838d9443831319da (container=8acefa9f583e559a98909e0c26f8c462f594a3e61f46e46e4a3212472c860905, name=bridge, type=bridge)
2024-06-30T11:03:57.239151055+03:00 container start 8acefa9f583e559a98909e0c26f8c462f594a3e61f46e46e4a3212472c860905 (image=storjlabs/watchtower, io.storj.watchtower=true, name=watchtower)
2024-06-30T13:05:32.404215804+03:00 network connect 0cccdaa13d5b7ca1e9fc45bc65f73bede06f9935df0b1d1b838d9443831319da (container=aa554f32a1de09840c43fc66fc85cada077d397a8d4ab24e69667acacaf2bfda, name=bridge, type=bridge)
2024-06-30T13:05:34.969124229+03:00 container start aa554f32a1de09840c43fc66fc85cada077d397a8d4ab24e69667acacaf2bfda (image=storjlabs/storagenode:latest, name=storagenode)
2024-06-30T13:05:34.969146886+03:00 container restart aa554f32a1de09840c43fc66fc85cada077d397a8d4ab24e69667acacaf2bfda (image=storjlabs/storagenode:latest, name=storagenode)
2024-06-30T13:36:47.003787291+03:00 network disconnect 0cccdaa13d5b7ca1e9fc45bc65f73bede06f9935df0b1d1b838d9443831319da (container=aa554f32a1de09840c43fc66fc85cada077d397a8d4ab24e69667acacaf2bfda, name=bridge, type=bridge)
2024-06-30T13:36:47.120687588+03:00 container die aa554f32a1de09840c43fc66fc85cada077d397a8d4ab24e69667acacaf2bfda (execDuration=1870, exitCode=0, image=storjlabs/storagenode:latest, name=storagenode)
2024-06-30T13:36:48.095620100+03:00 network connect 0cccdaa13d5b7ca1e9fc45bc65f73bede06f9935df0b1d1b838d9443831319da (container=aa554f32a1de09840c43fc66fc85cada077d397a8d4ab24e69667acacaf2bfda, name=bridge, type=bridge)
2024-06-30T13:36:49.628211361+03:00 container start aa554f32a1de09840c43fc66fc85cada077d397a8d4ab24e69667acacaf2bfda (image=storjlabs/storagenode:latest, name=storagenode)
2024-06-30T14:09:17.465333210+03:00 network disconnect 0cccdaa13d5b7ca1e9fc45bc65f73bede06f9935df0b1d1b838d9443831319da (container=aa554f32a1de09840c43fc66fc85cada077d397a8d4ab24e69667acacaf2bfda, name=bridge, type=bridge)
2024-06-30T14:09:17.615534404+03:00 container die aa554f32a1de09840c43fc66fc85cada077d397a8d4ab24e69667acacaf2bfda (execDuration=1946, exitCode=0, image=storjlabs/storagenode:latest, name=storagenode)
2024-06-30T14:09:18.582641937+03:00 network connect 0cccdaa13d5b7ca1e9fc45bc65f73bede06f9935df0b1d1b838d9443831319da (container=aa554f32a1de09840c43fc66fc85cada077d397a8d4ab24e69667acacaf2bfda, name=bridge, type=bridge)
2024-06-30T14:09:19.823206505+03:00 container start aa554f32a1de09840c43fc66fc85cada077d397a8d4ab24e69667acacaf2bfda (image=storjlabs/storagenode:latest, name=storagenode)
2024-06-30T14:29:38.561572544+03:00 network disconnect 0cccdaa13d5b7ca1e9fc45bc65f73bede06f9935df0b1d1b838d9443831319da (container=aa554f32a1de09840c43fc66fc85cada077d397a8d4ab24e69667acacaf2bfda, name=bridge, type=bridge)
2024-06-30T14:29:38.712520405+03:00 container die aa554f32a1de09840c43fc66fc85cada077d397a8d4ab24e69667acacaf2bfda (execDuration=1217, exitCode=0, image=storjlabs/storagenode:latest, name=storagenode)
2024-06-30T14:29:39.748786389+03:00 network connect 0cccdaa13d5b7ca1e9fc45bc65f73bede06f9935df0b1d1b838d9443831319da (container=aa554f32a1de09840c43fc66fc85cada077d397a8d4ab24e69667acacaf2bfda, name=bridge, type=bridge)
2024-06-30T14:29:40.770352145+03:00 container start aa554f32a1de09840c43fc66fc85cada077d397a8d4ab24e69667acacaf2bfda (image=storjlabs/storagenode:latest, name=storagenode)
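The time between each start and die can be worked out from these lines and compared with execDuration. Below is a minimal sketch, assuming the events look exactly like the listing above (timestamp, object type, action, container ID, attributes in parentheses). The helper names `parse_event` and `container_runs` are made up for this example.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the listing above:
# <timestamp> container <action> <id> (key=value, key=value, ...)
EVENT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.(\d+)([+-]\d{2}:\d{2}) "
    r"container (\w+) ([0-9a-f]+) \((.*)\)$"
)


def parse_event(line):
    """Return a dict for a container event line, or None for anything else."""
    match = EVENT_RE.match(line.strip())
    if match is None:
        return None
    base, frac, tz, action, _container_id, attrs = match.groups()
    # Docker prints nanoseconds; datetime only keeps microseconds.
    micros = frac[:6].ljust(6, "0")
    when = datetime.fromisoformat(f"{base}.{micros}{tz}")
    pairs = (item.partition("=") for item in attrs.split(", "))
    return {"time": when, "action": action,
            "attrs": {key: value for key, _, value in pairs}}


def container_runs(lines, name):
    """Pair each start with the following die for the named container."""
    started = None
    runs = []
    for line in lines:
        event = parse_event(line)
        if event is None or event["attrs"].get("name") != name:
            continue
        if event["action"] == "start":
            started = event["time"]
        elif event["action"] == "die" and started is not None:
            runs.append({
                "uptime_seconds": (event["time"] - started).total_seconds(),
                "exec_duration": int(event["attrs"].get("execDuration", -1)),
                "exit_code": int(event["attrs"].get("exitCode", -1)),
            })
            started = None
    return runs
```

Every die event in the listing has exitCode=0, so the process seems to exit on its own rather than being killed, which would be consistent with the node stopping itself after the writability check timed out.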
And here are the log entries inside, counted with uniq -c:
| Count and component (or message)|Message (or details)|
|---|---|
| 5 Version is up to date|{Process: storagenode-updater, Service: storagenode}|
| 5 Version is up to date|{Process: storagenode-updater, Service: storagenode-updater}|
| 2 bandwidth|Persisting bandwidth usage cache to db|
| 2 collector|collect|
| 2 collector|error during collecting pieces: |
| 2 db.migration|Database Version|
| 2 failure during run|{Process: storagenode, error: piecestore monitor: timed out after 1m0s while verifying writability of storage directory, errorVerbose: piecestore monitor: timed out after 1m0s while verifying writability of storage directory\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func2.1:175\n\tstorj.io/common/sync2.(*Cycle).Run:160\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func2:164\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78}|
| 1 gracefulexit:chore|error retrieving satellites.|
| 8 pieces:trash|emptying trash finished|
| 8 pieces:trash|emptying trash started|
| 753 piecestore|download canceled|
| 17 piecestore|download failed|
| 2452 piecestore|download started|
| 1694 piecestore|downloaded|
| 35 piecestore|error sending hash and order limit|
| 482 piecestore|upload canceled|
| 9536 piecestore|upload canceled (race lost or node shutdown)|
| 12 piecestore|upload failed|
| 41159 piecestore|upload started|
| 31608 piecestore|uploaded|
| 1 piecestore:cache|error getting current used space: |
| 2 piecestore:monitor|Disk space is less than requested. Allocated space is|
| 2 preflight:localtime|local system clock is in sync with trusted satellites' system clock.|
| 2 preflight:localtime|start checking local system clock with trusted satellites' system clock.|
| 8 reputation:service|node scores updated|
| 2 retain|Prepared to run a Retain request.|
| 2 retain|retain pieces failed|
| 2 server|enable with: sysctl -w net.ipv4.tcp_fastopen=3|
| 2 server|kernel support for server-side tcp fast open remains disabled.|
| 1 servers|service takes long to shutdown|
| 1 servers|slow shutdown|
| 3 services|service takes long to shutdown|
| 1 services|slow shutdown|
| 1 services|unexpected shutdown of a runner|
| 2 trust|Scheduling next refresh|
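A summary like the one above can also be produced with a few lines of Python instead of sort and uniq -c. This is a minimal sketch that assumes the raw log lines are tab-separated in the order timestamp, level, component, message, details (the pipes in the pasted lines most likely stand in for those tabs, but that is an assumption). The function name `summarize` is made up for this example.

```python
from collections import Counter


def summarize(lines):
    """Count log lines per (component, message) pair.

    Assumes tab-separated fields: timestamp, level, component, message, ...
    Lines with fewer than four fields are skipped.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue
        component, message = fields[2], fields[3]
        counts[(component, message)] += 1
    return counts


def format_summary(counts):
    """Render the counts as 'count component|message' lines, sorted by key."""
    return [f"{count:7d} {component}|{message}"
            for (component, message), count in sorted(counts.items())]
```

Feeding it the lines of a saved log file, for example `format_summary(summarize(open(path)))`, gives one row per component and message, which makes it easy to see how often "upload canceled (race lost or node shutdown)" appears relative to "uploaded".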