Node turns off, QUIC misconfigured and restart

toma · November 9, 2023, 9:46am

Hi there! I’m having trouble with one node: I get “QUIC misconfigured” from time to time. It only started after the installation of the new Storj version, and after some time the node goes offline. Here is my config:

sudo docker run -d --restart always --stop-timeout 300 -p 28968:28967/tcp -p 28968:28967/udp -p 127.0.0.1:14003:14002

Here are some pics of the node and the router port forwarding

Running some commands here… On the Pi I get this…

My question is: should UDP ports be “listening”? If so, which ones should be open (listening)?

daki82 · November 9, 2023, 10:26am

Which Linux did you use for the Pi?

toma · November 9, 2023, 10:29am

storagewars@raspberrypi:~ $ uname -a
Linux raspberrypi 6.1.21-v8+ #1642 SMP PREEMPT Mon Apr 3 17:24:16 BST 2023 aarch64 GNU/Linux

daki82 · November 9, 2023, 10:40am

Maybe this helps with QUIC?

Regarding the restart of the node, try checking the logs for the fatal error.
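
A minimal sketch of that log check. It assumes the container is named storagenode (the name is not visible in the truncated docker run command above) and that each log line carries a severity word such as ERROR or FATAL after the timestamp, as in the storagenode lines quoted later in this thread. The helper name severe_lines is made up for illustration.

```shell
#!/usr/bin/env bash
# severe_lines: read log text on stdin and keep only ERROR/FATAL lines.
# Hypothetical helper; the severity words match the log lines quoted
# in this thread.
severe_lines() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -E '[[:space:]](ERROR|FATAL)[[:space:]]' || true
}

# Typical use (container name "storagenode" is an assumption):
#   docker logs --tail 1000 storagenode 2>&1 | severe_lines
```

The filter reads from stdin, so it works the same on a saved log file as on live docker output.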

toma · November 9, 2023, 11:01am

I have a dynamic IP! I’m reading your post, and it looks like the server you are mentioning has a static one! I’m going to dig deeper!

toma · November 9, 2023, 11:11am

When I list all UDP ports with the command ss -lntu, they all show as unconnected (UNCONN)! Should they stay that way?
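
For reference: UDP is connectionless, so ss reports a bound UDP socket as UNCONN rather than LISTEN, and that state on its own is not an error. Below is a small sketch for checking whether a given UDP port appears in ss output. The function name udp_port_bound is hypothetical, and it assumes the column layout ss prints when both -t and -u are given (Netid first, Local Address:Port fifth).

```shell
#!/usr/bin/env bash
# udp_port_bound PORT: read `ss -lntu` output on stdin and succeed if a UDP
# socket is bound to PORT (whatever its local address is).
udp_port_bound() {
    awk -v port="$1" '
        $1 == "udp" {
            n = split($5, parts, ":")      # local address:port is column 5
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Typical use on the host, with the published port from the docker command:
#   ss -lntu | udp_port_bound 28968 && echo "UDP 28968 is bound"
```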

daki82 · November 9, 2023, 11:18am

I can only pass that question to @Alexey

daki82 · November 9, 2023, 11:22am

Maybe @arrogantrabbit can help here a bit?

toma · November 9, 2023, 11:26am

Funny thing is that I was searching in the wrong log… The correct one shows this:

2023-11-09T11:13:43Z WARN collector file does not exist {"process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "I3TMUORF3G5FXM2NURZLESQU46C7WPRNOHB5YVOBIAKJ64NKDSDA"}
2023-11-09T11:13:43Z ERROR collector unable to delete expired piece info from DB {"process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "I3TMUORF3G5FXM2NURZLESQU46C7WPRNOHB5YVOBIAKJ64NKDSDA", "error": "pieces error: database disk image is malformed", "errorVerbose": "pieces error: database disk image is malformed\n\tstorj.io/storj/storagenode/pieces.(*Store).DeleteExpired:365\n\tstorj.io/storj/storagenode/collector.(*Service).Collect:101\n\tstorj.io/storj/storagenode/collector.(*Service).Run.func1:57\n\tstorj.io/common/sync2.(*Cycle).Run:99\n\tstorj.io/storj/storagenode/collector.(*Service).Run:53\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
One example…

I might have a corrupt file; going to run a health check!
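
The “database disk image is malformed” message is SQLite’s corruption error, and the storagenode keeps its bookkeeping in SQLite .db files. The thread does not record which health check was actually run; one option is SQLite’s built-in PRAGMA integrity_check, sketched below. The function name check_node_dbs and the directory argument are placeholders, the sqlite3 command-line tool is assumed to be installed, and it is safer to stop the node first so the files are not in use.

```shell
#!/usr/bin/env bash
# check_node_dbs DIR: run SQLite's integrity check on every *.db file in DIR.
# Prints one line per database; returns non-zero if any result is not "ok".
check_node_dbs() {
    local dir="$1" db result status=0
    for db in "$dir"/*.db; do
        # An unmatched glob stays literal, so test that the file exists.
        [ -e "$db" ] || { echo "no .db files in $dir"; return 0; }
        result=$(sqlite3 "$db" 'PRAGMA integrity_check;' 2>&1)
        echo "$db: $result"
        [ "$result" = "ok" ] || status=1
    done
    return "$status"
}

# Typical use (the path is a placeholder for the node's storage directory):
#   check_node_dbs /path/to/storage
```

A database that prints anything other than ok is the one to repair or recreate.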

daki82 · November 9, 2023, 11:30am

or then you can use

toma · November 9, 2023, 11:53am

Working! Health checker was a superhero…

Thank you!

arrogantrabbit · November 9, 2023, 4:01pm

You should ignore misconfigured quic.

Alexey · November 10, 2023, 4:18am

Does QUIC work after you fixed a filesystem and the database?

toma · November 10, 2023, 10:56am

24 h without a break! I think I can say that’s fixed!

toma · January 16, 2024, 8:26pm

Hi… long time no see! One disk is broken (“ticking needle”)! I had to replace it… all the info on that drive is gone! Now the node keeps rebooting! Here is the log

Knowledge · January 16, 2024, 8:39pm

Did the drive contain data? If you lost a significant amount of data you will likely want to start over with a new identity and install.

For the cache error

Alexey · January 17, 2024, 2:40am

For docker: stop and remove the container, delete the file trust-cache.json from the data location, then start the container again.
However, if

then you need to start over: remove the old identity, clean the node’s data off the disk, generate a new identity (identity create storagenode), sign it with a new authorization token and set up the node: Storage Node - Storj Docs
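
The docker steps in the first paragraph of this reply could look roughly like the sketch below. The container name storagenode and the data path are assumptions, since neither appears in full in this thread. The helper remove_trust_cache is hypothetical and only handles the file deletion.

```shell
#!/usr/bin/env bash
# remove_trust_cache DATA_DIR: delete trust-cache.json from the node's data
# location, if it is there.
remove_trust_cache() {
    local file="$1/trust-cache.json"
    if [ -f "$file" ]; then
        rm -- "$file" && echo "removed $file"
    else
        echo "not found: $file"
    fi
}

# Order of operations described above (names and path are placeholders):
#   docker stop -t 300 storagenode
#   docker rm storagenode
#   remove_trust_cache /path/to/data
#   ...then start the container again with the original docker run command.
```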

toma · January 17, 2024, 7:31pm

The disk is empty now! Where do I find trust-cache.json for that disk? I’m using find -name trust-cache.json (found it on the good one, will leave it as it is). I’m creating a new identity! It will take a while!

Alexey · January 18, 2024, 3:43am

Every time you lose data or clean the disk, you should start over.
Did you generate a new identity after you cleaned the disk?
If not, you must remove the current identity and generate a new one, then sign it with a new authorization token and set up the node.

The trust-cache.json file will be downloaded automatically when you start the node. For the docker version it will be placed next to config.yaml in the data location. But since it will be downloaded, you should not remove it unless you see an error related to it.
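
To see where the file should sit, one rough approach is to look for config.yaml and report whether trust-cache.json is beside it. The function name find_trust_cache is hypothetical, and the search root is whatever mount point holds the node’s data.

```shell
#!/usr/bin/env bash
# find_trust_cache ROOT: for every config.yaml under ROOT, report whether a
# trust-cache.json file sits in the same directory.
find_trust_cache() {
    find "$1" -type f -name config.yaml 2>/dev/null |
    while IFS= read -r cfg; do
        dir=$(dirname "$cfg")
        if [ -f "$dir/trust-cache.json" ]; then
            echo "present: $dir/trust-cache.json"
        else
            echo "missing: $dir/trust-cache.json"
        fi
    done
}

# Typical use (the mount point is a placeholder):
#   find_trust_cache /mnt/storj
```

A “missing” line for a freshly cleaned disk is expected until the node has started once.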

toma · January 18, 2024, 5:00pm

I have a typo here… let me screenshot this!

How do i solve it?