Node goes offline, QUIC misconfigured, and restarts

Hi there! I'm having trouble with one node: it reports QUIC misconfigured from time to time. It has only happened since the installation of the new Storj version, and after some time the node goes offline. Here is my config:

sudo docker run -d --restart always --stop-timeout 300 -p 28968:28967/tcp -p 28968:28967/udp -p 127.0.0.1:14003:14002

Here are some pictures of the node and the router port forwarding.

Running some commands here… On the Pi I get this…


My question is: should UDP ports be "listening"? If so, which ones should be open (listening)?

Which Linux are you using on the Pi?

storagewars@raspberrypi:~ $ uname -a
Linux raspberrypi 6.1.21-v8+ #1642 SMP PREEMPT Mon Apr 3 17:24:16 BST 2023 aarch64 GNU/Linux

Maybe this helps with QUIC?

Regarding the node restarts, check the logs for a FATAL error.
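
A minimal sketch of pulling the fatal lines out of the log, assuming the container is named storagenode (the name is not given in this thread) and that the node logs to Docker's default log output rather than to a file. The helper name fatal_lines is made up for illustration.

```shell
# Hypothetical helper: keep only log lines that mention "fatal"
# (case-insensitive) and show the last N of them (default 20).
# Reads the log from stdin.
fatal_lines() {
  grep -i 'fatal' | tail -n "${1:-20}"
}

# Usage (assumed container name "storagenode"):
#   docker logs storagenode 2>&1 | fatal_lines 10
```

If nothing comes back, the restart reason may be in an earlier part of the log or in a redirected log file.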

I have a dynamic IP! I'm reading your post, and it looks like the server you mention has a static one. I'm going to dig deeper!

When I list the ports with the command ss -lntu, all UDP ports show as unconnected (UNCONN). Should they stay that way?
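
For what it's worth, UDP is connectionless, so ss reports bound UDP sockets in state UNCONN rather than LISTEN; that part is normal. Below is a rough sketch for checking that something is bound to the node's UDP port on the host. It assumes host port 28968 from the docker run line above and that Docker's userland proxy is enabled (otherwise the published port may not appear in ss at all). The helper name udp_bound is hypothetical.

```shell
# Hypothetical helper: succeed if the ss output on stdin contains an
# UNCONN (UDP) socket whose local address ends in the given port.
udp_bound() {
  grep -Eq "UNCONN[[:space:]].*:${1}[[:space:]]"
}

# Usage (port taken from the docker run line in this thread):
#   ss -lun | udp_bound 28968 && echo "UDP 28968 is bound" || echo "nothing bound on UDP 28968"
```

This only shows that the port is bound locally; it says nothing about whether the router forwards UDP to it.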

I can only pass that question to @Alexey

Maybe @arrogantrabbit can help here a bit?

The funny thing is that I was looking in the wrong log… The correct one shows this:
2023-11-09T11:13:43Z WARN collector file does not exist {"process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "I3TMUORF3G5FXM2NURZLESQU46C7WPRNOHB5YVOBIAKJ64NKDSDA"}
2023-11-09T11:13:43Z ERROR collector unable to delete expired piece info from DB {"process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "I3TMUORF3G5FXM2NURZLESQU46C7WPRNOHB5YVOBIAKJ64NKDSDA", "error": "pieces error: database disk image is malformed", "errorVerbose": "pieces error: database disk image is malformed\n\tstorj.io/storj/storagenode/pieces.(*Store).DeleteExpired:365\n\tstorj.io/storj/storagenode/collector.(*Service).Collect:101\n\tstorj.io/storj/storagenode/collector.(*Service).Run.func1:57\n\tstorj.io/common/sync2.(*Cycle).Run:99\n\tstorj.io/storj/storagenode/collector.(*Service).Run:53\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
One example…

I might have a corrupt file; I'm going to run a health check!
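
The "database disk image is malformed" message in the log is an SQLite error, so one possible health check is SQLite's own integrity check. This is only a sketch under assumptions: that the sqlite3 command-line tool is installed, that the node is stopped first, and that the node's *.db files sit in a storage folder under the data location (the path in the usage line is a placeholder). The helper name check_dbs is hypothetical.

```shell
# Hypothetical helper: run PRAGMA integrity_check on every *.db file
# in the given directory and report which ones pass.
check_dbs() {
  dir="$1"
  for db in "$dir"/*.db; do
    [ -e "$db" ] || continue   # no .db files at all
    result=$(sqlite3 "$db" "PRAGMA integrity_check;" 2>&1)
    if [ "$result" = "ok" ]; then
      echo "OK        $db"
    else
      echo "MALFORMED $db"
    fi
  done
}

# Usage (placeholder path, stop the node before running):
#   check_dbs /mnt/storj/storagenode/storage
```

A database that does not report ok needs repair or re-creation; that step is not covered here.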

Or you can use

Working! Health checker was a superhero… :sweat_smile:

Thank you!

You should ignore the misconfigured QUIC status.

Does QUIC work after you fixed the filesystem and the database?

24 h without a break! I think I can say it's fixed!

Hi… long time no see! One disk is broken ("ticking needle") and I had to replace it… all data on that drive is gone! Now the node keeps restarting. Here is the log:

Did the drive contain data? If you lost a significant amount of data you will likely want to start over with a new identity and install.

For the cache error

For Docker: stop and remove the container, delete the file trust-cache.json from the data location, then run the container again.
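
Those Docker steps could look roughly like this. It is a sketch, not an official procedure: the container name and the data path are placeholders, since neither appears in this thread, and the helper name reset_trust_cache is made up. The 300 second stop timeout mirrors the --stop-timeout 300 from the docker run line earlier in the thread.

```shell
# Hypothetical helper.
# $1 = container name, $2 = host path of the node's data location.
reset_trust_cache() {
  container="$1"
  data_dir="$2"
  docker stop -t 300 "$container" || return 1   # give the node time to shut down
  docker rm "$container" || return 1
  rm -f "$data_dir/trust-cache.json"            # re-downloaded on next start
}

# Usage (both values are placeholders):
#   reset_trust_cache storagenode /mnt/storj/storagenode
#   ...then start the node again with your usual docker run command.
```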
However, if

then you need to start over: remove the old identity, clear the node's data from the disk, generate a new identity (identity create storagenode), sign it with a new authorization token, and set up the node: Storage Node - Storj Docs
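
A rough sketch of the identity part of starting over. The identity create and identity authorize commands are the Storj identity tool's; everything else is an assumption: the helper name identity_signed is hypothetical, the default Linux identity path is used in the usage lines, and the expected counts (2 BEGIN blocks in ca.cert, 3 in identity.cert for a signed identity) are as I remember them from the Storj docs, so double-check there.

```shell
# Hypothetical helper: $1 = identity directory.
# Assumption: a signed identity has 2 "BEGIN" blocks in ca.cert
# and 3 in identity.cert.
identity_signed() {
  dir="$1"
  [ "$(grep -c BEGIN "$dir/ca.cert" 2>/dev/null)" = "2" ] &&
    [ "$(grep -c BEGIN "$dir/identity.cert" 2>/dev/null)" = "3" ]
}

# Usage (the token is a placeholder, keep yours private):
#   identity create storagenode
#   identity authorize storagenode <email:token>
#   identity_signed ~/.local/share/storj/identity/storagenode && echo "identity is signed"
```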

The disk is empty now! Where do I find trust-cache.json for that disk? I'm using find -name trust-cache.json (found it on the good one, will leave it as it is). I'm creating a new identity! It will take a while!

Every time you lose data or wipe the disk, you should start over.
Did you generate a new identity after you cleaned the disk?
If not, you must remove the current identity and generate a new one, then sign it with a new authorization token and set up the node.

This file will be downloaded automatically when you start the node. For the Docker version it will be placed next to config.yaml in the data location. Since it is downloaded automatically, you should not remove it unless you see an error related to it.

I have a typo here… let me screenshot this!


How do I solve it?