Node randomly died

Milo123459 · April 2, 2023, 12:37pm

I keep getting this error after restarting my Raspberry Pi, which is running 2 nodes:
2023-04-02T12:36:46.894213513Z 2023-04-02T12:36:46.893Z ERROR contact:service ping satellite failed {"Process": "storagenode", "Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "attempts": 6, "error": "ping satellite: check-in ratelimit: node rate limited by id", "errorVerbose": "ping satellite: check-in ratelimit: node rate limited by id\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:101\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:99\n\tstorj.io/common/sync2.(*Cycle).Start.func1:77\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}

revyte · April 2, 2023, 12:40pm

You are rate-limited. This happens if you restart too often in a short period of time. It should fix itself after a few minutes.

Milo123459 · April 2, 2023, 12:41pm

Well, the thing is, this just randomly happened and the node had 140 hours of uptime…

Milo123459 · April 2, 2023, 12:44pm

After some digging, my internet router has decided to factory reset itself on its own. I’ve added the port forwarding rules back, so sorry for making a post. I didn’t expect it to happen randomly.

Edit: I’m still being rate limited. How long should I expect this to last?

Alexey · April 2, 2023, 12:48pm

As @revyte said, if your node tries to check in on the satellite too often, it will be throttled.
However, frequent restarts are only one of the causes. More often it happens when your node is offline, e.g. your external address has changed but the node is still trying to check in with the old one, so the satellite is not able to contact the node.
So please search for previous error messages like “ping satellite failed”. They would indicate that your node is not reachable from the outside.
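
A minimal sketch of that log search, in case it helps. It assumes each log line ends with a JSON payload that uses plain double quotes, as in the error posted above; the helper name `find_ping_failures` is made up for illustration and is not part of the storagenode software.

```python
import json
import re

# Assumption: the structured payload is the trailing {...} part of the line.
_PAYLOAD = re.compile(r"\{.*\}\s*$")


def find_ping_failures(lines):
    """Return (timestamp, error) pairs for 'ping satellite failed' log lines."""
    failures = []
    for line in lines:
        if "ping satellite failed" not in line:
            continue
        parts = line.split()
        timestamp = parts[0] if parts else ""
        error = ""
        match = _PAYLOAD.search(line)
        if match:
            try:
                error = json.loads(match.group(0)).get("error", "")
            except json.JSONDecodeError:
                error = ""  # payload was not valid JSON; still report the line
        failures.append((timestamp, error))
    return failures
```

You could feed it the saved output of `docker logs` for the node's container, one line per list entry. Repeated entries that appear before the rate-limit message suggest the node was unreachable first.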

In this case you need to check your ADDRESS option. If you used an IP, you may need to update it because your ISP has assigned you a new one, or you can use a DDNS hostname instead of an IP.
If you use a DDNS hostname, make sure that the updater is configured on your router or that the DDNS updater application is running. If it is, check that your DDNS subscription has not expired.
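
A rough sketch of the DDNS check, assuming you already know your current public IP (for example from your router's status page). The function names and the hostname in the usage note are hypothetical; only the standard library is used.

```python
import socket


def resolve_ipv4(hostname):
    """Resolve a hostname to its IPv4 addresses using the system resolver."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def address_is_current(hostname, public_ip, resolver=resolve_ipv4):
    """Return True if the hostname currently resolves to the given public IP."""
    return public_ip in resolver(hostname)
```

For example, `address_is_current("mynode.example.net", "203.0.113.7")` returning `False` would suggest the DDNS record is stale and the updater is not doing its job.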

Milo123459 · April 2, 2023, 12:49pm

Yeah, thank you. I’ve managed to recover one of the nodes and get QUIC working again, but applying the same configuration to another node and port forwarding isn’t working for that one. Not sure what to do but thank you for the help!

Alexey · April 2, 2023, 12:49pm

Please post your port forwarding rules here for both nodes and your docker run command (you may mask the private info).

Milo123459 · April 2, 2023, 12:54pm

That is the screenshot of the port forwarding rules.
I have checked with the Open Port Check Tool (Test Port Forwarding on Your Router), and both ports are forwarded.
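
For reference, this kind of check can be sketched in a few lines. This is only an illustration, not what the online tool actually does: it has to be run from a machine outside the home network to say anything about port forwarding, and it covers TCP only, so it says nothing about UDP/QUIC. The function name `tcp_port_open` is made up.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling it with the external address and ports 28967 and 28968 would mirror the check above.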

That is Node 1, which has QUIC misconfigured (28967)

That is Node 2, which has QUIC configured (28968)

The docker-compose file is not the issue, as I can see that the port is being forwarded with both protocols:
- 28967:28967/tcp
- 28967:28967/udp

There are no errors in the logs for storagenode1 either.

Milo123459 · April 2, 2023, 1:20pm

It seems to be fine now. I’m not sure what’s going on, but I can tell you that this is unbelievably confusing lol

Alexey · April 2, 2023, 1:35pm

The satellite checks UDP (QUIC) availability on each check-in of the node (every hour by default), so it seems to be working now.

Milo123459 · April 2, 2023, 7:01pm

Didn’t know, thanks for informing me! PS: All my nodes are good now.