Nodes going offline periodically

This week I have seen different nodes going offline at different times. Sometimes it is a Windows GUI node and sometimes an Ubuntu Docker node. I have 6 nodes and each has had the problem at least once this week, some more than once. Restarting the router appears to fix the problem, even though the network shows there is internet access. Until this week my nodes had been fine for months to years. Now I must check them several times a day. Is there a problem with my router? It is not quite 5 years old. What should I check?

Internet is 300 by 20
Netgear N300 WIFI Gigabit Router WNR3500L

Do you have any log entries when you search for “ping”?

I have had a few weird connectivity issues that were the fault of DDNS. I grabbed a Cloudflare DDNS Docker container and used that, and it fixed it for me.

Just a thought, good luck!

The PCs were pingable at the time the nodes were offline, so I don’t think it was a PC issue.
I use No-IP DDNS. It worked fine in the past.

I could try switching to a static IP as my ISP has not changed my public IP for 7 months.
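
For anyone wanting to rule DDNS in or out, here is a minimal sketch that checks whether a DDNS hostname still resolves to the current public IP. The function name `ddns_is_current` and the hostname in the usage comment are made up for illustration; the public IP has to come from the router's status page or a similar source.

```python
import socket
from typing import Callable


def ddns_is_current(
    hostname: str,
    public_ip: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> bool:
    """Return True when the DDNS hostname resolves to the given public IPv4 address."""
    try:
        return resolve(hostname) == public_ip
    except OSError:
        # socket.gaierror is a subclass of OSError: the name did not resolve at all.
        return False


# Hypothetical usage (hostname is a placeholder, not a real node address):
# ddns_is_current("mynode.example.net", "149.75.178.72")
```

If this returns False while the node is offline, the DDNS record is stale and the satellites are contacting an old address.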

He doesn’t mean a ping from PC to PC but rather the ping from the satellite. Check your log for the keyword “ping”.

Just a bunch of “ping satellite failed” error messages (attempts: 8).

Please search for “ping satellite failed” entries, excluding those that contain “rate limit”. They will show the reason why your node is considered offline.
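
A rough sketch of that search in Python, for nodes where the log is redirected to a file. The function name `failed_pings` and the sample lines are illustrative only; the two search strings are the ones mentioned above.

```python
from typing import Iterable, List


def failed_pings(lines: Iterable[str]) -> List[str]:
    """Keep lines reporting a failed satellite ping, skipping rate-limit entries."""
    return [
        line.rstrip("\n")
        for line in lines
        if "ping satellite failed" in line and "rate limit" not in line
    ]


# Made-up sample lines, shortened for illustration.
sample_log = [
    'ERROR contact:service ping satellite failed {"error": "connection timed out"}\n',
    'ERROR contact:service ping satellite failed {"error": "rate limit exceeded"}\n',
    "INFO piecestore uploaded\n",
]

for entry in failed_pings(sample_log):
    print(entry)
```

To run it against a real log, pass an open file object instead of `sample_log`; the function only needs an iterable of lines.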

They all look like this:
2023-08-03T18:52:45-05:00 ERROR contact:service ping satellite failed {"Satellite ID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo", "attempts": 3, "error": "ping satellite: failed to ping storage node, your node indicated error code: 0, rpc: tcp connector failed: rpc: dial tcp 149.75.178.72:46013: connect: connection timed out", "errorVerbose": "ping satellite: failed to ping storage node, your node indicated error code: 0, rpc: tcp connector failed: rpc: dial tcp 149.75.178.72:46013: connect: connection timed out\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:209\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:157\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:160\n\tstorj.io/common/sync2.(*Cycle).Start.func1:77\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
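
To make entries like this easier to read, the JSON part can be pulled apart. This is only a sketch: it assumes the raw log file uses plain straight quotes (the forum may display them as typographic quotes), the helper name `ping_failure_reason` is made up, and the sample below is the entry above with the `errorVerbose` field left out.

```python
import json
from typing import Optional


def ping_failure_reason(line: str) -> Optional[dict]:
    """Extract satellite ID, attempt count and error text from one log entry."""
    start = line.find("{")
    if start == -1:
        return None  # no JSON payload on this line
    try:
        fields = json.loads(line[start:])
    except json.JSONDecodeError:
        return None  # payload is not valid JSON (e.g. typographic quotes)
    return {
        "satellite": fields.get("Satellite ID"),
        "attempts": fields.get("attempts"),
        "error": fields.get("error"),
    }


sample = (
    "2023-08-03T18:52:45-05:00 ERROR contact:service ping satellite failed "
    '{"Satellite ID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo", '
    '"attempts": 3, "error": "ping satellite: failed to ping storage node, '
    "your node indicated error code: 0, rpc: tcp connector failed: rpc: "
    'dial tcp 149.75.178.72:46013: connect: connection timed out"}'
)

print(ping_failure_reason(sample))
```

The tail of the `error` field ("connection timed out") is the part that says the satellite could not open a TCP connection to the node's external address and port.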

So is the issue the connection timeout?

Access to that satellite seems to be blocked. Can you check whether your ISP has put you behind CGNAT?
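
One way to check, sketched in Python: look at the WAN address shown on the router's status page and see which range it falls in. Addresses in 100.64.0.0/10 are the shared space reserved for carrier-grade NAT (RFC 6598). The function name `classify_wan_address` is just for illustration.

```python
import ipaddress

# Shared address space reserved for carrier-grade NAT (RFC 6598).
CGNAT_NET = ipaddress.ip_network("100.64.0.0/10")


def classify_wan_address(wan_ip: str) -> str:
    """Classify the router's WAN address as 'cgnat', 'private' or 'public'."""
    addr = ipaddress.ip_address(wan_ip)
    if addr in CGNAT_NET:
        return "cgnat"
    if addr.is_private:
        return "private"  # e.g. 192.168.x.x: another NAT device sits upstream
    return "public"


for ip in ("100.72.10.5", "192.168.1.1", "149.75.178.72"):
    print(ip, classify_wan_address(ip))
```

A "public" result that also matches the address seen from outside suggests there is no CGNAT in the path. A "cgnat" or "private" result means port forwarding on the router alone cannot make the node reachable.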

Your node is accessible on http://149.75.178.72:46013 right now, so it seems likely to be a problem with your router.
Make sure that it doesn’t have any throttling “protection” such as DDoS prevention, grey lists or similar features enabled, and that your firewall doesn’t block incoming or outgoing traffic to or from your node.
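
A quick way to repeat this reachability check when a node next shows as offline is a plain TCP connect, sketched below. The helper name `port_is_open` is made up. The check is best run from outside your own network, because a test from inside the LAN against the public IP depends on the router supporting NAT loopback.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        return False


# Usage with the address and port from the log entry in this thread:
# port_is_open("149.75.178.72", 46013)
```

If this returns False from an outside machine while the node process is running, the router or firewall is dropping the connection before it reaches the node.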

Checked the router. “Disable Port Scan and DoS Protection” was not checked. However, the only thing that changed from the past was a new app running on one PC, Spacemesh. I will turn that app off and see if that helps. Otherwise I will turn off port scan and DoS protection.

Thanks everyone for your help.

It is possible that you have UPnP enabled and this program tried to forward some ports, causing the router to misbehave.
That would still be weird, though. Routers are usually able to set up port forwarding without a reset.
Port scan protection is better left enabled, though. Neither uplinks nor other nodes, including satellites and gateways, scan ports.

Yes, UPnP is turned on. I will keep port scan protection enabled. The nodes have been good for a few days now, so maybe the Spacemesh app was the problem. I will give the nodes a few months for the online score to recover before considering turning Spacemesh back on.

Disable UPnP. If applications need specific ports, forward them manually. UPnP is a security hole among other things (why have a firewall if anyone can punch a hole in it?)

OK. UPnP is now disabled.