Your Node is Suspended - saltlake

Hi Team,

I just received an email saying I've been suspended on Saltlake on both nodes that I have in service, but I'm not sure what's going on; all other satellites are at 99%+ (see pics). Both nodes that I run have been in service since 2018-2019 without any issues, and nothing has changed config-wise. In fact, I have barely touched the nodes in the past 6 months besides the occasional OS update.

I'm not sure why this has happened. Any tips on what I should check?


Cheers

Blockmania

The suspension happened because of a low Online score; it is unrelated to suspension for a low Suspension score. @jammerdan

@Blockmania you can check when your node did not answer audit requests from this satellite with these scripts:

The reason is related to your network configuration; perhaps your firewall or ISP is blocking traffic to/from this satellite.

Results of script. Picking a few random days.

{
  "windowStart": "2022-05-18T00:00:00Z",
  "totalCount": 196,
  "onlineCount": 188
},
{
  "windowStart": "2022-05-18T12:00:00Z",
  "totalCount": 10,
  "onlineCount": 0
}

{
  "windowStart": "2022-05-24T00:00:00Z",
  "totalCount": 15,
  "onlineCount": 0
},
{
  "windowStart": "2022-05-24T12:00:00Z",
  "totalCount": 12,
  "onlineCount": 1
}
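
To make the windows above easier to compare, here is a minimal Python sketch that turns each window into an online ratio (onlineCount divided by totalCount). The function name online_ratios is hypothetical, and the sample simply wraps the four quoted windows in a JSON array. How the satellite combines these windows into the Online score is not shown in this thread, so this only illustrates the per-window ratio.

```python
import json


def online_ratios(windows):
    """Return (windowStart, onlineCount / totalCount) for each audit window.

    The ratio is None for a window with no audits, to avoid dividing by zero.
    """
    result = []
    for window in windows:
        total = window["totalCount"]
        ratio = window["onlineCount"] / total if total else None
        result.append((window["windowStart"], ratio))
    return result


# The four windows quoted in the post, wrapped in a JSON array (assumption:
# the real script output is a longer list of objects with the same keys).
sample = json.loads("""
[
  {"windowStart": "2022-05-18T00:00:00Z", "totalCount": 196, "onlineCount": 188},
  {"windowStart": "2022-05-18T12:00:00Z", "totalCount": 10,  "onlineCount": 0},
  {"windowStart": "2022-05-24T00:00:00Z", "totalCount": 15,  "onlineCount": 0},
  {"windowStart": "2022-05-24T12:00:00Z", "totalCount": 12,  "onlineCount": 1}
]
""")

for start, ratio in online_ratios(sample):
    print(start, "n/a" if ratio is None else f"{ratio:.0%}")
```

Windows where the ratio drops to or near zero are the periods in which the satellite could not reach the node.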

I’ve also noted these errors, is this a DNS issue?

2022-05-25T12:00:55.528Z ERROR contact:service ping satellite failed {"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "attempts": 1, "error": "ping satellite: check-in network: failed to resolve IP from address: storj.blockmania.network:28967, err: lookup storj.blockmania.network on 10.102.0.10:53: read udp 10.100.0.15:37356->10.102.0.10:53: i/o timeout", "errorVerbose": "ping satellite: check-in network: failed to resolve IP from address: storj.blockmania.network:28967, err: lookup storj.blockmania.network on 10.102.0.10:53: read udp 10.100.0.15:37356->10.102.0.10:53: i/o timeout\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:136\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:98\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}

Yes, it's related to DNS. Could you please try using 8.8.8.8 as your DNS server?

@Blockmania We are currently investigating a DNS issue on the Saltlake satellite.

As a workaround for you, assuming your node has a static IP address, I'd recommend configuring the External Address of your node with the actual IP address instead of the storj.blockmania.network DNS name. This way, the satellites would not need to resolve it.
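
Before switching to the IP address, it can help to confirm which IP the DNS name resolves to from your own machine. The following is a minimal Python sketch; the helper names resolve_host and check_external_address are hypothetical, and the resolver parameter exists only so the lookup can be swapped out. Note that this checks resolution from wherever the script runs, not from the satellite, so a successful result here does not rule out a failure on the satellite side.

```python
import socket


def resolve_host(address, resolver=socket.getaddrinfo):
    """Split 'host:port' and return the sorted unique IPs the host resolves to."""
    host, _, port = address.rpartition(":")
    infos = resolver(host, int(port), proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def check_external_address(address, resolver=socket.getaddrinfo):
    """Return (ips, None) on success or ([], error_message) if the lookup fails."""
    try:
        return resolve_host(address, resolver), None
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return [], str(exc)
```

Calling check_external_address("storj.blockmania.network:28967") returns the resolved IPs, which can then be compared with the node's public IP before it is written into the External Address setting.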


@kaloyan Thanks for the info. I spent a few hours checking DNS and in the end I did change the config to use a static IP.

I did wonder if the issue was external to me because of this:
lookup storj.blockmania.network on 10.102.0.10:53: read udp 10.100.0.15:37356->10.102.0.10:53: i/o timeout

Cheers

Blockmania

What I know from our Infra team is that one of the Saltlake satellite pods fails to resolve your DNS name. The satellite pods run in different GCP regions for high availability. The pod that fails the DNS resolution runs in the GCP us-west2 region. I am not sure what the root cause might be. It could be something wrong with Google or with your DNS provider, or something in-between.