Contact: service ping satellite failed

Just saw this error for the first time: “contact: service ping satellite failed.” Any thoughts?

2020-03-20T22:43:03.869Z ERROR contact:service ping satellite failed {“Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “attempts”: 1, “error”: “ping satellite error: rpccompat: dial tcp: lookup us-central-1.tardigrade.io on 192.168.1.1:53: read udp 172.17.0.3:46061->192.168.1.1:53: i/o timeout”, “errorVerbose”: “ping satellite error: rpccompat: dial tcp: lookup us-central-1.tardigrade.io on 192.168.1.1:53: read udp 172.17.0.3:46061->192.168.1.1:53: i/o timeout\n\tstorj.io/common/rpc.Dialer.dialTransport:256\n\tstorj.io/common/rpc.Dialer.dial:233\n\tstorj.io/common/rpc.Dialer.DialAddressID:152\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:117\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:87\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:147\n\tstorj.io/common/sync2.(*Cycle).Start.func1:68\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2020-03-20T22:43:03.869Z ERROR contact:service ping satellite failed {“Satellite ID”: “118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW”, “attempts”: 1, “error”: “ping satellite error: rpccompat: dial tcp: lookup satellite.stefan-benten.de on 192.168.1.1:53: read udp 172.17.0.3:48269->192.168.1.1:53: i/o timeout”, “errorVerbose”: “ping satellite error: rpccompat: dial tcp: lookup satellite.stefan-benten.de on 192.168.1.1:53: read udp 172.17.0.3:48269->192.168.1.1:53: i/o timeout\n\tstorj.io/common/rpc.Dialer.dialTransport:256\n\tstorj.io/common/rpc.Dialer.dial:233\n\tstorj.io/common/rpc.Dialer.DialAddressID:152\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:117\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:87\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:147\n\tstorj.io/common/sync2.(*Cycle).Start.func1:68\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}

After reading the exact error, it looks like your node is failing to get a proper reply via DNS for the us-central-1 satellite and Stefan’s testing satellite. What OS are you running and do you have DNS manually configured or is it just pulled from your router via DHCP?
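In case it helps to see how I read that out of the log: the part that matters is the "lookup HOST on SERVER:53" text, which names the hostname that failed and the DNS server that was asked. Here is a rough Python sketch that pulls those pairs out of log text. The function name `failed_lookups` and the regex are just my own illustration, not anything from the storagenode software, and it assumes the DNS server shows up as an IPv4 address like it does in your log.

```python
import re

# Matches the resolver failure text seen in the log, e.g.
# "lookup us-central-1.tardigrade.io on 192.168.1.1:53: ... i/o timeout"
# Assumption: the DNS server is printed as an IPv4 address followed by a port.
LOOKUP_RE = re.compile(r"lookup (?P<host>\S+) on (?P<server>[\d.]+):(?P<port>\d+)")


def failed_lookups(log_text):
    """Return the sorted, unique (hostname, dns_server) pairs found in log_text."""
    pairs = {(m.group("host"), m.group("server")) for m in LOOKUP_RE.finditer(log_text)}
    return sorted(pairs)


# Error text copied from the first log line above.
sample = (
    "ping satellite error: rpccompat: dial tcp: lookup us-central-1.tardigrade.io "
    "on 192.168.1.1:53: read udp 172.17.0.3:46061->192.168.1.1:53: i/o timeout"
)
print(failed_lookups(sample))
# → [('us-central-1.tardigrade.io', '192.168.1.1')]
```

If every pair shows the same server (here 192.168.1.1), that points at that one resolver rather than at the satellites themselves.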

Veddy

PS: @Alexey could you please split this discussion off of the sticky post please? Thanks!

Raspbian, and I have DNS set up through No-IP with the DUC running. The node has been running for almost 8 months and I don’t recall ever seeing the error…

I meant local DNS. Dynamic DNS is great, but this is a failure on your node to resolve the IP addresses associated with satellite hostnames. What DNS servers do you have that particular RPi pointed at? From what it looks like to me, it’s pointed at your router at 192.168.1.1 rather than something like Google DNS at 8.8.8.8 and 8.8.4.4. You can try running dig us-central-1.tardigrade.io on your RPi and see if it was just a temporary failure to resolve or if it’s still acting up.
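If dig isn't installed, roughly the same check can be done with Python 3 (assuming it is on the RPi). This is only a sketch: `resolve_all` and its `resolver` parameter are names I made up for illustration, and it uses whatever DNS server the system is configured with, so it tests the same path the node uses on the host.

```python
import socket


def resolve_all(hostnames, resolver=socket.getaddrinfo):
    """Map each hostname to its sorted IP addresses, or to an error string.

    `resolver` defaults to the system resolver; it is a parameter only so the
    function can be exercised without network access.
    """
    results = {}
    for host in hostnames:
        try:
            infos = resolver(host, None)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the IP address as a string.
            results[host] = sorted({info[4][0] for info in infos})
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            results[host] = f"lookup failed: {exc}"
    return results
```

Calling `resolve_all(["us-central-1.tardigrade.io", "satellite.stefan-benten.de"])` and printing the result shows, per hostname, either a list of addresses or the lookup error. If the lookups succeed now, the earlier timeout was most likely a temporary hiccup.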

Veddy

Appreciate the assistance on this, although to be honest, I don’t know what you mean by “dig”…could you explain in layman’s terms?

Set the DNS servers manually on the router itself so your devices don’t rely on the router acting as the DNS server.

Would a DHCP reservation work just the same for the IP address?

No. Your router is failing to resolve the hostname to an IP address, so the node can’t ping the satellite. There should be a DNS setting on your router, or you can change it manually on the RPi itself in the Ethernet settings.

Okay, I’ll take a look at it then. I just screened through my logs and haven’t seen the error again. The time that it happened earlier today was right after a bandwidth rollup, although I don’t know enough to know if that was related or not.

It’s possible that your router can’t keep up with the DNS requests.

Hi,
Just wondering if you had an update. I just ran into the same problem with my node (contact:service ping satellite failed) right after a bandwidth rollup. I restarted the node and now it always gives me that error just after startup, as well as preflight:localtime unable to get satellite system time.
I just set up a No-IP DNS and it’s been running for about an hour. I restarted my router just to double check and it still gives me the same error. It looks like it can’t ping 3 servers (us central, europe west and asia east) but still manages to get traffic from the remaining 3.
Here is the log from when I restarted the node. It keeps trying to ping the servers:


I’m running my storagenode on Ubuntu with Docker. It’s about 3 weeks old and has been running perfectly until today.

Cheers,
Gab

The satellites are currently under maintenance:
https://status.tardigrade.io/incidents/slz29qswyw1t

Wow thanks a lot @peem that explains it.
I guess I took this storagenode thing a bit too seriously and started freaking out when I saw all these errors hahaha