Node stopped working - getting 403 for url https://www.storj.io/dcs-satellites

Hi! At 07.08.2021 approx at 3:00 node stopped receiving traffic.

I’ve restarted all nodes. Here is startup logs from one of nodes.

storj    | 2021-08-10T12:05:43.865Z	INFO	Telemetry enabled	{"instance ID": "12hfB361Hc5FPCmpjHc4br57HnYbXZYFbWiFTNGe8skrzNXwZ2W"}
storj    | 2021-08-10T12:05:43.888Z	INFO	db.migration	Database Version	{"version": 53}
storj    | 2021-08-10T12:05:44.112Z	WARN	trust	Failed to fetch URLs from source; used cache	{"source": "https://www.storj.io/dcs-satellites", "error": "HTTP source: \"https://www.storj.io/dcs-satellites\": unexpected status code 403: \"<!DOCTYPE html>\"", "errorVerbose": "HTTP source: \"https://www.storj.io/dcs-satellites\": unexpected status code 403: \"<!DOCTYPE html>\"\n\tstorj.io/storj/storagenode/trust.(*HTTPSource).FetchEntries:76\n\tstorj.io/storj/storagenode/trust.(*List).fetchEntries:90\n\tstorj.io/storj/storagenode/trust.(*List).FetchURLs:49\n\tstorj.io/storj/storagenode/trust.(*Pool).fetchURLs:250\n\tstorj.io/storj/storagenode/trust.(*Pool).Refresh:187\n\tstorj.io/storj/storagenode.(*Peer).Run:842\n\tmain.cmdRun:218\n\tstorj.io/private/process.cleanup.func1.4:363\n\tstorj.io/private/process.cleanup.func1:381\n\tgithub.com/spf13/cobra.(*Command).execute:852\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:960\n\tgithub.com/spf13/cobra.(*Command).Execute:897\n\tstorj.io/private/process.ExecWithCustomConfig:88\n\tstorj.io/private/process.ExecCustomDebug:70\n\tmain.main:385\n\truntime.main:204"}
storj    | 2021-08-10T12:05:44.113Z	INFO	preflight:localtime	start checking local system clock with trusted satellites' system clock.
storj    | 2021-08-10T12:06:04.114Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "error": "rpc: dial tcp 35.229.248.166:7777: i/o timeout", "errorVerbose": "rpc: dial tcp 35.229.248.166:7777: i/o timeout\n\tstorj.io/common/rpc.TCPConnector.DialContextUnencrypted:107\n\tstorj.io/common/rpc.TCPConnector.DialContext:71\n\tstorj.io/common/rpc.Dialer.dialEncryptedConn:197\n\tstorj.io/common/rpc.Dialer.DialNodeURL.func1:107\n\tstorj.io/common/rpc/rpcpool.(*Pool).get:90\n\tstorj.io/common/rpc/rpcpool.(*Pool).Get:110\n\tstorj.io/common/rpc.Dialer.dialPool:163\n\tstorj.io/common/rpc.Dialer.DialNodeURL:106\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).getSatelliteTime:110\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).Check.func1:67\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
storj    | 2021-08-10T12:06:04.115Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo", "error": "rpc: dial tcp 34.86.70.247:7777: i/o timeout", "errorVerbose": "rpc: dial tcp 34.86.70.247:7777: i/o timeout\n\tstorj.io/common/rpc.TCPConnector.DialContextUnencrypted:107\n\tstorj.io/common/rpc.TCPConnector.DialContext:71\n\tstorj.io/common/rpc.Dialer.dialEncryptedConn:197\n\tstorj.io/common/rpc.Dialer.DialNodeURL.func1:107\n\tstorj.io/common/rpc/rpcpool.(*Pool).get:90\n\tstorj.io/common/rpc/rpcpool.(*Pool).Get:110\n\tstorj.io/common/rpc.Dialer.dialPool:163\n\tstorj.io/common/rpc.Dialer.DialNodeURL:106\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).getSatelliteTime:110\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).Check.func1:67\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
storj    | 2021-08-10T12:06:04.115Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "error": "rpc: dial tcp 34.94.230.169:7777: i/o timeout", "errorVerbose": "rpc: dial tcp 34.94.230.169:7777: i/o timeout\n\tstorj.io/common/rpc.TCPConnector.DialContextUnencrypted:107\n\tstorj.io/common/rpc.TCPConnector.DialContext:71\n\tstorj.io/common/rpc.Dialer.dialEncryptedConn:197\n\tstorj.io/common/rpc.Dialer.DialNodeURL.func1:107\n\tstorj.io/common/rpc/rpcpool.(*Pool).get:90\n\tstorj.io/common/rpc/rpcpool.(*Pool).Get:110\n\tstorj.io/common/rpc.Dialer.dialPool:163\n\tstorj.io/common/rpc.Dialer.DialNodeURL:106\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).getSatelliteTime:110\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).Check.func1:67\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
storj    | 2021-08-10T12:06:04.115Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "error": "rpc: dial tcp 34.70.108.161:7777: i/o timeout", "errorVerbose": "rpc: dial tcp 34.70.108.161:7777: i/o timeout\n\tstorj.io/common/rpc.TCPConnector.DialContextUnencrypted:107\n\tstorj.io/common/rpc.TCPConnector.DialContext:71\n\tstorj.io/common/rpc.Dialer.dialEncryptedConn:197\n\tstorj.io/common/rpc.Dialer.DialNodeURL.func1:107\n\tstorj.io/common/rpc/rpcpool.(*Pool).get:90\n\tstorj.io/common/rpc/rpcpool.(*Pool).Get:110\n\tstorj.io/common/rpc.Dialer.dialPool:163\n\tstorj.io/common/rpc.Dialer.DialNodeURL:106\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).getSatelliteTime:110\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).Check.func1:67\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
storj    | 2021-08-10T12:06:04.116Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "error": "rpc: dial tcp 34.76.213.14:7777: i/o timeout", "errorVerbose": "rpc: dial tcp 34.76.213.14:7777: i/o timeout\n\tstorj.io/common/rpc.TCPConnector.DialContextUnencrypted:107\n\tstorj.io/common/rpc.TCPConnector.DialContext:71\n\tstorj.io/common/rpc.Dialer.dialEncryptedConn:197\n\tstorj.io/common/rpc.Dialer.DialNodeURL.func1:107\n\tstorj.io/common/rpc/rpcpool.(*Pool).get:90\n\tstorj.io/common/rpc/rpcpool.(*Pool).Get:110\n\tstorj.io/common/rpc.Dialer.dialPool:163\n\tstorj.io/common/rpc.Dialer.DialNodeURL:106\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).getSatelliteTime:110\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).Check.func1:67\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
storj    | 2021-08-10T12:06:04.116Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "error": "rpc: dial tcp 35.228.118.14:7777: i/o timeout", "errorVerbose": "rpc: dial tcp 35.228.118.14:7777: i/o timeout\n\tstorj.io/common/rpc.TCPConnector.DialContextUnencrypted:107\n\tstorj.io/common/rpc.TCPConnector.DialContext:71\n\tstorj.io/common/rpc.Dialer.dialEncryptedConn:197\n\tstorj.io/common/rpc.Dialer.DialNodeURL.func1:107\n\tstorj.io/common/rpc/rpcpool.(*Pool).get:90\n\tstorj.io/common/rpc/rpcpool.(*Pool).Get:110\n\tstorj.io/common/rpc.Dialer.dialPool:163\n\tstorj.io/common/rpc.Dialer.DialNodeURL:106\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).getSatelliteTime:110\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).Check.func1:67\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
storj    | 2021-08-10T12:06:04.116Z	ERROR	Failed preflight check.	{"error": "system clock is out of sync: system clock is out of sync with all trusted satellites", "errorVerbose": "system clock is out of sync: system clock is out of sync with all trusted satellites\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).Check:96\n\tstorj.io/storj/storagenode.(*Peer).Run:846\n\tmain.cmdRun:218\n\tstorj.io/private/process.cleanup.func1.4:363\n\tstorj.io/private/process.cleanup.func1:381\n\tgithub.com/spf13/cobra.(*Command).execute:852\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:960\n\tgithub.com/spf13/cobra.(*Command).Execute:897\n\tstorj.io/private/process.ExecWithCustomConfig:88\n\tstorj.io/private/process.ExecCustomDebug:70\n\tmain.main:385\n\truntime.main:204"}
storj    | Error: system clock is out of sync: system clock is out of sync with all trusted satellites

So nodes not starting up.

Looks like there is some acl (by ip?) that my node not passes.

Any ideas how to find out whats wrong?

Check your time against some ntp servers.

on host system ntp looks like ok.

timedatectl status
               Local time: Tue 2021-08-10 19:20:55 MSK
           Universal time: Tue 2021-08-10 16:20:55 UTC
                 RTC time: Tue 2021-08-10 16:20:55
                Time zone: Europe/Moscow (MSK, +0300)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no
1 Like

I think core of a problem in this message:

storj    | 2021-08-10T12:05:44.112Z	WARN	trust	Failed to fetch URLs from source; used cache	{"source": "https://www.storj.io/dcs-satellites", "error": "HTTP source: \"https://www.storj.io/dcs-satellites\": unexpected status code 403:

But i am dunno how to solve this.

I think this line is not the real culprit. The node should continue to work fine even if it can’t update its list of trusted satellites.

Your problem is probably indicated by the errors from trying to contact all the satellites:

rpc: dial tcp 35.229.248.166:7777: i/o timeout
rpc: dial tcp 34.86.70.247:7777: i/o timeout
rpc: dial tcp 34.94.230.169:7777: i/o timeout
rpc: dial tcp 34.70.108.161:7777: i/o timeout
rpc: dial tcp 34.76.213.14:7777: i/o timeout
rpc: dial tcp 35.228.118.14:7777: i/o timeout

This indicates your node can’t initiate a TCP connection to any of the known satellite IPs. Every time it tries, it gets nothing back and the operation times out. You can try to create a connection manually from your node using the openssl command line tool:

openssl s_client -connect 35.228.118.14:7777

If all the necessary pieces of the Internet are working correctly, you should see the message “CONNECTED” right away. If you don’t, you’ll need to diagnose why your node can’t create outgoing connections to public IPs.

3 Likes

I’ve tried connecting to all of this ip’s from two ip’s. Got the same result - timeout.

openssl s_client -connect 35.228.118.14:7777
140372637169024:error:0200206E:system library:connect:Connection timed out:crypto/bio/b_sock2.c:110:
140372637169024:error:2008A067:BIO routines:BIO_connect:connect error:crypto/bio/b_sock2.c:111:
connect:errno=110

Can someone try same command in working environment (i can’t currently test in more different networks).
(at least this as in example)

openssl s_client -connect 35.228.118.14:7777

Another research.

I’ve getting 403 from my two networks. But i’ve tried https://www.storj.io/dcs-satellites from tor (just researching) and got 200 OK with list of satellites.

And the same result of connect command.

torify openssl s_client -connect 35.228.118.14:7777
CONNECTED(00000003)

So is this acl by ip?

$ openssl s_client -connect 35.228.118.14:7777
CONNECTED(00000003)

Maybe it’s your firewall or firewall of your ISP. If you have a firewall in your network, you should not block any traffic from the PC with storagenode, and you should allow incoming traffic from any source to the TCP+UDP port of your node.

1 Like

No, the public satellites have no IP access control applied. If you can’t connect there, something is wrong with your network setup or your provider is blocking your connections for some reason. You could try connecting to Google instead to see if that works.

2 Likes

I’ve fixed that error by requesting new IP from my ISP. I was getting same 403 for some google services.
Looks strange. No other changes on my side. Maybe ISP has some rules underhood for this subnet.

Anyway some of my nodes was disqualified by the time i figured this out(

Ah sh**t here we go again. Almost 8TB and 2-3 years pretty good uptime.