Storagenode Offline - Unable to connect to satellite rpccompat: connection error

pyoung14 · October 5, 2019, 11:36pm

Hello, my computer shut down earlier today. I turned it back on and started the Docker container, and it's just showing Offline. It's been about 10 minutes. I checked the logs and the only errors I see are the ones below. Anyone have any ideas? I checked the port using the "is my port open" website and it's showing open. I tried stopping/removing the container, restarting the computer, then starting the container again, but I get the same error.

Ubuntu 19.04

2019-10-05T23:29:22.012Z ERROR orders.118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW failed to settle orders {"error": "order: unable to connect to the satellite: rpccompat: connection error: desc = \"transport: error while dialing: dial tcp 78.94.240.189:7777: connect: connection refused\"", "errorVerbose": "order: unable to connect to the satellite: rpccompat: connection error: desc = \"transport: error while dialing: dial tcp 78.94.240.189:7777: connect: connection refused\"\n\tstorj.io/storj/storagenode/orders.(*Service).settle:257\n\tstorj.io/storj/storagenode/orders.(*Service).Settle:196\n\tstorj.io/storj/storagenode/orders.(*Service).sendOrders.func2:175\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}

ERROR	nodestats:cache	Get disk space usage query failed	{"error": "node stats service error: unable to connect to the satellite 118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW: rpccompat: connection error: desc = \"transport: error while dialing: dial tcp 78.94.240.189:7777: connect: connection refused\"", "errorVerbose": "node stats service error: unable to connect to the satellite 118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW: rpccompat: connection error: desc = \"transport: error while dialing: dial tcp 78.94.240.189:7777: connect: connection refused\"\n\tstorj.io/storj/storagenode/nodestats.(*Service).dial:129\n\tstorj.io/storj/storagenode/nodestats.(*Service).GetDailyStorageUsage:104\n\tstorj.io/storj/storagenode/nodestats.(*Cache).CacheSpaceUsage.func1:130\n\tstorj.io/storj/storagenode/nodestats.(*Cache).satelliteLoop:170\n\tstorj.io/storj/storagenode/nodestats.(*Cache).CacheSpaceUsage:129\n\tstorj.io/storj/storagenode/nodestats.(*Cache).Run.func2:91\n\tstorj.io/storj/internal/sync2.(*Cycle).Run:87\n\tstorj.io/storj/internal/sync2.(*Cycle).Start.func1:68\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}

2019-10-05T23:32:23.943Z ERROR nodestats:cache Get stats query failed {"error": "node stats service error: unable to connect to the satellite 118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW: rpccompat: connection error: desc = \"transport: error while dialing: dial tcp 78.94.240.189:7777: connect: connection refused\"", "errorVerbose": "node stats service error: unable to connect to the satellite 118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW: rpccompat: connection error: desc = \"transport: error while dialing: dial tcp 78.94.240.189:7777: connect: connection refused\"\n\tstorj.io/storj/storagenode/nodestats.(*Service).dial:129\n\tstorj.io/storj/storagenode/nodestats.(*Service).GetReputationStats:65\n\tstorj.io/storj/storagenode/nodestats.(*Cache).CacheReputationStats.func1:108\n\tstorj.io/storj/storagenode/nodestats.(*Cache).satelliteLoop:170\n\tstorj.io/storj/storagenode/nodestats.(*Cache).CacheReputationStats:107\n\tstorj.io/storj/storagenode/nodestats.(*Cache).Run.func1:79\n\tstorj.io/storj/internal/sync2.(*Cycle).Run:87\n\tstorj.io/storj/internal/sync2.(*Cycle).Start.func1:68\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}

pyoung14 · October 6, 2019, 12:00am

My storagenode ended up going online after 30 minutes, but those errors are still in the logs. So I'm glad it's working, but I'm still curious what those errors mean and if/how to fix them.

anon27637763 · October 6, 2019, 1:05am

There seems to be some ongoing maintenance on that satellite today.

cameron · October 6, 2019, 1:07am

Hi pyoung14,
With the removal of Kademlia, storage nodes now contact their satellites directly. To avoid overloading the satellites with thousands of incoming requests during auto updates, we implemented a random sleep interval before contact is initiated. That's why your storage node showed OFFLINE for a while after you restarted it. Looks like we might want to change the wording
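
For anyone curious what a random sleep before first contact looks like in practice, here is a minimal sketch of the general idea. It is not the storagenode code: the function names, the uniform distribution, and the `MAX_DELAY_SECONDS` value are all assumptions made up for illustration.

```python
import random
import time
from typing import Callable, Optional

# Hypothetical upper bound for the startup delay; the real interval is not stated in this thread.
MAX_DELAY_SECONDS = 600.0


def startup_delay(max_delay_seconds: float, rng: Optional[random.Random] = None) -> float:
    """Pick a uniformly random delay between 0 and max_delay_seconds."""
    if max_delay_seconds < 0:
        raise ValueError("max_delay_seconds must be non-negative")
    if rng is None:
        rng = random.Random()
    return rng.uniform(0.0, max_delay_seconds)


def contact_after_jitter(
    contact: Callable[[], str],
    max_delay_seconds: float = MAX_DELAY_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> str:
    """Sleep for a random interval, then run the (hypothetical) check-in callback."""
    delay = startup_delay(max_delay_seconds, rng)
    sleep(delay)  # the node would look offline to the satellite during this wait
    return contact()
```

Because each node picks its own delay, thousands of nodes restarting at the same moment spread their first check-ins over the whole interval instead of arriving together. The `sleep` parameter is injectable only so the sketch can be exercised without actually waiting.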