Node offline after multiple issues: diagnosing help

A classic case of fixing it till it’s broken:

My second Pi node (a remote one) went offline 14+ days ago, after running for a few weeks. I had UptimeRobot set up, but the notification didn’t reach me. I then made the mistake of trying to fix this late at night in a panicky fashion, pasting suggested solutions into my CLI without really diagnosing anything.

  • After all this, a restart got it working again (I never found the original problem, but UptimeRobot gave a green light).
  • Then I noticed that the email address in my docker run command had a typo.
  • After “fixing” that, I ran into several issues. Now everything appears to run normally, but the dashboard still says Offline and publicIP:28967 is not reachable. Since it ran before, the most likely explanation is that I broke something myself.

How can I diagnose this and find out where I went wrong? Thanks so much for your help in advance.

ID     12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss
Status OFFLINE
Uptime 1h54m0s

  Active: active (running) since Thu 2021-09-02 10:36:58 BST; 25min ago
     Docs: https://docs.docker.com
 Main PID: 663 (dockerd)
    Tasks: 36
   CGroup: /system.slice/docker.service
           ├─ 663 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
           ├─1591 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 28967 -container-ip 172.17.0.3 -container-port 28967
           ├─1599 /usr/bin/docker-proxy -proto tcp -host-ip :: -host-port 28967 -container-ip 172.17.0.3 -container-port 28967
           └─1614 /usr/bin/docker-proxy -proto tcp -host-ip 192.168.1.105 -host-port 14002 -container-ip 172.17.0.3 -container-port 14002

Sep 02 10:36:54 raspberrypi dockerd[663]: time="2021-09-02T10:36:54.940828078+01:00" level=warning msg="Your kernel does not support cgroup blkio weight_devi
Sep 02 10:36:54 raspberrypi dockerd[663]: time="2021-09-02T10:36:54.942098541+01:00" level=info msg="Loading containers: start."
Sep 02 10:36:56 raspberrypi dockerd[663]: time="2021-09-02T10:36:56.087678225+01:00" level=info msg="Default bridge (docker0) is assigned with an IP address
Sep 02 10:36:58 raspberrypi dockerd[663]: time="2021-09-02T10:36:58.490721187+01:00" level=info msg="Loading containers: done."
Sep 02 10:36:58 raspberrypi dockerd[663]: time="2021-09-02T10:36:58.654110316+01:00" level=info msg="Docker daemon" commit=75249d8 graphdriver(s)=overlay2 ve
Sep 02 10:36:58 raspberrypi dockerd[663]: time="2021-09-02T10:36:58.655685983+01:00" level=info msg="Daemon has completed initialization"
Sep 02 10:36:58 raspberrypi systemd[1]: Started Docker Application Container Engine.
Sep 02 10:36:58 raspberrypi dockerd[663]: time="2021-09-02T10:36:58.836776446+01:00" level=info msg="API listen on /var/run/docker.sock"
Sep 02 10:41:38 raspberrypi dockerd[663]: time="2021-09-02T10:41:38.450437858+01:00" level=info msg="ignoring event" container=cba2473d72481be72f31b3c95bd50b
Sep 02 10:42:29 raspberrypi dockerd[663]: time="2021-09-02T10:42:29.597456778+01:00" level=info msg="ignoring event" container=cba2473d72481be72f31b3c95bd50b

Hi Buurable,

Let’s see what we can do. Could you please provide your docker run command (you can anonymize personal info) as well as the last 20 lines of your storage node log?

Hi @baker,

Hmm my previous response is under review. Let’s try again, now with the IP removed from the logs.

Run:

sudo docker run -d --restart always --stop-timeout 300 \
-p 28967:28967 \
-p 192.168.1.105:14002:14002 \
-e WALLET="<<removed>>" \
-e EMAIL="<<removed>>" \
-e ADDRESS="<<removed>>:28967" \
-e STORAGE="4.5TB" \
--mount type=bind,source="/mnt/storj/storagenode/identity",destination=/app/identity \
--mount type=bind,source="/mnt/storj/storagenode",destination=/app/config \
--name storagenode storjlabs/storagenode:latest

Log

2021-09-02T12:03:55.046Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "attempts": 9, "error": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused", "errorVerbose": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:07:40.326Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "attempts": 10, "error": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused", "errorVerbose": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:07:42.616Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "attempts": 10, "error": "ping satellite: check-in ratelimit: node rate limited by id", "errorVerbose": "ping satellite: check-in ratelimit: node rate limited by id\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:138\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:07:50.987Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo", "attempts": 10, "error": "ping satellite: check-in ratelimit: node rate limited by id", "errorVerbose": "ping satellite: check-in ratelimit: node rate limited by id\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:138\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:07:55.849Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "attempts": 10, "error": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused", "errorVerbose": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:08:10.696Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "attempts": 10, "error": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused", "errorVerbose": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:08:11.819Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "attempts": 10, "error": "ping satellite: check-in ratelimit: node rate limited by id", "errorVerbose": "ping satellite: check-in ratelimit: node rate limited by id\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:138\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:16:12.514Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "attempts": 11, "error": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused", "errorVerbose": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:16:14.877Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "attempts": 11, "error": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused", "errorVerbose": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:16:23.489Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo", "attempts": 11, "error": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused", "errorVerbose": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:16:28.583Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "attempts": 11, "error": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused", "errorVerbose": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:16:44.143Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "attempts": 11, "error": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused", "errorVerbose": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:16:44.960Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "attempts": 11, "error": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused", "errorVerbose": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:33:16.654Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "attempts": 12, "error": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused", "errorVerbose": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:33:19.154Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "attempts": 12, "error": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused", "errorVerbose": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:33:28.083Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo", "attempts": 12, "error": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused", "errorVerbose": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:33:33.283Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "attempts": 12, "error": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused", "errorVerbose": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:33:49.614Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "attempts": 12, "error": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused", "errorVerbose": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:33:50.076Z        ERROR   contact:service ping satellite failed   {"Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "attempts": 12, "error": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused", "errorVerbose": "ping satellite: failed to dial storage node (ID: 12QSCCTTnRBy2RXdtkE57B9y2RmiuTKaVkSpJVMurcsPycABEss) at address <<removed>>:28967: rpc: dial tcp <<removed>>:28967: connect: connection refused\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2021-09-02T12:42:33.722Z        INFO    bandwidth       Performing bandwidth usage rollups

Strange. I did see it for a few minutes, then it disappeared.

Looks like your node can’t be reached from the outside. I would check the port forwarding on your router, check that your Raspberry Pi’s internal IP is static and matches the port forwarding rule, and check that your DDNS name is updating properly (if you use DDNS).
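To see at a glance which failure dominates the log, the error lines can be grouped by reason. The sketch below is only an illustration: `summarize_ping_failures` is a made-up helper name, and the patterns are taken from the log lines posted above. A high "refused" count means something actively rejected the connection rather than silently dropping it, which usually points at the forwarding target or a firewall rather than a dead link.

```shell
#!/bin/sh
# Sketch: count satellite ping failures by reason, reading a storagenode log on stdin.
# summarize_ping_failures is a hypothetical helper, not part of the storagenode CLI.
summarize_ping_failures() {
    awk '
        /ping satellite failed/ {
            if ($0 ~ /connection refused/) refused++
            else if ($0 ~ /rate limited/)  limited++
            else if ($0 ~ /timeout/)       timeout++
            else                           other++
        }
        END {
            # Unset counters print as 0 with %d.
            printf "refused=%d rate_limited=%d timeout=%d other=%d\n",
                   refused, limited, timeout, other
        }
    '
}

# Usage (assumes the container is named "storagenode", as in the docker run above):
#   sudo docker logs --tail 200 storagenode 2>&1 | summarize_ping_failures
```

The "rate limited" entries are most likely a side effect of repeated restarts and check-ins, so the "refused" count is the one to watch.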

–edit–
It is strange that your posts are being flagged. I’m sure a mod will fix it shortly.


Also make sure no firewall is blocking incoming traffic on port 28967 or outgoing traffic to any port/ip.

The node was working fine at some point, which is not something I would expect if the port had never been available. Maybe I’m wrong.

I’ve now checked the router as well. The binding and port forwards (both for SSH and Docker) look fine. I’m also allowing the port in ufw. Not sure what else to check from there.

Make sure to check from an external tool. Keep in mind the port will only return open if the docker container is running (and all other port forwarding is correct).
https://www.yougetsignal.com/tools/open-ports/

Checked. It’s closed. But I’m not able to pinpoint where it is being blocked.

I checked this, does this seem alright to you?

pi@raspberrypi:~ $ sudo netstat -ltnp | grep ':28967'
tcp        0      0 0.0.0.0:28967           0.0.0.0:*               LISTEN      1591/docker-proxy
tcp6       0      0 :::28967                :::*                    LISTEN      1599/docker-proxy

Unfortunately I don’t run UFW on my Linux node, so I am unsure. I would try disabling UFW on the Pi first. That’s probably the most complicated part of the chain (in my experience). Also check that your public IP as reported by a checking tool matches the public IP you see in the WAN fields on your router. If they don’t match, your ISP could be forcing CGNAT.
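One quick way to spot CGNAT is to look at the WAN address the router reports: if it falls in 100.64.0.0/10 (the shared address space reserved for carrier-grade NAT) or in a private range, the router does not hold a public IP and port forwarding alone cannot work. The sketch below assumes a plain dotted IPv4 address as input; `classify_ipv4` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: classify an IPv4 address as private, cgnat, or public.
# classify_ipv4 is a hypothetical helper; it does no input validation.
classify_ipv4() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the dotted quad into $1..$4
    IFS=$old_ifs
    a=$1
    b=$2
    if [ "$a" -eq 10 ]; then
        echo private
    elif [ "$a" -eq 172 ] && [ "$b" -ge 16 ] && [ "$b" -le 31 ]; then
        echo private
    elif [ "$a" -eq 192 ] && [ "$b" -eq 168 ]; then
        echo private
    elif [ "$a" -eq 100 ] && [ "$b" -ge 64 ] && [ "$b" -le 127 ]; then
        echo cgnat     # 100.64.0.0/10
    else
        echo public
    fi
}

# Usage: pass the WAN IP shown in the router's status page, e.g.
#   classify_ipv4 100.72.3.4
```

A "public" result only rules out this one cause; the WAN IP still has to match what an external checking tool reports.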

Are you also running a DDNS? And do you have more than one node running on the same port?

@baker I feel that the fact that I can log in to the node over SSH shows the public IP is alright. The settings in the router confirm this is the public IP.

@deathlessdd I’m not running a DDNS, neither on the router nor on the node.

Could you please show result of the command:

sudo ufw status

Also, this

makes me think that you used different IPs, not only different ports.
Please make sure that your “docker” port forwarding rule has the same destination IP as the SSH one.
And please double-check that your public IP on yougetsignal, your WAN IP, and the IP used in -e ADDRESS all match.
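To compare against the value the running container actually received, rather than the one remembered from the docker run command, the ADDRESS variable can be read back from the container. This is a sketch only: `address_host` is a hypothetical helper, and the container name "storagenode" is taken from the docker run command posted earlier.

```shell
#!/bin/sh
# Sketch: print the host part of ADDRESS=host:port from a list of KEY=VALUE lines on stdin.
# address_host is a hypothetical helper name.
address_host() {
    sed -n 's/^ADDRESS=\(.*\):[0-9][0-9]*$/\1/p'
}

# Usage (assumes the container is named "storagenode"):
#   sudo docker inspect -f '{{range .Config.Env}}{{println .}}{{end}}' storagenode | address_host
# Compare the printed value by eye with the router's WAN IP and with the
# address you use for SSH.
```

If the printed host differs from the address used for SSH, the satellites are being told to dial the wrong place.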

@Alexey thanks for joining

ufw status:

pi@raspberrypi:~ $ sudo ufw status
Status: active

To                         Action      From
--                         ------      ----
14002                      ALLOW       Anywhere                  
2032                       ALLOW       Anywhere                  
28967                      ALLOW       Anywhere                  
2033 (v6)                  ALLOW       Anywhere (v6)             
14002 (v6)                 ALLOW       Anywhere (v6)             
2032 (v6)                  ALLOW       Anywhere (v6)             
28967 (v6)                 ALLOW       Anywhere (v6)             

28967                      ALLOW OUT   Anywhere                  
28967 (v6)                 ALLOW OUT   Anywhere (v6)  

Let me recheck the values again.

  • The public IP/port check says closed, though I can still log in through SSH, which makes me sure the public IP is correct.
  • The WAN IP hasn’t changed either. The router also has a port forward for both Docker and the custom SSH port. If anything were wrong with it, I presume I couldn’t log in via SSH.
  • I’ll recheck the -e ADDRESS value again.

Please remove any outbound rules, or allow outbound traffic on ANY port.

Done. That rule was only an experiment from earlier, so it made no difference. I think I’ll reinstall with a new identity; the node hasn’t been online for long.

Please check that you really have the same IP in the ADDRESS option as the one you are using for SSH.
Please also check the identity:

grep -c BEGIN /mnt/storj/storagenode/identity/ca.cert
grep -c BEGIN /mnt/storj/storagenode/identity/identity.cert
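The two counts can be wrapped in a small check. The expected values here are an assumption to confirm against the Storj identity documentation: as far as I recall, a signed identity gives 2 for ca.cert and 3 for identity.cert. `check_identity` is a hypothetical helper name, and the path in the usage line is the one from the commands above.

```shell
#!/bin/sh
# Sketch: verify the certificate counts of a storagenode identity directory.
# check_identity is a hypothetical helper; the expected counts (2 and 3) are
# an assumption based on the Storj identity docs and should be confirmed there.
check_identity() {
    dir=$1
    # grep -c exits non-zero on zero matches or a missing file; fall back to 0.
    ca=$(grep -c BEGIN "$dir/ca.cert" 2>/dev/null) || ca=0
    id=$(grep -c BEGIN "$dir/identity.cert" 2>/dev/null) || id=0
    if [ "$ca" -eq 2 ] && [ "$id" -eq 3 ]; then
        echo "identity looks signed (ca.cert=$ca, identity.cert=$id)"
    else
        echo "unexpected counts (ca.cert=$ca, identity.cert=$id)"
        return 1
    fi
}

# Usage:
#   check_identity /mnt/storj/storagenode/identity
```

If the counts are lower than expected, the identity was probably never authorized, which would be worth fixing before generating a new one.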