Node Offline... no idea why

I have one single node.

This just happened to one of my nodes. I took it down for a couple of hours, and when I brought it back online the dashboard reports that it is offline.

https://www.yougetsignal.com/ reports the port as open.

Other nodes on the same network are operational.

Please, check your identity: https://documentation.storj.io/dependencies/identity#confirm-the-identity

It seems things are fine on that end …

grep -c BEGIN ca.cert returns 2

grep -c BEGIN identity.cert returns 3
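The check above boils down to counting BEGIN lines in the two cert files. A minimal sketch of that check as a reusable function (the expected counts 2 and 3 are from the linked docs; the directory path is whatever your identity directory is):

```shell
# Sanity-check a Storj identity directory: a healthy ca.cert contains
# 2 BEGIN lines and identity.cert contains 3. The argument is the
# directory holding both files (path is yours to supply).
check_identity() {
  ca=$(grep -c BEGIN "$1/ca.cert")
  id=$(grep -c BEGIN "$1/identity.cert")
  if [ "$ca" -eq 2 ] && [ "$id" -eq 3 ]; then
    echo "identity OK"
  else
    echo "identity BAD (ca.cert=$ca, identity.cert=$id)"
  fi
}
# e.g. check_identity /path/to/identity/storagenode
```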

These are the logs:

2020-08-31T22:23:19.865Z	INFO	Configuration loaded	{"Location": "/app/config/config.yaml"}
2020-08-31T22:23:19.880Z	INFO	Operator email	{"Address": "REDACTED"}
2020-08-31T22:23:19.880Z	INFO	Operator wallet	{"Address": "REDACTED"}
2020-08-31T22:23:20.204Z	INFO	Telemetry enabled
2020-08-31T22:23:20.217Z	INFO	db.migration	Database Version	{"version": 43}
2020-08-31T22:23:20.824Z	INFO	preflight:localtime	start checking local system clock with trusted satellites' system clock.
2020-08-31T22:23:21.391Z	INFO	preflight:localtime	local system clock is in sync with trusted satellites' system clock.
2020-08-31T22:23:21.391Z	INFO	bandwidth	Performing bandwidth usage rollups
2020-08-31T22:23:21.391Z	INFO	trust	Scheduling next refresh	{"after": "5h7m57.232603171s"}
2020-08-31T22:23:21.392Z	INFO	Node [REDACTED] started
2020-08-31T22:23:21.392Z	INFO	Public server started on [::]:28967
2020-08-31T22:23:21.392Z	INFO	Private server started on 127.0.0.1:7778

Also:

Suspension Score = 100%
Audit Score = 100%

One thing to note is that this node was moved from an arm device to amd64, though I don’t think that’s an issue, as I have done the same in the past.

Update: moved the hard drive back to the arm device and the node came back online.

If you moved the node but didn’t change the port forwarding rule, that will be an issue - each device has its own local IP, so you should update the rule too.

I did update the port and created a new forwarding rule.

Then please check what else in the network configuration is different. Perhaps the second system has an integrated firewall; in that case you should create an inbound rule for the port. Outbound access needs to be allowed as well.
The second thing: when you moved the disk with the data, did you move the tied identity too?
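To separate firewall problems from routing problems, it can help to probe the port from the node's own host first and then from outside the LAN. A quick hypothetical sketch using bash's /dev/tcp (bash-specific; `nc -z <host> <port>` is an equivalent alternative):

```shell
# Probe whether anything answers on a TCP port. Prints "open" or
# "closed". Assumes bash (/dev/tcp is a bash feature, not a real file).
probe_port() {
  if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}
# e.g. probe_port 127.0.0.1 28967   # from the node's host
#      probe_port <public-ip> 28967 # from outside the LAN
```

If the port is open locally but closed from outside, the forwarding rule or the router firewall is the likely culprit; if it is closed even locally, look at the container and the host firewall.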

@Alexey for my case (my node is still OFFLINE) after several days despite being connected and the docker container being up and running.

  • Checked the identity as suggested and can confirm it is good (returns 2 & 3 as expected)
  • Checked the firewall/router port forward both manually and via https://www.yougetsignal.com/tools/open-ports/?port=28967 as suggested (I have a fixed IP and manage the DNS zone manually… no DynDNS)
  • The node was moved to a new location, but the ISP is the same (basically just a different IP… should not be an issue)
  • The node has always been running Dockerized and on the same hardware (I just shut it down, moved it, and fired it back up)
  • Nothing else in the network infrastructure changed (same router, same firewall, same everything… only the port forward had to be configured again on the new ISP modem/router… but it does work)
  • Other people (or I, for that matter) can access my infrastructure remotely, from outside my LAN

Do you have any further suggestions to help me avoid unnecessary downtime, or to stop being penalized for… no reason?

Any suggestion would be greatly appreciated…

Thanks

Can you try to shut down the node and check your port? Is it closed?

Good point @Alexey, and the answer is yes, the port shows as closed using https://www.yougetsignal.com/tools/open-ports/?port=28967 when the Docker container is down.

Bringing the node back up did not change anything, though:

  • Port now shows as open (of course)
  • Node is still reported as OFFLINE

Please try a different browser.
Also, your ISP could be applying a filter to your traffic.

I’m not sure I’m following you. What do you mean, try another browser?

I’m checking the node directly from the CLI; no browser is involved:

Storage Node Dashboard ( Node Version: v1.10.1 )

======================

ID           <redacted>
Last Contact OFFLINE
Uptime       30m49s

                   Available       Used     Egress     Ingress
     Bandwidth           N/A        0 B        0 B         0 B (since Sep 1)
          Disk        2.9 TB     6.1 TB
Internal 127.0.0.1:7778
External <redacted>:28967

My data is uncapped and I have FTTH 1 Gbps/1 Gbps (995 Mbps down / 975 Mbps up at the latest measurement).

I see. Could you give me a NodeID? You can send it via PM.

Found the solution! My fault, of course… I updated a secondary DNS zone and forgot to update the actual master one, so the DNS was still pointing to my old IP.
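For anyone hitting the same thing: a mismatch like this can be caught by comparing what DNS serves for the node address against the real public IP. A small sketch (the hostname and IP-lookup service in the usage comment are placeholders, not from this thread; `dig` and `curl` are assumed to be available):

```shell
# Compare the IP a DNS record resolves to against the actual public
# IP; any mismatch (or an empty record) means the record is stale.
compare_ips() {
  if [ -n "$1" ] && [ "$1" = "$2" ]; then
    echo "DNS is current"
  else
    echo "stale DNS: record='$1' actual='$2'"
  fi
}
# usage (assumed tools and hostname):
#   compare_ips "$(dig +short node.example.com | tail -n 1)" \
#               "$(curl -fsS https://api.ipify.org)"
```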
Thanks for the support debugging the issue!


I too am having this same problem, though I am running v3 in a Docker container on Unraid. The network is set to “host” so it uses the same IP as the server. I checked all the port assignments, so there are no conflicts there. I am including all the screens and verifications, and it’s still showing offline.
[Edit: Removed images]
I would appreciate any help given. Thank you.

[Edit:] Found my problem; I had to have the images in front of me to find it. The port assignment in the Docker container changed when it was updating.

You’re showing a different port on your dashboard than the one you’re checking as open.

Thank you, you got to it before I could finish my edit.
lol

That’s perfect, then. It was an easy fix.