One of two local nodes randomly gone offline

Hi all,

Over the weekend I set up a new node on a separate machine on my local network. I believe everything is set up correctly, and I have set up a different port forwarding rule for the second node. All has gone well: both nodes have been online together, and the new node has received about 30 GB of data.

The problem is the older node, which hasn’t been touched and is only connected to remotely a couple of times a day. It suddenly says it’s offline, with last contact 17702030 hours ago. It shows 1 h 30 min uptime but is marked offline in red. I’ve not changed anything on this node, and it was fine in the early hours of this morning.

I’m running Windows 10 with the GUI on both nodes. The first node has been up for many months (since sometime last year), but its 9 TB of storage is full to within about 3 GB.

How do I view the logs in the GUI to try to solve the issue? Just to add, the uptime monitor hasn’t been triggered by the node being offline.

Try this checklist

This is a screen shot from logs.

To be fair, I’m lost with it now. I’ve not altered anything on this node. There’s 8.8 TB allocated. The dashboard says 3.53 GB left, and according to the drive’s properties there is 337 GB of free space. Everything shows the port as open, and the second node is operating fine.

Please check your port forwarding rules. You should now have two: one for the old node with external port 28967 and internal port 28967, and a second with a different external port.
For the second node you can set up port forwarding in two different ways:

  • forward a different external port, for example 28968, to port 28967 on the IP of your second PC

or

  • forward a different external port, for example 28968, to port 28968 on the IP of the second PC. In this case you also need to change the port in the server.address: option of the second node’s config file (assuming it’s the Windows GUI too; if it’s Docker, you only need to change the port mapping to -p 28968:28967)

In both cases the external address should include your external port, for example, external.address.tld:28968
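
To make the relationship between the router rule and the two config options concrete, here is a minimal Python sketch. It is only an illustration: the ForwardRule type, the function names and the 192.168.1.x addresses are made up for the example, and the hostname is the placeholder used above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForwardRule:
    """One port forwarding rule on the router (hypothetical model)."""
    external_port: int  # port opened on the WAN side of the router
    internal_ip: str    # LAN address of the PC running the node
    internal_port: int  # port the node listens on locally


def expected_settings(ddns_host: str, rule: ForwardRule) -> dict:
    """Config values a node behind this rule would be expected to have."""
    return {
        # The advertised address always carries the EXTERNAL port.
        "contact.external-address": f"{ddns_host}:{rule.external_port}",
        # The listen address always carries the INTERNAL port.
        "server.address": f":{rule.internal_port}",
    }


def duplicate_external_ports(rules) -> list:
    """External ports claimed by more than one rule (should be empty)."""
    seen, dupes = set(), set()
    for rule in rules:
        if rule.external_port in seen:
            dupes.add(rule.external_port)
        seen.add(rule.external_port)
    return sorted(dupes)


if __name__ == "__main__":
    # Example addresses only; substitute your own LAN IPs and hostname.
    first = ForwardRule(28967, "192.168.1.10", 28967)
    second = ForwardRule(28968, "192.168.1.20", 28967)  # first option above
    for rule in (first, second):
        print(rule.internal_ip, expected_settings("external.address.tld", rule))
    print("conflicts:", duplicate_external_ports([first, second]))
```

With the first option the second node keeps server.address at :28967 and only its external address changes; with the second option both values use 28968.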

You can read more there:

The first machine (the one with the problem) is set up as 28967-28967 to 192.168.1.118 on the router.
The new node has 28968-28967 to 192.168.1.158 on the router.

Both commands on each PC show my.ddns.net:28967.

Is that correct?

No, the first one should have my.ddns.net:28967, the second one my.ddns.net:28968

I’ve altered the port number in the config on the second node, and that node is still running. The first node still shows offline. There is disk activity, but the dashboard shows it as offline. In the logs I’ve got a fatal unrecoverable error and a warning that the node has used more space than allocated, yet I don’t see how this is possible with 337 GB free on the disk and the dashboard still showing 3.53 GB of disk space remaining. The warning and the fatal error are the same ones shown in the screenshot above.

A FATAL error usually means that the node has stopped working.
Please show the last 20 lines from the log of the first node.
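
If the log is long, a small script can pull out only the lines that matter. The Python sketch below is an illustration, not an official tool: it assumes each log line has the form "timestamp LEVEL rest" separated by whitespace, and the function names are made up for the example.

```python
from collections import deque


def tail(path, n=20):
    """Return the last n lines of a text file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(deque(handle, maxlen=n))


def problem_lines(lines, levels=("WARN", "ERROR", "FATAL")):
    """Keep only lines whose second whitespace-separated field is a listed level."""
    hits = []
    for line in lines:
        parts = line.split(None, 2)  # timestamp, level, remainder
        if len(parts) >= 2 and parts[1] in levels:
            hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    # Path assumed to be the default Windows GUI install location.
    log_path = r"C:\Program Files\Storj\Storage Node\storagenode.log"
    for hit in problem_lines(tail(log_path, 200)):
        print(hit)
```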

PS C:\Users\stuart> Get-Content "$env:ProgramFiles/Storj/Storage Node/storagenode.log" -Tail 20 -Wait
2020-06-09T21:27:21.529+0100 INFO trust Scheduling next refresh {"after": "4h8m6.494195605s"}
2020-06-09T21:27:21.529+0100 INFO bandwidth Performing bandwidth usage rollups
2020-06-09T21:27:21.530+0100 WARN piecestore:monitor Used more space than allocated. Allocating space {"bytes": 8800000000000}
2020-06-09T22:27:21.602+0100 INFO bandwidth Performing bandwidth usage rollups
2020-06-09T23:27:21.672+0100 INFO bandwidth Performing bandwidth usage rollups
2020-06-09T23:51:50.649+0100 INFO Stop/Shutdown request received.
2020-06-09T23:51:52.950+0100 FATAL Unrecoverable error {"error": "debug: http: Server closed", "errorVerbose": "debug: http: Server closed\n\tstorj.io/private/debug.(*Server).Run.func2:108\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2020-06-09T23:52:04.523+0100 INFO Configuration loaded {"Location": "C:\Program Files\Storj\Storage Node\config.yaml"}
2020-06-09T23:52:04.562+0100 INFO Operator email {"Address": "myemail@googlemail.com"}
2020-06-09T23:52:04.562+0100 INFO Operator wallet {"Address": "my payment address"}
2020-06-09T23:52:05.414+0100 INFO Telemetry enabled
2020-06-09T23:52:05.434+0100 INFO db.migration Database Version {"version": 39}
2020-06-09T23:52:06.680+0100 INFO preflight:localtime start checking local system clock with trusted satellites' system clock.
2020-06-09T23:52:08.004+0100 INFO preflight:localtime local system clock is in sync with trusted satellites' system clock.
2020-06-09T23:52:08.004+0100 INFO bandwidth Performing bandwidth usage rollups
2020-06-09T23:52:08.004+0100 INFO trust Scheduling next refresh {"after": "6h7m6.119479597s"}
2020-06-09T23:52:08.004+0100 INFO Node 12gkDpqRzzBdSxGQ4cpjbFpg8unNAv9QT1JEBi9BpaxzxHK1Nd3 started
2020-06-09T23:52:08.004+0100 INFO Public server started on [::]:28967
2020-06-09T23:52:08.004+0100 INFO Private server started on 127.0.0.1:7778
2020-06-09T23:52:08.006+0100 WARN piecestore:monitor Used more space than allocated. Allocating space {"bytes": 8800000000000}

Please give me the result of this command (PowerShell):

sls debug "C:\Program Files\Storj\Storage Node\config.yaml"

I’m not getting any output. I’ll try again.

C:\Program Files\Storj\Storage Node\config.yaml:31:# address to listen on for debug endpoints
C:\Program Files\Storj\Storage Node\config.yaml:32:# debug.addr: 127.0.0.1:0
C:\Program Files\Storj\Storage Node\config.yaml:35:# debug.trace-out: ""
C:\Program Files\Storj\Storage Node\config.yaml:118:# allows configuration to enable, disable, or test retain requests from the satellite. Options: (disabled/enabled/debug)
C:\Program Files\Storj\Storage Node\config.yaml:125:server.debug-log-traffic: false

Have you changed the contact.external-address: for the second node?
If so, please try restarting both nodes from an elevated PowerShell on each PC:

Restart-Service storagenode

I’ve done this on both nodes, and both are now offline.

Are both of these nodes now defunct? I’ve got $200 withheld on the first node, and it would be a shame to lose the time that the node has been running.

Please update your DDNS hostname with the current public IP.
Please check your address and port on https://www.yougetsignal.com/tools/open-ports/ : put your external DDNS address into the Remote Address field and the port into the Port Number field, then click Check.
Compare the IP shown on that site with the WAN IP on the status page of your router; they should match.
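
The same two checks can be roughly sketched in Python. This is only an illustration with made-up function names, and the hostname and WAN IP in the example are placeholders. Note that a port check run from inside your own LAN against your external address can fail on routers without NAT loopback, which is why an outside checker such as the site above is more reliable.

```python
import socket


def resolves_to(hostname, expected_ip):
    """True if one of the hostname's IPv4 addresses equals expected_ip."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return expected_ip in addresses


def port_is_open(host, port, timeout=3.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    ddns_host = "my.ddns.net"   # placeholder: your DDNS hostname
    wan_ip = "203.0.113.10"     # placeholder: WAN IP from the router status page
    print("DDNS matches WAN IP:", resolves_to(ddns_host, wan_ip))
    for port in (28967, 28968):
        print(port, "open:", port_is_open(ddns_host, port))
```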

Show your port forwarding rules for both nodes.

Add a firewall rule on both of your nodes to allow port 28967, from an elevated PowerShell:

New-NetFirewallRule -DisplayName "Storj v3" -Direction Inbound -Protocol TCP -LocalPort 28967 -Action allow

Give me the result of these PowerShell commands for each node:

Select-String "server.address" "C:\Program Files\Storj\Storage Node\config.yaml"
Select-String "contact.external-address:" "C:\Program Files\Storj\Storage Node\config.yaml"

new node

PS C:\Windows\system32> Select-String "server.address" "C:\Program Files\Storj\Storage Node\config.yaml"

C:\Program Files\Storj\Storage Node\config.yaml:10:# server address of the api gateway and frontend app
C:\Program Files\Storj\Storage Node\config.yaml:140:server.address: :28967
C:\Program Files\Storj\Storage Node\config.yaml:277:# server address to check its version against
C:\Program Files\Storj\Storage Node\config.yaml:278:# version.server-address: https://version.storj.io

PS C:\Windows\system32> Select-String "contact.external-address:" "C:\Program Files\Storj\Storage Node\config.yaml"

C:\Program Files\Storj\Storage Node\config.yaml:17:contact.external-address: pioneerdj.ddns.net:28968

old node

PS C:\Windows\system32> Select-String "server.address" "C:\Program Files\Storj\Storage Node\config.yaml"

C:\Program Files\Storj\Storage Node\config.yaml:10:# server address of the api gateway and frontend app
C:\Program Files\Storj\Storage Node\config.yaml:122:server.address: :28967
C:\Program Files\Storj\Storage Node\config.yaml:208:# server address to check its version against
C:\Program Files\Storj\Storage Node\config.yaml:209:# version.server-address: https://version.storj.io

PS C:\Windows\system32> Select-String "contact.external-address:" "C:\Program Files\Storj\Storage Node\config.yaml"

C:\Program Files\Storj\Storage Node\config.yaml:17:contact.external-address: pioneerdj.ddns.net:28967

I forgot to add: the DDNS hostname matches the external IP, and yougetsignal shows the port as open.

Any ideas on this?

Shall I shut these nodes down and restart?