Node offline due to outdated software, now showing suspension

My node went offline yesterday (I didn’t notice until today), and when I tried to restart it I got:

2020-10-14T10:55:34.147Z WARN version version not allowed/outdated {"current version": "1.11.1", "minimum allowed version": "v1.12.0"}
Error: outdated software version (v1.11.1), please update

This is the second time this has happened. Does the node check for the latest release version while running and stop if it is not up to date? There was no reboot etc. of the underlying node on either occasion.

I have now updated and restarted, and the dashboard now says suspension. How long does this last?

I’ve dedicated hardware to being an operator, and ideally I just need it to plod along.

Hello @stretch!

The node checks every 15 minutes for the minimum version it is allowed to run. As in your example, we recently raised the minimum version to v1.12.0. That said, the node should not stop on its own; it should keep running until the next “reboot” or crash, and refuse to restart from that point on.
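To illustrate the kind of comparison behind that log message, here is a minimal sketch in Python. It is not the node’s actual code (the storagenode is written in Go), and the names `parse_version` and `is_allowed` are hypothetical. It only assumes versions look like `1.11.1` or `v1.12.0`, as in the log above.

```python
import re


def parse_version(text):
    """Turn 'v1.12.0' or '1.11.1' into a comparable tuple such as (1, 12, 0)."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_allowed(current, minimum):
    """Return True when the current version is at least the minimum (hypothetical helper)."""
    return parse_version(current) >= parse_version(minimum)


# Values taken from the log line in the first post.
print(is_allowed("1.11.1", "v1.12.0"))  # False
```

Comparing integer tuples rather than raw strings matters here: as plain strings, "1.9.5" would sort after "1.12.0".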

Normally, nodes that use the automatic updating technologies we provide/recommend will never run into the problem of being this outdated. Manual updates are discouraged because they quickly force us to support a very wide variety of versions, which slows down progress on the protocol and the network.
Please try to follow our documentation to set up automatic updates for your node :slight_smile:

The suspension will last until the satellite has verified that you are a healthy node again. If I am not mistaken, that should be a matter of hours.
Hope this helps!


Hi @stefanbenten, thanks for the info. For some reason it’s still showing suspension. Looking at the logs, I’m seeing this:

2020-10-15T05:03:08.295Z ERROR contact:service ping satellite failed {"Satellite ID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "attempts": 4, "error": "ping satellite error: rpc: dial tcp 78.94.240.189:7777: i/o timeout", "errorVerbose": "ping satellite error: rpc: dial tcp 78.94.240.189:7777: i/o timeout\n\tstorj.io/common/rpc.Dialer.dialTransport:211\n\tstorj.io/common/rpc.Dialer.dial:188\n\tstorj.io/common/rpc.Dialer.DialNodeURL:148\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:124\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}

I’m able to ping that IP:

PING 78.94.240.189 (78.94.240.189) 56(84) bytes of data.
64 bytes from 78.94.240.189: icmp_seq=1 ttl=50 time=42.9 ms
64 bytes from 78.94.240.189: icmp_seq=2 ttl=50 time=40.8 ms

and

2020-10-15T08:04:18.361Z ERROR contact:service ping satellite failed {"Satellite ID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "attempts": 12, "error": "ping satellite error: rpc: context deadline exceeded", "errorVerbose": "ping satellite error: rpc: context deadline exceeded\n\tstorj.io/common/rpc.Dialer.dialTransport:211\n\tstorj.io/common/rpc.Dialer.dial:188\n\tstorj.io/common/rpc.Dialer.DialNodeURL:148\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:124\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}

118UWpM… is Stefan’s satellite, and it has been shut down. These errors are “normal”.
It will soon be removed from the trusted list (https://tardigrade.io/trusted-satellites), and then the errors should disappear.
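A successful ICMP ping only shows that the host answers; it says nothing about whether anything is still listening on TCP port 7777, which is what the node dials. A rough way to check that yourself is a plain TCP connect with a timeout. This is a sketch using only Python’s standard library; `tcp_port_open` is a hypothetical helper name, not part of any Storj tooling.

```python
import socket


def tcp_port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call with the address from the log above (not run here):
# tcp_port_open("78.94.240.189", 7777)
```

If this returns False while ping still works, the host is up but the service on that port is not accepting connections, which matches a satellite that has been shut down.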


OK great thanks, so I guess it’s just a waiting game for removal from suspension?

Suspension is independent for each satellite. So when this satellite is removed from the trusted list, it will disappear along with its suspension (at least that’s what I’m expecting).

OK, well, hopefully it won’t be too long. The dashboard shows this.

It still says suspension 100%. Is there any way to fix this? If not, I’ll just exit, as it’s just wasting my resources otherwise.

100% doesn’t mean you’re 100% suspended. It means your node’s score is at 100%. If it starts to fall below 100%, then you’re doing something wrong.