Node had to be offline for a few days. Continue or restart node from fresh?

Hey everyone,

My 3 nodes recently got vetted and they’re getting much more ingress (and also egress). However, the three nodes were overwhelming my Raspberry Pi 3. Since my nodes were no longer running stably, I decided to turn off the third and newest node so that the others could survive.

After around a week of only running the first two nodes, my new Raspberry Pi4 arrived and I migrated these to the new Pi4. I then started the third node on the old Pi3.

I knew I would get bad online scores for the third node, since it was offline for around 5 days just when it reached 100% vetting on US1/EU1. These are now my online scores after coming online again:

By my quick estimate, I’m getting around 40 GB of ingress per day on the third node (despite it being offline for so long). However, the ingress may be reduced after a while, once the updated status has fed into the ingress algorithms.

Alternatively, I could just wipe the node and start a new one. After all, vetting took only a month, and I now have a fairly robust setup, so I won’t need to make many changes in the first month. But then I’d be back to just a few GB of ingress per day for a full month.

What do you think? What experiences did you have? Would you continue or wipe the third node?


Your node can be offline for up to 12 days before it is suspended for downtime. Suspension means that your node will not receive any ingress until the online score rises above 60%. To fully recover, it needs to be online for the next 30 days. Each downtime requires another 30 days to recover.

So a low online score does not necessarily lead to disqualification (that only happens if your node was offline for more than 30 days); it can recover if you fix the offline issue. However, it’s not free: while your node is offline, the pieces on it are considered unhealthy after 4 hours, and if a repair job is triggered, your pieces could be recovered to other nodes. Once you bring your node back online and the garbage collector runs, it will move all such pieces to the trash.

In other words, the longer your node is offline, the more data will be moved to the trash.
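
The figures above fit together if the online score is read as the fraction of a 30-day window the node was reachable. That is a simplified reading, not the satellite's actual scoring code: the 30-day window and the linear model are assumptions inferred from the "12 days" and "60%" figures, and the function names are made up for illustration.

```python
# Rough model only: online score = share of an assumed 30-day window spent online.
WINDOW_DAYS = 30.0            # assumption inferred from the 30-day recovery period
SUSPENSION_THRESHOLD = 0.60   # online score below which the node is suspended (from the post)


def online_score(offline_days: float, window_days: float = WINDOW_DAYS) -> float:
    """Approximate online score after `offline_days` of downtime in the window."""
    offline_days = min(max(offline_days, 0.0), window_days)
    return 1.0 - offline_days / window_days


def max_offline_days(threshold: float = SUSPENSION_THRESHOLD,
                     window_days: float = WINDOW_DAYS) -> float:
    """Downtime budget before the score drops to the suspension threshold."""
    return (1.0 - threshold) * window_days


if __name__ == "__main__":
    print(f"budget before suspension: {max_offline_days():.1f} days")
    print(f"score after 5 days offline: {online_score(5):.0%}")
```

Under this model, the roughly 5 days of downtime from the original post would leave the score well above the 60% threshold.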


Thanks for the clarification. I didn’t know about the 30 days thing. So this is actually more generous than I thought. Though I do agree with you that every downtime has its consequences and should be avoided.

Could this information be added to the FAQ perhaps?

I think questions about suspension/disqualification come up frequently in the forums, and you could save yourself some time, given how often you’ve had to explain it (unless you have good reasons not to add a FAQ for this).

This is the current implementation, but our goal is to have less than 5 hours of downtime per month. So this requirement remains in place, because it can be enforced at any time.

:joy: :joy: :joy:
Seriously?
I recently moved a node. The final rsync took longer than a day.


It hasn’t changed, so I guess yes :slight_smile:

I hope we do not reach this goal, because it would have killed my main node long ago. :sweat:
And every node on non-business-grade internet as well.
(Statistically within 4 years, given that 98% availability is normal for private lines in Germany.) :fearful:
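
To put the two numbers side by side, here is a small back-of-the-envelope calculation. It assumes a 30-day month (720 hours); the function names are only for illustration.

```python
# Compare an availability percentage with a monthly downtime budget.
HOURS_PER_MONTH = 30 * 24  # 720 hours, assuming a 30-day month


def downtime_hours(availability: float) -> float:
    """Expected hours of downtime per month at the given availability (0..1)."""
    return (1.0 - availability) * HOURS_PER_MONTH


def required_availability(max_downtime_hours: float) -> float:
    """Availability needed to stay within a monthly downtime budget."""
    return 1.0 - max_downtime_hours / HOURS_PER_MONTH


if __name__ == "__main__":
    print(f"98% availability  -> {downtime_hours(0.98):.1f} h of downtime per month")
    print(f"5 h/month budget  -> {required_availability(5):.2%} availability needed")
```

So a line with 98% availability averages roughly 14 hours of downtime per month, while a 5-hour budget corresponds to about 99.3% availability.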


I did multiple consecutive syncs until the sync time dropped below 1 minute, and only then shut down the node and did the final sync, which also took one minute.

The instructions Storj posted need to be augmented: on large nodes, a single sync takes so long that the data added in the meantime requires another non-trivial amount of time to sync.
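
The repeated-sync approach could be scripted roughly like this. It is only a sketch: the paths are hypothetical, the 60-second target mirrors the "less than 1 minute" above, and using `--delete` on the final pass is my assumption, not something stated in this thread.

```python
import subprocess
import time
from typing import Callable


def run_rsync(src: str, dst: str, delete: bool = False) -> None:
    """Run one rsync pass; -a preserves permissions, ownership and timestamps."""
    cmd = ["rsync", "-a"]
    if delete:
        cmd.append("--delete")  # also remove files that vanished from the source
    cmd += [src, dst]
    subprocess.run(cmd, check=True)


def sync_until_fast(sync: Callable[[], None],
                    target_seconds: float = 60.0,
                    max_passes: int = 20,
                    clock: Callable[[], float] = time.monotonic) -> int:
    """Repeat `sync` until one pass finishes within `target_seconds`.

    Returns the number of passes that were needed.
    """
    for passes in range(1, max_passes + 1):
        start = clock()
        sync()
        if clock() - start <= target_seconds:
            return passes
    raise RuntimeError(f"sync still slow after {max_passes} passes")


if __name__ == "__main__":
    SRC = "/mnt/old/storagenode/"  # hypothetical source path
    DST = "/mnt/new/storagenode/"  # hypothetical destination path
    n = sync_until_fast(lambda: run_rsync(SRC, DST))
    print(f"{n} passes done; now stop the node and run the final pass:")
    print("run_rsync(SRC, DST, delete=True)")
```

The node keeps running during all the passes inside `sync_until_fast`; only the final pass happens with the node stopped, which keeps the actual downtime close to the duration of the last short pass.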

Anyone, individual or company, can have more than 5 hours of downtime in a month at some point. Just look at the history of Facebook or other major companies that got attacked by third parties. Those 5 hours are just a goal, not a practical rule.


My node has maintained an excellent uptime score for over 3 years. I lost my connection today, and my ISP is telling me that fixing it can take 24 hours. (IKR, WTF)

A hard 5 hour limit would mean the death of my node right now. I’m not sure if that would be in anyone’s best interest. Well, that’s just my perspective ATM.

I’d keep an eye on your score and go from there, and hope that the internet comes back. I don’t know if it’s possible to use WiFi or mobile data?

Please just bring your node back online ASAP; the 5 hour limit is outdated. The node won’t get disqualified that fast. Just be sure to keep it online continuously after you bring it back, so that your reputation can recover over the following 30 days.


You can use WiFi or mobile internet if the provider allows you to set up port forwarding (usually it’s a paid feature if you do not own the WiFi router). Of course, the higher latency or instability will reduce the node’s usage.
With mobile operators it’s a very rare feature; usually you will be behind CGNAT there, so the only option is a VPN service with a port forwarding feature, such as portmap.io, ngrok, PIA, AirVPN, PureVPN, etc.


Not really related to this post, but I did a traceroute to see how many hops I have, and I get over 30. Is that normal? And does that mean I’m behind CGNAT?

From the SNO perspective, that means your node will be behind the provider’s router, with the NAT located in their facilities. So to set up port forwarding, you would need to do it on the provider’s router in their facilities. Not many ISPs will let you change settings on their hardware; some can create such a rule for you, sometimes even for free, but that’s very rare.
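
The hop count alone doesn’t say much about CGNAT. A more direct check is to compare the WAN address shown in your router’s status page with the public address that outside services see. Below is a minimal sketch of that comparison; the function name and the example addresses are made up for illustration, and a private WAN address can also simply mean a second router of your own sits in front of it.

```python
import ipaddress

# Shared address space reserved for carrier-grade NAT (RFC 6598).
CGNAT_RANGE = ipaddress.ip_network("100.64.0.0/10")
# RFC 1918 private ranges; a WAN address in one of these also means NAT upstream.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def likely_behind_cgnat(router_wan_ip: str, public_ip: str) -> bool:
    """Heuristic: True if the router's WAN address is not the real public address."""
    wan = ipaddress.ip_address(router_wan_ip)
    if wan in CGNAT_RANGE:
        return True
    if any(wan in net for net in PRIVATE_RANGES):
        return True
    # WAN address looks public but differs from what the outside world sees.
    return wan != ipaddress.ip_address(public_ip)


if __name__ == "__main__":
    # Example values only; 203.0.113.0/24 is a documentation range.
    print(likely_behind_cgnat("100.72.1.2", "203.0.113.5"))   # WAN in 100.64.0.0/10
    print(likely_behind_cgnat("203.0.113.5", "203.0.113.5"))  # WAN equals public IP
```

If the check comes back true, port forwarding on your own router won’t be enough, which is where the provider-side rule or a VPN with port forwarding comes in.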