I have been running a 6 TB node for a bit over two years with an uptime of over 99.5%. Recently I was in the process of moving my node to a new NAS with “new” old HDDs. Unfortunately, during the migration the storage pool failed beyond recovery and the node has gone offline.
My question is: does it make sense to try to recover the node from the roughly 2 TB (out of 6 TB) of data I had already moved? Or should I just start a new node, go through the whole vetting process, and build up reputation again?
The node will not recover if it has only 2TB out of 6TB.
The satellites expect 6 TB to be present and online on your node. They audit random pieces, and if an audit fails because a piece is not found, your node will be disqualified on that satellite quickly.
You can try, though. Nothing prevents you from starting a new node and running it alongside the old one (on different ports, of course). That way you can watch how quickly the old node gets disqualified and then move the new node over.
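If you run the nodes in Docker, the second container just needs different host-side ports and its own identity and storage paths; something along these lines (wallet, email, address, and paths below are placeholders, and the old node keeps the defaults):

```
# Second node: host ports 28968/14003 map to the container defaults,
# so only the host side changes; forward 28968 in your router as well.
docker run -d --restart unless-stopped --stop-timeout 300 \
  -p 28968:28967/tcp -p 28968:28967/udp -p 14003:14002 \
  -e WALLET="0xYOUR_WALLET" \
  -e EMAIL="you@example.com" \
  -e ADDRESS="your.ddns.example.com:28968" \
  -e STORAGE="2TB" \
  --mount type=bind,source=/mnt/disk2/identity2,destination=/app/identity \
  --mount type=bind,source=/mnt/disk2/storagenode2,destination=/app/config \
  --name storagenode2 storjlabs/storagenode:latest
```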
I haven’t yet been able to restore it. It’s a “high risk” ZFS stripe (RAID 0) setup of two 4 TB disks. One disk has an I/O error and I can’t import the pool anymore. I’ll try again later, maybe I’ll have some luck.
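What I plan to try next is roughly a recovery-mode import, read-only first (“tank” stands in for my pool name):

```
# See whether the pool is visible at all
zpool import

# Dry run: check if a rewind/recovery import (-F) could succeed
zpool import -F -n tank

# If that looks OK, import read-only and copy off whatever is still readable
zpool import -F -o readonly=on tank
```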
No, just to get more space in one volume. I actually used to run the node on RAID 1 but switched to RAID 0 when the payouts were adjusted, to get more node space out of the same hardware.
I believe he tried to avoid the vetting process and held-back amount for a second node, so he went against the devs’ recommendations and expanded the volume onto a second drive.
One node per drive is one of the main recommendations for running a node; it is not obsolete, and it’s there for a reason… which you just discovered.
Instead of losing 6 TB, you could have lost just 3 TB.
I keep seeing this misconception about reputation… many SNOs take it too literally.
It’s not like real life, where your entire history matters for building a reputation.
It’s an audit-based reputation that only covers roughly the last 30 days of the node’s activity, no matter how old the node is. So if your node was crap 2 years ago, but in the meantime you upgraded it and managed to keep it online 99.99% of the time over the last 30 days, your crappy reputation has been forgotten and you are a 5-star storage provider.
And the score of one of your nodes is not influenced by, and does not influence, your other nodes. You can, and should, use the same email and wallet address regardless of how many nodes you run and what scores they have.
Another thing: your node gets disqualified, and you can delete it and start over with a new token and a new identity, if it has lost more than about 4% of the data it holds for a satellite. If a satellite has less than 4% of its data lost, the score on that satellite will recover, even though the others will DQ the node. Just delete the disqualified satellites’ data with the “forget-satellite” command to make room for new data.
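For a Docker setup the command looks roughly like this (the container name is the usual default, and the exact flags may differ between versions, so check --help first):

```
# Drop the data of all satellites the node no longer trusts
docker exec -it storagenode ./storagenode forget-satellite --all-untrusted \
  --config-dir config --identity-dir identity

# Or target a specific (e.g. disqualified) satellite by its ID
docker exec -it storagenode ./storagenode forget-satellite <satellite-ID> --force \
  --config-dir config --identity-dir identity
```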
The vetting process is not such a big problem; it takes between 15 and 30 days if the node is alone in its /24 subnet. The held-back amount is also not a big problem these days with such low activity.
Going forward, I suggest deleting the old data if recovery is impossible, un-RAIDing the drives and running them the classic way: one drive per pool, ext4 filesystem, and one node started on the first drive. Wait for vetting to finish, watching it with BrightSilence’s script, and then start the second node on the second drive.
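To watch it with the script: point earnings.py at the folder with the node’s databases; as far as I remember it also shows the vetting progress per satellite. The path below is just an example:

```
# Run BrightSilence's earnings calculator against the node's database folder
# (by default the "storage" folder inside the node's config directory)
python3 earnings.py /mnt/disk1/storagenode1/storage/
```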
Thanks for the extensive comment! I didn’t know about the recommendation of one node per drive (or I ignored it). I will take that into account and set up multiple nodes in the future.