RAID or no RAID: the forgotten premise

I just read through a lot of threads debating whether running one node per drive or running one node on a big RAID array is better (i.e. more profitable).

Seems the general conclusion is that one node per drive is more profitable.

I just want to say that this conclusion is based on a certain premise. Changing the premise changes the conclusion.

So, don’t go and say one way or the other is better; consider the initial premise and conclude accordingly. To demonstrate my point, here is one case where RAID is much better than one node per drive.

When you have access to cheap but less reliable drives, it is more profitable to use RAID. That is because the startup time of a node with a new identity puts most profits at risk when you know a single drive is not going to last more than 6-12 months. A system using a RAID array retains a long-term identity and stacks profits instead of starting over all the time and losing withheld payouts.

Just my 2 cents…

That premise may change again if the vetting process is changed in order to allow for bringing capacity online faster… in which case one-node-per-drive becomes more sensible again :slight_smile:

You just demonstrated my point further. Change the premise, change the conclusion.

Now imagine: you can find drives at 50 cents per TB. But you know they do not last more than 6 months.
RAID or no RAID? :man_shrugging:

No RAID, you sell them all to me for $1/TB (= $20 for each 20 TB drive). You just doubled your investment!

For me… because I’ve experienced way more gradual-degradation failures from HDDs than immediate catastrophic failures, my premise includes the assumption that most HDD failures will be recoverable. Meaning at the end of the day I may still be throwing the drive in the garbage, but I would have had the chance to pull enough data from it to save a node’s identity and enough data to continue to pass audits. ddrescue FTW! :wink:

So my conclusion is to waste no space on parity/mirroring/availability, even though I understand the desire to protect full nodes so all the time spent filling them doesn’t go to waste.

(Edit: but I’m still a fan of keeping some incubated nodes around. If you do have a hard failure - may as well restart with auditing and holdback already complete)

Oh, I did not think of that: run some extra nodes with small space just to get an identity ready to go…
I might spin up a few extra jails just for that purpose, to incubate some IDs like you say.

Another reason to use RAID is to be able to use the drives for my own data. Storj used to say that node operators should use the hardware they already have, and only what would be online 24/7 anyway.

One should be happy to use RAID on a per-node, per-disk-member basis for growing nodes. Storj depends so heavily on file-walkers that the read advantage (and, to a lesser extent, the write advantage) of spanned nodes can easily cut that operational time by 2-12x or more, depending on your configuration. After that, I’m happy to move nodes out to 22+ TB pastures; it just depends on how you apply your total resources as you scale.
Just throwing in my 2 cents.

I put three disks in RAID 0 to maximize space and performance. It works great and is less hassle than setting up three nodes. If a drive dies, it’s over. That’s OK, I have other nodes.