Not yet. I’ll have a look this weekend.
I’ve “played” with your spreadsheet using different HDD configurations. Your spreadsheet is useful.
With my two hard disks, the choice between RAID and no-RAID depends on the daily net ingress. Nevertheless, if you want high revenue in the long term, it seems that no-RAID is the right choice. RAID should be used if you want to insure your income for the first 15 months.
A reasonable answer to the RAID vs. no-RAID question is still: it depends… and you should make your own assessment (Wolfgang’s spreadsheet is useful for this task).
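The fill-then-earn dynamic behind that assessment can be sketched in a few lines. Every number here (disk size, ingress, payout rate) is a made-up assumption, not an actual Storj figure; plug in your own values from the spreadsheet.

```python
# Hypothetical sketch of the trade-off described above: RAID1 halves the
# usable capacity of two disks, no-RAID keeps it all. All parameters are
# assumptions for illustration only.

DISK_TB = 8                   # size of each of the two disks (assumption)
NET_INGRESS_TB_MONTH = 0.5    # monthly net ingress per setup (assumption)
STORAGE_PAY_PER_TB = 1.5      # $ per TB-month stored (assumption)

def revenue(capacity_tb, months):
    """Cumulative storage revenue while the setup fills up, then stays full."""
    stored, total = 0.0, 0.0
    for _ in range(months):
        stored = min(capacity_tb, stored + NET_INGRESS_TB_MONTH)
        total += stored * STORAGE_PAY_PER_TB
    return total

# Mirrored (RAID1): half capacity, survives one disk failure.
# No-RAID: full capacity across two nodes, but a dead disk kills one node.
print("RAID1 :", revenue(DISK_TB, 36))
print("noRAID:", revenue(2 * DISK_TB, 36))
```

With these made-up numbers both setups earn the same until the mirrored pool fills, after which no-RAID pulls ahead, which matches the “high revenue in the long term” observation; the crossover shifts with the daily net ingress, and failure risk is deliberately left out.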
Thanks for the sheet. I’m missing some information, though.
The RAID version needs 1 CPU for all drives, as it is only 1 node,
while the no-RAID version with 8 drives would need 8 to 11 cores.
In my case, I run RAID nodes on i5/i7 CPUs with 4 cores. This means I can only run 3-4 nodes per computer. To use more drives, I would have to build another computer. As I already have them, let’s assume it costs me nothing, but for some SNOs this will be an additional expense. And of course the electricity usage will rise, as I have to run two PSUs, motherboards, and CPUs.
If a node can’t share a single core with the OS, then we probably need 3 CPUs (if each has 4 cores): 8 cores for 8 drives and 3 cores for the OS, assuming a 4-core CPU can run only 3 nodes plus 1 core for the OS. Then we have triple the electricity usage for PSUs, motherboards, and CPUs. And this scenario is the more likely one, as Storj targets decentralised home/office users, not industrial equipment with an enormous number of CPU cores in a single CPU. So we are talking about at least +50W or +100W of electricity for 1 or 2 additional computers in a no-RAID setup.
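That +50W or +100W translates into a concrete yearly cost. A quick back-of-the-envelope, where both the wattage and the electricity price are assumptions to adjust for your own setup:

```python
# Yearly cost of the extra computers mentioned above.
# Wattage and price per kWh are assumptions, not measured values.

EXTRA_WATTS = 100        # e.g. two additional PSU/MB/CPU combos (assumption)
PRICE_PER_KWH = 0.30     # EUR per kWh (assumption)

hours_per_year = 24 * 365
extra_kwh = EXTRA_WATTS * hours_per_year / 1000   # watt-hours -> kWh
print(f"{extra_kwh:.0f} kWh/year -> {extra_kwh * PRICE_PER_KWH:.2f} EUR/year")
```

At these assumed rates the extra hardware eats a few hundred euros per year, which has to be earned back by the additional drives first.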
1 core per node isn’t a hard requirement. Full nodes need almost no CPU and even very active nodes don’t need that much (it used to be higher during initial beta iirc).
So I’d say with 4 cores you would still have no problem running 16 nodes, because 15 of them will be full most of the time. Additionally, you don’t get more ingress/egress from 16 nodes vs. 1 node, so there really is just a small overhead to having additional nodes. If your hardware would be fine running one large node with 16 drives, it should be fine running 16 nodes too.
Nope, not even close. Recall that the ingress traffic is divided by all of the nodes. This means that the CPU burden is spread across the nodes as well.
The same is true for egress, since each node is managing less data.
Put it this way: one 8TB node is going to use roughly the same CPU as eight 1TB nodes, because the ingress and egress activity will be fairly evenly spread over them. Even if one node is “hot” on egress, a single large node would also have held those same “hot” pieces and would spend the same amount of CPU serving them.
This is a fascinating discussion, even if you lost me about half way through with the stats.
However, it all seems to hinge on the assumption that the manufacturers’ failure rates are correct. I get the impression that most ‘consumer drives’ specify an unrecoverable read error rate of 1 in 10^14 bits, while most enterprise drives are magically 10 times as reliable. I haven’t actually verified this, but it seems to apply to all brands. When you look at Backblaze’s stats you can see clearly that these numbers have been pulled out of a salesperson’s hat. That number would also mean you can read all the data off a 14TB drive only once, on average, before encountering a bit error.
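That last claim is simple arithmetic on the quoted spec: 14 TB is about 1.12×10^14 bits, so at 1 URE per 10^14 bits read you expect roughly one error per full read.

```python
# Sanity check of the "read a 14TB drive once before hitting a bit error"
# claim, using the spec-sheet consumer URE rate of 1 per 1e14 bits read.

URE_RATE_BITS = 1e14       # bits read per unrecoverable error (quoted spec)
DRIVE_TB = 14

bits_per_full_read = DRIVE_TB * 1e12 * 8          # TB -> bits
expected_errors = bits_per_full_read / URE_RATE_BITS
print(f"{expected_errors:.2f} expected UREs per full read")  # 1.12
```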
Drive capacities have steadily increased over the years, but have these numbers actually been changed? Modern bigger drives certainly don’t seem to fail any sooner than older drives did.
Without obtaining some hard and fast failure data I think we can argue over this until the cows come home. The only hard data I have seen is Backblaze’s and they seem highly model-dependent.
Backblaze doesn’t report UREs but whole-disk failures. Those are completely different things. And since they use their disks in redundant arrays, UREs happen without being noticed, because the array simply fixes them.
The rest of your post conflates these two things as well. They’re not the same thing.
P.S. Please show me a single salesperson who publishes lower stats than the product’s actual performance.
If their RAID fixes URE’s, can we use that, too? Did they roll their own?
If I understand it correctly, ZFS RAID also does this? Is that the solution, then?
All RAID with redundancy does this. The sector would be replaced by the HDD and the data corrected based on available data on the remaining disks. The problem arises during a rebuild of RAID5 because at that point there is no redundancy anymore, since one disk is already missing.
You can use RAID6 or RAID10, or ZFS with 2- or 3-disk redundancy. I can’t say which they use, because I don’t have inside knowledge, but I can say for certain that Backblaze doesn’t use RAID5.
That’s a lot clearer now, thanks.
I think I read that ZFS doesn’t fail the array with a URE during rebuild, but you lose one file and it logs that. So you could now restore that file from backup, then replace the next disk. If I understand and remember correctly… Must read up on this.
It has that capability, yes, but it probably depends on settings. You can do something similar with mdraid, I believe, but I think it defaults to failing the rebuild. Either way, you still have data loss.
Backblaze has a summary of how their infrastructure works. The tl;dr is there’s two types of storage pods. One uses ext4 on mdadm RAID6 and another uses Reed-Solomon to split each file into 20 pieces (any 17 needed to reconstruct the file) and stores those on 20 different servers, which use ext4.
Reading between the lines, I believe the RAID6 pods are considered legacy and most/all of the newly-deployed pods use Reed-Solomon.
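The durability of that 17-of-20 scheme can be checked with simple binomial math: a file is lost only if more than 3 of its 20 pieces disappear. The per-server failure probability below is purely an assumption for illustration, not a Backblaze figure.

```python
# Rough durability math for the Reed-Solomon scheme described above:
# 20 pieces, any 17 sufficient to reconstruct the file.
from math import comb

N, K = 20, 17      # total pieces / pieces needed (from the Backblaze summary)
p_fail = 0.01      # chance a given server's piece is lost (pure assumption)

# The file is lost if more than N - K = 3 pieces are gone.
p_loss = sum(comb(N, i) * p_fail**i * (1 - p_fail)**(N - i)
             for i in range(N - K + 1, N + 1))
print(f"P(file lost) ~ {p_loss:.2e}")
```

Even with an assumed 1% piece-loss probability, the file-loss probability lands orders of magnitude lower, which is the whole point of erasure coding over plain replication.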
I would not be so sure.
1 node will use 1 database and 1 Docker instance,
while 8 nodes will use 8 databases and 8 Docker instances simultaneously. RAM usage will definitely be much higher, as well as CPU load.
Your example may only be right when talking about network load. But even in the network case, there might be 8 times more “technical queries”, since besides the plain data each node will probably send its own queries to the satellites, things like audits etc.
If we exclude the 8 different databases, the 8 Docker instances, and each node’s technical queries to the satellites, and leave only the pure data and its resources, then in my opinion you would be right.
Docker containers add basically no overhead. That’s what they are designed to do, so don’t worry about that part.
As for the 8 databases, sure, but those 8 databases will each hold 1/8th of the data. There is of course some overhead to any additional process that needs to run, but the nodes are basically spreading out the work that needs to be done as well as the data that needs to be stored. Each individual process needs resources only for the work and data it is responsible for.
Almost all of the work the node does scales with the number of transfers and since that total number of transfers doesn’t change with the number of running nodes, they will use the same resources. What remains is the other processes, like sending orders, checking for versions, garbage collection etc. Instead of doing that once, you do it multiple times. However, the amount of orders and data for garbage collection again scales with the work done by each node. So even that is mitigated somewhat by spreading the work across nodes.
If the node software is programmed efficiently (and from what I can tell so far, it is) there would be a negligible difference between handling the same amount of traffic with 1 or 8 nodes.
I cannot upload files here. How would I do so?
Last time I sent it to someone who had clearance to upload files here.
I’ll upload it onto my server to share, though the link will become invalid in a few weeks…
—link deleted
see threads below
Have you heard of Storj DCS? First 150GB is free!
You might want to try transfer.sh, which uses Storj DCS for storage (at least it has been using Storj DCS; I don’t know if they still do).
Indeed, I haven’t had the time yet to install DCS properly. But the hint about transfer.sh is nice.
Here you go
or in my own DCS
Raid-Comparison_v04.xlsx
Hi IsThisOn,
Odd. I cannot confirm your contention. Here it works fine.
4 disks each 8TB makes 24TB in Raid5 and 16TB in Raid6.
20 disks each 10TB makes 190TB in Raid5 and 180TB in Raid6.
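Those capacities follow from the standard parity formulas: RAID5 sacrifices one disk for parity, RAID6 two.

```python
# Usable capacity for the parity RAID levels discussed above.

def raid5_tb(n_disks, disk_tb):
    """RAID5: one disk's worth of capacity goes to parity."""
    return (n_disks - 1) * disk_tb

def raid6_tb(n_disks, disk_tb):
    """RAID6: two disks' worth of capacity goes to parity."""
    return (n_disks - 2) * disk_tb

print(raid5_tb(4, 8), raid6_tb(4, 8))       # 24 16
print(raid5_tb(20, 10), raid6_tb(20, 10))   # 190 180
```

The outputs match the numbers in the post, so the spreadsheet’s math checks out here.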