Migrate from RAID5 to RAID6

I vote for:

Plug in a USB drive and start rsync running…

Of course, it’s all speculation on possible future failure. However, a 10TB drive is very likely to fail during an array rebuild…

Yes, it’s the only option I have left; I can only do it through USB. I will use software that I already use for making copies to the cloud, Air Explorer. Thank you!

It’s usually a good idea to use what you know. I haven’t used MS Windows in a very long time. However, for other readers who may be wondering… I think rsync is usable through WSL:
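Something along these lines should work from a Windows command prompt, assuming rsync is installed in the WSL distro; the paths are just placeholders for the node's storage folder and the USB drive as WSL sees them:

```
# first pass while the node is still running
wsl rsync -a --progress /mnt/d/storagenode/ /mnt/e/storagenode/
# stop the node, then run a second pass to pick up changes and remove deleted files
wsl rsync -a --delete --progress /mnt/d/storagenode/ /mnt/e/storagenode/
```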


During the process of transferring data to the USB disk (regardless of the software), should I keep the node turned off the whole time?

With the tool I mentioned, I can do the transfer without taking the node offline. The software reads the whole folder, builds a list of all the files in it (independently of any changes happening live), and then I tell it to start the copy. In a second pass I would do the same thing again, but this time with the node turned off, so that the copy really completes without missing any files; then I wouldn’t start the node again until the data has been moved back onto the RAID 6.

I know the five-hour maximum downtime for an inactive node is currently not being enforced, but my node will be offline much longer than that before I can finish the whole process, so I am afraid disqualification is imminent. And I hope the whole file-copying process works properly, of course. Thank you!


As far as I know, downtime is not a disqualifying parameter at this time. So, if you’re going to perform an operation that requires significant downtime, it’s a good opportunity to do it.

Worst case scenario, you lose some data and your current node gets disqualified. In that case, just start up a new node ID running on your new and (hopefully) less risky RAID 6 array.


I’m not familiar with the software you are using. Keep in mind that it’s also important for that sync software to remove the files that are no longer in the source location. If it can’t do that or you’re not sure, I would opt for stopping the node and doing a full copy.

Or use robocopy /MIR.
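For anyone going that route, a minimal sketch (the drive letters are placeholders for the node's storage folder and the USB disk):

```
robocopy D:\storagenode E:\storagenode /MIR /R:2 /W:5
```

/MIR mirrors the directory tree, so files deleted from the source since the last run are also removed from the destination; run it once while the node is up and once more after stopping it.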


Yes, it can do that. It has many options, and you can sync both to the cloud and locally; honestly, it is a very complete tool for Windows. Thank you!

@BrightSilence Good morning! And what about the option of using RAID 50? :upside_down_face:

Good morning to you too!

You mean combining two unreliable RAID5 arrays in (scary) RAID0? I think you can answer that question yourself. :slight_smile:

That’s like building a house of cards on top of two other houses of cards.

Haha, yes, it would be a huge disaster. It’s just that there’s no nested array layout that can be optimized for space.

The best option for eight disks is still RAID 6. Reads may be a little better than with RAID 5, but I don’t know if there will really be a noticeable difference in random reads, which is what Storj really needs.

Read performance is pretty similar between RAID5 and 6, but writes will be worse on RAID6. You could overcome that with RAID60 in larger arrays, but that will mostly help with larger writes. You could of course also add an SSD write cache. But honestly, neither should be needed for Storj.

The P440 already has 4GB of cache, and I have assigned all of it to writes on the RAID 5.

But I am very concerned about my high cancellation rate: it is 97%. I’ve been comparing with other nodes in a Telegram group here in Spain, and honestly my result is disastrous compared to other people whose cancellation ratio may be 60% or 70%.

I have pinged many sites and my whole network seems to be fine. How could I fix this terrible cancellation rate? It demoralizes me, having a node with 2.3 TB of storage and quite powerful hardware.

There’s probably nothing to solve, as it’s mostly a logging problem. I would be surprised if a large number of these actually led to files not ending up on your node. In my case it seems almost all of the "canceled" pieces end up on my node and get paid anyway. I wouldn’t worry too much about it.

Are the Asia, US and Saltlake satellites geographically located where their names indicate? I’m also thinking that I have a lot of files on these three satellites, and that I may not be able to compete geographically with people who live near them, assuming they really are on those continents. Could that be it?

I’m in the Netherlands and I don’t see that much of a difference between satellites. The location of the uplink (the customer) is actually the important part, though it’s reasonable to assume most customers would pick a satellite close to them.


It’s why big data has been quite fond of nested RAID 10.
I decided to go with a pool consisting of 3 raidz1 vdevs (similar to RAID 5, but better) of 3 disks each. This isn’t RAID 50: it doesn’t stripe data across the three raidz1s, but assigns writes to whichever vdev it needs to keep the pool balanced (I’m sure there are many criteria it looks at).
And since it has checksums and uses copy-on-write, it doesn’t need a second disk to tell which copy of the data is wrong.
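For reference, building a pool like that looks roughly like this; the pool name and disk identifiers are just placeholders (using /dev/disk/by-id paths is generally preferable in practice):

```
# one pool made of three raidz1 vdevs, three disks each
zpool create tank \
  raidz1 /dev/sda /dev/sdb /dev/sdc \
  raidz1 /dev/sdd /dev/sde /dev/sdf \
  raidz1 /dev/sdg /dev/sdh /dev/sdi
```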

Sure, this only gives me 6 disks’ worth of capacity on a 9-drive setup, but I get about 3 times the usual raw IOPS.
When estimating RAID speeds, the general rule of thumb is to think of one RAID stripe as a single disk: because data is striped across the drives, they need to stay in sync, so the IOPS of a RAID 0/1/5/6 array are roughly those of one drive.
Sequential read and write throughput is roughly the number of drives minus the redundant drives; for example, an 8-drive RAID 6 gets about 6 drives’ worth of throughput but only about one drive’s worth of random IOPS.

Of course, this is a general rule of thumb, not a perfect method…
I know there is some risk in running raidz1, but I kind of like my setup, and keeping the same ratio with raidz2 would require 18 disks to achieve the same raw IOPS.

I may change my mind about raidz1 in the future, but so far it has performed pretty damn well, and I’ve been very mean to my first ZFS pool setup. Also, if you see high latency on your array, you may have a problem with one drive that slows down the rest… sometimes to a crawl.

RAID 10 setups give you even more raw IOPS for the same drives, though the capacity drops from 2/3 to 1/2, so that’s a big thing. I’m sure raidz1 with 4 drives is perfectly valid too, but I digress. With RAID 10 you also get twice the reads per RAID 1 pair, writes are limited to one disk’s worth, and IOPS are also one disk’s worth; then you multiply by however many RAID 1 mirrors you stripe together. Eventually people go to nested RAID 10, which reduces capacity even further, but it does give you a ton of performance… a ton xD
I would recommend most people switch to ZFS, if they run Linux and are a bit technically minded.

Hello, good morning.

I have a total of eight hard drives, and with any combination other than RAID 6 I lose capacity. I could stay on RAID 5, but you already know what can eventually happen.

If I were willing to keep only half of the capacity I currently have, I could set up RAID 10, but 80 TB would leave me with 40, and then I’d still have to subtract what’s lost when Windows formats it. That’s a terrible loss of storage.

So, once I have made a full backup to the cloud, I will break the RAID 5 and build a RAID 6, whose read performance will be very similar to RAID 5. Thank you!

This is roughly true for reads, but definitely not for writes. Writes require parity information to be calculated and written to the parity disk. Depending on the RAID implementation, this can have a moderate to severe impact on write performance.

For RAID5/6 parity information is spread across all disks in the array, meaning that for every block the parity information is on different disks. This overhead is therefore also spread among all disks in the array. It also requires some other system resources to calculate, especially for the more complicated RAID6 parity calculation.

ZFS is a different beast and I don’t really know enough about it. But writes to the actual array are likely slower, as I believe ZFS stores parity information on dedicated parity disks, which could create a bit of a bottleneck. However, ZFS writes to the ZIL prior to writing to the actual array; by default the ZIL is placed on the disks in the array, but it can be moved to a dedicated device to speed up this operation. That cache can be saturated, though. There are people who are way better versed in ZFS to comment on the exact write performance impact. But in short, write performance is not nearly as simple as the number of disks minus parity.
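For anyone who wants to experiment with that, moving the ZIL onto a dedicated SSD (a SLOG device) is roughly a one-liner; the pool and device names here are placeholders, and it only helps synchronous writes:

```
# add a dedicated log (SLOG) device to an existing pool
zpool add tank log /dev/nvme0n1
# alternatively, a mirrored SLOG so a failing device can't take recent sync writes with it
zpool add tank log mirror /dev/nvme0n1 /dev/nvme1n1
```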

Sure, if you want to argue about maybe 20-30% performance in either direction and between different technologies… it’s just an easy way to get a rough estimate of where a certain RAID setup sits compared to something else.

Generally ZFS uses more IOPS because of its checksum/parity data; of course, fewer IOPS also means less speed, and there are of course some nuances… a whole lot of them :smiley:

From the benchmarks I have seen of RAID5 vs RAID6, they are pretty evenly matched if the arrays have the same number of "storage" drives.
ZFS doesn’t use dedicated parity disks. It can do parity and error correction at the single bit or byte level, but beyond that it relies on finding errors via checksums and replacing the bad block/record with a good copy from another drive.

ZFS RAID is a bit of a nightmare if one wants to dig into the details of how the raid levels work, because it has variable record sizes. Meaning that even if you write in 512K record sizes like I do, a 10K record (say, a small file) will only take up the 10K plus checksum plus redundancy.
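For the curious, the record size is just a per-dataset property, and it only affects newly written data; the dataset name here is a placeholder:

```
zfs set recordsize=512K tank/storagenode
zfs get recordsize tank/storagenode
```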

Hence blocks/records become a mismatched mess, which is why removing and adding disks to a raidz is very, very difficult in most cases.
So the easiest approach is to add whole raidz vdevs to the pool… of course, once added they are quite difficult to remove again :smiley: so that’s fun.

That problem aside, it is a great filesystem/volume manager.

@Robertomcat

I don’t enjoy paying 33% of my capacity for my 3×3 raidz1 layout, but in my case it’s the easiest, highest-performing and safest way to run my setup.

Granted, I could go with raidz2, which is the ZFS version of RAID 6, but then I run into the whole capacity and performance issue. My 9 drives could be 2× raidz2, but then I’m down to mirror-level capacity with none of the advantages. Or I could do a single raidz2 across all 9, which leaves 7 drives for storage, so only 2/9 of my capacity goes to redundancy, which is very good…

But the IOPS would only be like having a single drive hooked up to ZFS. Sure, I’d get redundancy and high capacity, but if I ever want to copy out my node, it might take weeks…
especially if one drive is acting up and the array/pool is in use.

Sure, I lose a good deal more; giving up 1/3 wasn’t really my first choice. But even if the pool is in use and one drive is acting up, migrating out of the pool can still "easily" be done, because ZFS will simply balance the load between the raidz vdevs: if it needs to read from the bad raidz, it will direct ingress to the other ones.

If you have a drive that is acting up in a RAID 6, the entire array will slow down to match it in order to stay in sync; of course, then you can pull it and effectively run RAID 5.
But often one isn’t aware of this kind of thing… one might simply have started a very important migration while about to leave for the weekend, and then come back to find it 30% done.

But hey, you’ve got a nice cache on your RAID controller, and you can most likely throw in an SSD for caching as well; I’m sure running RAID 6 will be fine.

I do like knowing I have some decent IOPS, and of course that I can expand easily by adding or upgrading 3 drives at a time. That makes my setup a whole lot simpler to manage long term.