Which kind of RAID should I use to set up my STORJ node? Which level is the best? RAID 0, 1, 2, 3, 4, 5, 6, etc.

Pietro.anon · October 1, 2020, 4:52pm

What do you think? I have read that RAID 5 will cut my hard drive space in half, but with RAID 0, if a hard drive breaks I will have a problem in STORJ because of the lost data. I would like to know which option is the best. Thank you.

littleskunk · October 1, 2020, 4:58pm

The answer is that no RAID is the best. Just spin up one storage node per hard drive and keep it running until the hard drive fails. It is cheaper to just accept the disqualification penalty than to add additional hard drives for a RAID.

Pietro.anon · October 1, 2020, 5:07pm

OK, but I don’t have that capacity on my hard disks. I have 3 disks, 320 GB each.

Pietro.anon · October 1, 2020, 5:08pm

What can I do instead?

littleskunk · October 1, 2020, 5:12pm

There is no minimum requirement. You can set up 3 nodes with 320 GB each. The reason we are telling people to share at least 500 GB is to make sure they get some kind of acceptable payout. With a 100 GB node we would expect that the payout is too low and the operator is going to stop it anyway. We would like to avoid that in the first place, and for that reason we expect everyone to share at least 500 GB. I would say in your special case 3 nodes with 320 GB each should be fine for us. Just spend a moment to read how much payout other nodes with similar space are getting.

cdhowie · October 1, 2020, 5:16pm

It is not true that RAID5 cuts your capacity in half. RAID10 would do this. Note that traditional RAID requires all drives to be the same size. If they are not, it treats each drive as though it were the size of the smallest drive in the array.

We typically talk about RAID in terms of N (the number of drives).

RAID0 has no redundancy. Any drive failure will destroy the whole volume. Capacity: N drives. Failure tolerance: 0 drives.
RAID1 is an N-way mirror. Capacity: 1 drive. Failure tolerance: N-1 drives.
RAID5 is like RAID0 but one chunk in each stripe contains a redundant parity value. Capacity: N-1 drives. Failure tolerance: 1 drive.
RAID6 is like RAID5 but with an additional parity value per stripe. Capacity: N-2 drives. Failure tolerance: 2 drives.
RAID1+0 is striping over mirrored pairs, and RAID10 is a specialized implementation of this. Capacity: N/2 drives. Failure tolerance: between 1 and N/2 drives, depending on which fail.

So if you have 3 320GB drives and use RAID5, you will have 2 * 320 = 640GB of capacity (a disk’s worth of data is redundant).
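
The capacity and failure-tolerance rules in the list above can be sketched in a few lines of Python. This is only an illustration: the function name `raid_summary` and its return shape are made up for this example, and it assumes the traditional RAID behaviour where every drive counts as the size of the smallest one.

```python
def raid_summary(level, drive_sizes_gb):
    """Return (usable_gb, (worst_case, best_case)) for a traditional RAID array.

    Hypothetical helper for illustration only. It does not check the
    minimum number of drives each level needs.
    """
    n = len(drive_sizes_gb)
    # Traditional RAID treats every drive as if it were the smallest one.
    smallest = min(drive_sizes_gb)

    # Usable capacity, counted in drives, following the list above.
    capacity_in_drives = {
        "raid0": n,
        "raid1": 1,
        "raid5": n - 1,
        "raid6": n - 2,
        "raid10": n // 2,
    }
    # Drive failures survived: (worst case, best case).
    tolerance = {
        "raid0": (0, 0),
        "raid1": (n - 1, n - 1),
        "raid5": (1, 1),
        "raid6": (2, 2),
        "raid10": (1, n // 2),  # depends on which drives fail
    }
    return capacity_in_drives[level] * smallest, tolerance[level]


print(raid_summary("raid5", [320, 320, 320]))  # (640, (1, 1))
```

For the three 320GB drives this gives 640GB usable and one tolerated failure, matching the RAID5 figure above.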

Pietro.anon · October 1, 2020, 5:16pm

Thank you for the answer…

SGC · October 1, 2020, 5:18pm

i run raidz1 which is the zfs version of raid5, just better because it has checksums and copy on write.
if you have to ask about what raid you should be running then you are most likely better off without it…

raid can be a very advanced subject, and there are lots of things to take into account… and almost every raid setup will depend on the user and use cases… they are basically never the same, but most often a customized setup fitting what one wants to use it for…

Pietro.anon · October 1, 2020, 5:19pm

OK, thank you very much… What would it be if I had 3 disks of different sizes? One is 320GB, another is 160GB, and the last one is 500GB.

cdhowie · October 1, 2020, 5:20pm

Traditional RAID will pretend all of the disks are 160GB and ignore the extra space.

Pietro.anon · October 1, 2020, 5:21pm

So I have a PC with those disks: one is 320GB, another is 160GB, and the last one is 500GB. Is there something I could do?

SGC · October 1, 2020, 5:25pm

with drives that small just make one storagenode per drive… it’s by far the best approach until you earn money for better HDDs

littleskunk · October 1, 2020, 5:25pm

There are ways to compensate for that. For example, you could combine the 320 GB and the 160 GB drives and then mirror them onto the 500 GB drive. It is not the best idea; I mention it more as an example that besides good old RAID there is now a middle ground.

I would still recommend just running 1 node per hard drive. Just share them all and take the extra payout as compensation for the small loss when one of the nodes gets DQed.

Pietro.anon · October 1, 2020, 5:27pm

You are great. I have read other chats in the Storj forum…

cdhowie · October 1, 2020, 5:29pm

Right you could make a “linear” RAID or LVM LV over the 320GB and 160GB disks then put that in a RAID1 with the 500GB disk, which would only waste 20GB of space on the 500GB disk.
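
To put numbers on the options discussed so far for the 320GB, 160GB and 500GB disks, here is a small Python sketch. The function name `compare_layouts` and the dictionary keys are invented for this example. The arithmetic only follows the rules stated in this thread and ignores filesystem and metadata overhead.

```python
def compare_layouts(drives_gb):
    """Usable GB for a few layouts of exactly three drives (hypothetical helper)."""
    small_a, small_b, large = sorted(drives_gb)
    # Linear concatenation of the two smaller drives.
    linear_gb = small_a + small_b
    return {
        # Traditional RAID5: every drive counts as the smallest one.
        "raid5_traditional": (len(drives_gb) - 1) * small_a,
        # Linear volume mirrored (RAID1) against the largest drive.
        "linear_plus_mirror": min(linear_gb, large),
        # Space left unused on the bigger side of that mirror.
        "linear_plus_mirror_wasted": abs(large - linear_gb),
        # One storage node per drive: all space shared, no redundancy.
        "one_node_per_drive": sum(drives_gb),
    }


for layout, gb in compare_layouts([320, 160, 500]).items():
    print(f"{layout}: {gb} GB")
```

With these disks the linear-plus-mirror layout gives 480GB usable with 20GB wasted, traditional RAID5 gives 320GB, and one node per drive shares all 980GB but loses a whole node when its drive fails.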

You could also use modern COW filesystems like btrfs in raid1 mode which will automatically organize data by mirroring different blocks over different pairs of drives.

littleskunk · October 1, 2020, 5:31pm

Yes, that is exactly what I wanted to point out. The only problem with that is the question of whether these new solutions are stable. So it comes with some tradeoffs.

Anyway, I would still run 1 storage node per hard drive. I am now saying that for the third time. I guess I should stop it^^

Pietro.anon · October 1, 2020, 5:31pm

Thank you very much for your valuable advice.

SGC · October 1, 2020, 5:32pm

zpool status
pool: opool
state: ONLINE
scan: scrub repaired 0B in 0 days 06:37:14 with 0 errors on Sun Sep 13 07:01:15 2020
config:

    NAME                        STATE     READ WRITE CKSUM
    opool                       ONLINE       0     0     0
      mirror-0                  ONLINE       0     0     0
        scsi-35000cca2556d51f4  ONLINE       0     0     0
        scsi-35000cca2556e97a8  ONLINE       0     0     0

errors: No known data errors

pool: rpool
state: ONLINE
scan: scrub repaired 0B in 0 days 00:06:12 with 0 errors on Sun Sep 13 00:30:16 2020
config:

    NAME                                           STATE     READ WRITE CKSUM
    rpool                                          ONLINE       0     0     0
      ata-OCZ-AGILITY3_OCZ-B8LCS0WQ7Z7Q89B6-part3  ONLINE       0     0     0

errors: No known data errors

pool: tank
state: ONLINE
scan: scrub repaired 0B in 0 days 15:25:56 with 0 errors on Wed Sep 16 13:26:48 2020
config:

    NAME                                         STATE     READ WRITE CKSUM
    tank                                         ONLINE       0     0     0
      raidz1-0                                   ONLINE       0     0     0
        ata-HGST_HUS726060ALA640_AR11021EH2JDXB  ONLINE       0     0     0
        ata-HGST_HUS726060ALA640_AR11021EH21JAB  ONLINE       0     0     0
        ata-HGST_HUS726060ALA640_AR31021EH1P62C  ONLINE       0     0     0
      raidz1-2                                   ONLINE       0     0     0
        ata-TOSHIBA_DT01ACA300_531RH5DGS         ONLINE       0     0     0
        ata-TOSHIBA_DT01ACA300_99PGNAYCS         ONLINE       0     0     0
        ata-TOSHIBA_DT01ACA300_Z252JW8AS         ONLINE       0     0     0
      raidz1-3                                   ONLINE       0     0     0
        ata-HGST_HUS726060ALA640_AR31051EJS7UEJ  ONLINE       0     0     0
        ata-HGST_HUS726060ALA640_AR31051EJSAY0J  ONLINE       0     0     0
        ata-TOSHIBA_DT01ACA300_99QJHASCS         ONLINE       0     0     0
    logs
      fioa2                                      ONLINE       0     0     0
    cache
      fioa1                                      ONLINE       0     0     0

my current setup

opool contains a new node because i haven’t gotten it moved yet, it’s a pool for my own stuff and is on a mirror because i had two sas drives that didn’t want to work with the other drives in the raid.

Tank is my main storage pool, it basically runs data across 3 x raidz1’s so that its iops is 3x what a single hdd setup on zfs would run at.

rpool is currently boot, but that will be removed soon to free up another sata port

and then on top i got a slog and running sync always to minimize fragmentation of data on the pool and a l2arc to minimize reads from often read stuff…

ofc this runs more than just the storagenode…
not much currently tho lol…

something people often forget about raid is the iops… most standard raid arrays only get iops for 1 drive worth, which means you end up having like 6 drives or 8 in a raid 6 and they all collectively work on an iops equal to a single drive… which means in some workloads the array is useless
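
The IOPS point can be written down as a rough rule of thumb: random IOPS scale with the number of vdevs (or arrays), not with the number of drives inside each one. The Python sketch below only illustrates that rule as described in this post. `estimate_pool_iops` is a made-up name and the 100 IOPS per HDD figure is a placeholder assumption, not a measurement.

```python
def estimate_pool_iops(num_vdevs, per_drive_iops):
    """Rule-of-thumb estimate: each raidz vdev or parity array performs
    roughly like one drive for random I/O (hypothetical helper)."""
    return num_vdevs * per_drive_iops


HDD_IOPS = 100  # placeholder assumption for a single spinning disk

# One big raid 6 array with 6 or 8 drives: still about one drive of IOPS.
print(estimate_pool_iops(1, HDD_IOPS))  # 100

# Three raidz1 vdevs, like the tank pool above: about three drives of IOPS.
print(estimate_pool_iops(3, HDD_IOPS))  # 300
```

Real numbers depend heavily on the workload, caching and the read/write mix, so treat this as a way to compare layouts rather than a prediction.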

cdhowie · October 1, 2020, 5:33pm

FWIW, btrfs has been pretty reliable since Linux kernel 4.4 (for me anyway), not to get too off-topic. Most of the problems I see these days are actually btrfs detecting corruption caused by hardware failure. If btrfs cannot repair the data from a good copy, it will loudly fail to read the data where other filesystems like ext4 will happily return bad data to the application. So you’ll see more errors with btrfs but that’s just because it knows when corruption happens and ext4 doesn’t. (ZFS also does checksumming and can transparently repair bad data from a good copy.)

littleskunk · October 1, 2020, 5:36pm

My experience is that ZFS detected that 2 of my hard drives are bad and I should replace them. The 2 storage nodes are still running fine and have not been disqualified. I will just take the extra money until the hard drives have too many failures or don’t even start.