Is there a way to add more hard drives to my node? If so how would I do that?
It depends on your current setup.
The easiest way is to move to a larger HDD (e.g. going from a 1 TB to a 3 TB HDD) and copy the data over with a program like rsync:
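A rough sketch of that migration (the mount points and container name are examples — adjust them to your own setup):

```shell
# Example paths only -- adjust mount points and service name to your setup.
# 1. First pass while the node is still running (copies the bulk of the data):
rsync -a --info=progress2 /mnt/old-disk/storagenode/ /mnt/new-disk/storagenode/
# 2. Repeat until little changes between runs, then stop the node,
#    e.g.: docker stop -t 300 storagenode
# 3. Final pass with --delete so the copy matches the source exactly:
rsync -a --delete /mnt/old-disk/storagenode/ /mnt/new-disk/storagenode/
# 4. Point the node's storage path at the new disk and start it again.
```

The repeated passes keep the final downtime short: only the files that changed since the previous run need to be copied after the node is stopped.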
The recommended way of adding new storage drives is to create a new node (and identity) per new drive.
Nodes can sit on the same device (if it's powerful enough), behind the same IP, but their traffic will be split (i.e. you won't get twice the traffic if you set up a second node at your place).
Oh, God… So I can't just add mount points for newly added HDDs under the same identity when starting the node?
@Rabinovitch I'm afraid not…
Welcome to the forum!
No, you can't do it with storagenode, but you can at the OS level. However, that would effectively be RAID0 — with one disk failure the whole node is lost.
So, it’s better to start a second node.
If you're more concerned about possible failures than about costs, you can create a RAID with redundancy — RAID5/6/10.
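For reference, the shape of that on Linux with mdadm would be something like the following — the device names are hypothetical and this is destructive, so verify everything with `lsblk` first:

```shell
# Hypothetical device names (/dev/sdb ... /dev/sdp) -- check with lsblk first!
# Creates a RAID6 array (survives two simultaneous disk failures),
# then formats and mounts it.
mdadm --create /dev/md0 --level=6 --raid-devices=15 /dev/sd[b-p]
mkfs.ext4 /dev/md0
mkdir -p /mnt/storagenode
mount /dev/md0 /mnt/storagenode
```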
Thanks, I'm about to build a RAID5 right now. :) Not sure if RAID6 is better for 15x10 TB disks… I hope RAID5 will be enough.
Better to make more nodes in different locations — you'll never fill up your 125 TB node.
And it's a very high risk: if the RAID fails, it kills all the data, and other errors can do the same.
Even in one place it's more secure to run several nodes. Software errors also happen.
Let's try and see. Who, if not me? ))
Anyway, I'd be glad to read some other opinions.
Are there any statistics on the average stored data per node?
I have nodes with 4 TB, and in 6 months they're not full yet.
It’s a very bad idea, especially with large disks:
When one of the disks fails, you would replace it and start a full rebuild of the array. With today's disks it's almost guaranteed that at least one more disk will fail during the rebuild, and you will lose the whole array.
Please, NEVER use RAID5 with large consumer disks — it's a time bomb.
I've also heard it's not recommended to use disks from the same batch in a RAID — they tend to fail at about the same time.
What nonsense! Did you mean resync? Rebuilding the whole array should take about 15 hrs in my case. Why should a modern disk (another one) fail during the 15-20 hrs right after the first disk failed?
Anyway, even if I use RAID6 (is that OK?) — what is an “optimal maximum” volume for a Storj node?
Because of the probability of hitting an unreadable bit. With large disks it's almost 100%, and one is enough to fail the rebuild: to reconstruct the data from parity, the RAID5 system must read ALL remaining bits.
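To put rough numbers on that — a back-of-the-envelope sketch, assuming every bit fails independently at the vendor-rated URE rate, which real drives only approximate:

```python
# Chance of hitting at least one unrecoverable read error (URE) while
# rebuilding a degraded RAID5, assuming independent bit errors at the
# vendor-rated rate. Illustrative figures, not measurements.
import math

def p_ure(bits_read: float, rate_per_bit: float) -> float:
    """Probability of >= 1 URE over bits_read bits: 1 - (1 - p)^n,
    computed stably for tiny p and huge n."""
    return -math.expm1(bits_read * math.log1p(-rate_per_bit))

# Rebuilding a degraded 15 x 10 TB RAID5 means reading all 14
# surviving disks in full:
bits = 14 * 10e12 * 8

print(f"consumer   (1 per 1e14 bits): {p_ure(bits, 1e-14):.1%}")
print(f"enterprise (1 per 1e15 bits): {p_ure(bits, 1e-15):.1%}")
```

Under those assumptions the rebuild reads roughly 11×10^14 bits, so a consumer-rated array is almost certain to hit a URE, and even an enterprise-rated one fails the rebuild about two times out of three.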
I have had a lot of problems with RAID5 arrays in the past. Please consider using RAID10 or at least RAID6, if you want to spend disks on RAID anyway.
I just don't want the hassle of juggling many independent nodes. But maybe I will start with 2-3 nodes with 10 TB each.
I can understand. Everyone picks their own way to keep busy.
RAID5 on large disks, especially at that scale, is a proven way to have plenty of busy time. Good luck!
I agree with Alexey, RAID5 is way too risky. However, I have rebuilt RAID5 arrays in the past with 8TB disks. I don’t agree with the “almost guaranteed to fail” part. But it’s more risk than you should be willing to take. Honestly, with 15 HDDs even RAID6 seems a little risky. I don’t like more than 10… maybe 12 HDDs in RAID6, as the risk goes up the larger the array is. I know ZFS allows you to add even more parity disks. It may be an option for your setup. Beyond that you could even consider a RAID60 setup, which would put 2 or more RAID6 arrays in a RAID0 setup. But… it’s still very unlikely that all of that space will fill up with any setup. Nodes that have been around since day 1 have accumulated less than 5TB so far. Just keep that in mind.
My advice would be to start with 2 or 3 nodes. Start them one at a time to prevent them all having to be vetted at the same time. Start one, wait until it’s vetted, then start the next one. I don’t know if starting more than three right now is useful, as they’ll be sharing data and you’d just be running more HDDs drawing more power for no good reason. You can always start a new one when the previous ones fill up.
Because the probability of failing is measured in bits read, not time spent reading…
Also, this whole issue has nothing to do with the drive being connected via SATA as all those articles imply. There are of course SATA connected drives with lower read error rates than cheap desktop drives. Please stop spreading the urban legend of “SATA” being the problem here.
bits read in comparison to drive size…
Standard consumer desktop spinning platter HDDs have a rated value of 1 error in 10^14 bits.
Typical enterprise drives are rated at 1 error in 10^15 bits. And some more expensive drives go as high as 1 error in 10^16 bits.
- Consumer drives: 10^14 / 8 / 1024^4 = 11.37 TB read per expected error
- Enterprise drives: 10^15 / 8 / 1024^4 = 113.69 TB read per expected error
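Those two figures check out (with “TB” here meaning binary terabytes, 1024^4 bytes, as in the division above):

```python
# Bits read per expected error, converted to terabytes (1024^4 bytes),
# matching the consumer/enterprise figures quoted above.
BITS_PER_TB = 8 * 1024**4

print(f"consumer:   {1e14 / BITS_PER_TB:.2f} TB")  # 11.37 TB
print(f"enterprise: {1e15 / BITS_PER_TB:.2f} TB")  # 113.69 TB
```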
I have been running RAID5 on consumer drives for as long as I can remember. I typically have 1 or 2 drive failures per 3-year time frame. All of my drive failures have been physical failures of the heads or platters.
Typical desktop drives are rated for 50% or less power-on time. So, running a desktop drive in a server workload such as Storj will exceed most consumer desktop drives’ rated power-on time and data transfer limits. Given that, the rebuild is the lesser risk: the drives are much more likely to fail from the stress of just running 24/7 than from reading 11.37 TB during a potential array rebuild… especially if the drives themselves are less than 4 TB and only 60% full.
And please note, the reported problem is only a problem during rebuilding of an array.
There are differences between consumer (usually SATA) and enterprise (usually SAS) disks, as described by @beast above.