Simple way to add more drives to a node

Please add a function that lets the user add more drives to one node, like in the “StorjShare” program.


You’d be better off running multiple nodes. For example, if you have added 4 disks to your node and one disk starts having issues, you can lose the entire node. With one node per disk you’d only lose one node and continue receiving money for your other nodes.

Keep in mind that you want nodes to survive as long as possible; young nodes (especially those < 6 months old) aren’t very profitable. By using multiple disks for a single node, you increase the points of failure (unless you put them in a redundant setup, which is not recommended either, because you lose valuable storage space to redundancy).

And how many nodes can I run on one pc?

As many as you like. There’s not really a limit; with a 4-core CPU and 4 GB of RAM you should be fine running 4-8 nodes.

Another great advantage of having multiple nodes is the IO load put on the HDDs… the ingress will be split between the two nodes, or however many you have, and thus there is less strain on the drives… however, if you are buying new drives, make sure they are CMR drives, because SMR drives rarely play well with Storj.

Another thing worth keeping in mind is HDD RPM; 7200 RPM drives are rarely much more expensive, and they have much lower seek times… it doesn’t seem like it matters much when one looks at 3-4 ms vs 2-3 ms seek times, but it does nearly double the IO a disk can perform… maybe not quite, but close…

Aside from that, don’t use USB drives if you can avoid it, and do make sure your drives are rated for 24/7 usage…

Enterprise-rated drives are much more reliable… not sure how much that really matters at present, because the Storj throughput is fairly low… an enterprise-rated SATA drive is usually rated for around 550 TB of yearly throughput, with a 5-year warranty…

…while a consumer-grade HDD is rated for 180-220 TB of yearly throughput and has at most a 2-year warranty…

The enterprise drives in some cases cost no more than 20% extra… so they are very much worth considering…

Ofc some would say enterprise should only run SAS… but yeah, SAS prices are much higher because those drives more or less have redundancy on everything, plus great diagnostic tools and sensors… there is no doubt SAS is superior, but I’ve been buying SATA thus far…

Maybe I should make an HDD buyer’s guide… :smiley:


He probably didn’t ask for Storj to implement software RAID; I presume what he actually wanted is the ability to add more nodes per machine (one per drive) within the same GUI.

Then I would suggest you create a separate feature request for that in the SNO feature requests category.

Or I can move this one there. But you should explicitly describe what you want.
What do you think, @Flo123456789?

I think that would be a nice feature for the future. One node should be able to manage multiple drives/partitions. That saves quite a bit of overhead.

So, as far as I’m concerned, you can move it. Or I can write one.


I like the idea, but it’s simply not a viable concept in present-day Storj…

Really, when using more than 4-5 drives you should be running with redundancy, meaning a RAID-type setup; HDDs are simply too unreliable. Bunching them together in one node is, I’m sure, possible, but I doubt it’s practical, because for one you run into the DQ issue: a node using multiple HDDs would have to be able to lose data without getting DQ’d. And while that seems totally viable from a SNO’s perspective, it’s difficult to argue against the fact that the further data loss spreads through the network before being addressed, the more expensive it becomes to repair… thus the logical method would be to repair it locally, or work with redundancy…

And thus this feature is doomed from the start; physics is literally against it… and not like gravity is against planes, more like the sound barrier… simply too expensive to try to break for commercial flight to make sense.

Some reasons / math:
What’s the point of such a feature? Each drive is a point of failure at about a 2% annual failure rate.
If we say the avg SNO using such a feature has 5 drives, that means roughly a 10% AFR, so 1 in 10 SNOs using this feature would fail every year; rough math…

Ofc the question also becomes how much data a node can lose before DQ… 10%, 5%, 2.5%?
If it was 5% or so, it would sort of be possible, but then once one is up to 20 drives, that’s a 40% AFR.
Which ofc wouldn’t matter if the node can lose data… but again, rebuilding data is difficult, and it doesn’t become easier once it leaves your system, even if it then isn’t the individual SNO’s problem.
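The rough math above can be sanity-checked; here is a short Python sketch (an illustration of mine, assuming independent drive failures and treating the per-drive AFR as a probability) of the chance that a multi-drive span loses at least one drive in a year:

```python
# Probability that a node spanning n independent drives loses at least
# one drive in a year, given a per-drive annual failure rate (AFR).
def span_failure_probability(n_drives: int, afr: float) -> float:
    return 1 - (1 - afr) ** n_drives

# 5 drives at 2% AFR: close to, but slightly below, the additive 10%.
print(f"{span_failure_probability(5, 0.02):.2%}")   # 9.61%
# 20 drives at 2% AFR: noticeably below the additive 40%.
print(f"{span_failure_probability(20, 0.02):.2%}")  # 33.24%
```

So the "1 in 10 per year" estimate for a 5-drive span holds up as a rough figure, even though the probabilities don’t strictly add.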

I would suggest a read of the Backblaze reliability reports.

Also, AFR percentages are not really additive like other numbers.

Here’s an excerpt:

Computing the Annualized Failure Rate

Throughout our reports we use the term Annualized Failure Rate (AFR). The word “annualized” here means that regardless of the period of observation (month, quarter, etc.) the failure rate will be transformed into being an annual measurement. For a given group of drives (i.e. model, manufacturer, etc.) we compute the AFR for a period of observation as follows:

AFR = (Drive Failures / (Drive Days / 366)) * 100


  • Drive Failures is the number of drives that failed during the period of observation.
  • Drive Days is the number of days all of the drives being observed were operational during the period of observation.
  • There are 366 days in 2020, obviously in non-leap years we use 365.

Example: Compute the AFR for the Drive Model BB007 for the last six months given:

  • There were 28 drive failures during the period of observation (six months).
  • There were 6,000 hard drives at the end of the period of observation.
  • The total number of days all of the drives of drive model BB007 were in operation during the period of observation (6 months) totaled 878,400 days.

AFR = (28 / (878,400 / 366)) * 100 = (28 / 2,400) * 100 = 1.17%

For the six month period, drive model BB007 had an annualized failure rate of 1.17%.
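The formula from the excerpt is easy to reproduce; here is a minimal Python sketch of the worked BB007 example (the function name is mine, the numbers come from the excerpt above):

```python
# Backblaze-style AFR: failures per "drive-year", expressed as a percentage.
# 366 is used for 2020 (a leap year); use 365 for other years.
def annualized_failure_rate(failures, drive_days, days_in_year=366):
    return failures / (drive_days / days_in_year) * 100

# Worked example: 28 failures over 878,400 drive-days.
afr = annualized_failure_rate(28, 878_400)
print(f"{afr:.2f}%")  # 1.17%
```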

I think this conversation is veering in the wrong direction. While I agree with @SGC that adding multiple drives to a single node is a bad idea, it is of course possible to provide GUI software that helps you manage and set up multiple nodes. This is exactly what the StorjShare v2 software did as well, and also what the community-provided Toolbox by @vadim does.

I would say something like that should probably be the long-term goal: all-in-one software that lets you set up nodes and create identities, as well as monitor them all.


I think I made my point, and the argument is solid; I think you may have misunderstood my meaning of 10%… I am referring to the fact that a node with 5 drives in what is essentially a span will have about a 10% chance of a single drive failing… which will cause DQ.

Sure, I do the numbers with a margin of error by rounding the 1.XX% average of the most recent Backblaze yearly reports up to a 2% AFR… that way I get a higher rate of failure, but really, when calculating failure rates, one wants to overestimate rather than underestimate… and keep in mind Backblaze vets their drive models these days before they put in mass orders… that will also offset the AFR.

I’m not against the idea; I’m saying physics is, and that is a fight one will always lose…

I think @BrightSilence’s suggestion is one viable option… and it might be the only way forward for this idea…

Each drive will either need to be run as its own node, or one will need some kind of RAID / redundancy setup so that one can lose a drive… because drives die…

I have already taken out 3 old drives that I had been running 24/7 and casually used for well over 2 years, and still they didn’t survive the storagenode workload for long before starting to throw errors.

That would be enough to take down an entire node, and it doesn’t even have to be a catastrophic disk failure… a bad cable will give you corrupt data, and you have 5 times the failure points in cables…

It’s a bad idea… to have 1 node with many drives and no redundancy… hell, I would even argue that it’s most likely not wise to run a storagenode on a single hard drive either, because it will only give you grief long term… which is why I don’t…

I was not disputing that, just pointing to a large body of AFR-tracking evidence and how much of it is “wait and see.”

I agree that it’s a risk-management balancing act when you want to store more data than your drive size allows.
At work, I do RAID all day and laugh at anyone who attempts to tell me I should just throw my SAN shelves into JBOD mode, or who has used a hardware RAID card from LSI (currently they’re all RAIDZ2, or mirrored RAIDZ2, with a maximum vdev width of 10 drives).

I think the important thing that has been pointed out is that you’re not supposed to build too much redundancy into your individual setup, as a lot of the redundancy is actually upstream of you in Storj itself, with the 80 erasure-coded pieces (data + parity) of each segment being handed out one per /24 network.

While I can understand the desire to just keep expanding your install to earn more, as I myself am currently migrating from an 8-year-old 4x1TB WD Green md-RAID5 setup to a 2x12TB ZFS mirror, Storj will only be getting a portion of that; the rest will be my PC backups, other local NAS duties, and whatever isn’t taken up by Proxmox VM disks.


I’m pretty new to zfs, but I have settled on a 3-drive raidz1 model, or I did a month ago… now I’ve changed my mind, and the next upgrade will be a 4-drive raidz1, which I believe should be fine for what I’m using it for. I did migrate through a couple of pools to get to this point already, so I’m getting fairly comfy with zfs.
I only have 9-10 working bays in my server currently, so I did 3x raidz1 of 3 disks each, because I want good IOPS and affordable expansion options… I can see a 10-drive raidz2 working fine on a large system, but without multiplying the number of raidz’s or mirrors, write IOPS will be limited to single-disk performance.

Storj’s redundancy is their deal… I run redundancy because I don’t want to have to deal with all kinds of issues, and already my setup of old drives and old hardware has saved my pool from dying many times, generally because of my limited knowledge: I had mixed sas and sata drives in the same pool / vdev / backplane, which apparently can, and did, cause all kinds of issues…

So I put my two sas drives in a mirror for my own personal usage, and isolated them on an HBA port.
And now it seems to run fairly smoothly, even if I have a 3tb drive in the 3x raidz1 pool that seems to be showing signs of failing; I haven’t gotten around to replacing it… thus far no errors, just high latency…

It’s very nice to be able to just pull drives or replace cables without having to worry too much about the pool… I wouldn’t mind running mirrors, but 50% capacity loss… that’s just a bit too steep a price IMHO. Ofc mirrors do make life so much easier, and read speed and read IOPS are the best possible with that setup…
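For reference, the capacity trade-offs being weighed here are easy to tabulate; a hypothetical Python sketch (mine, ignoring ZFS metadata and padding overhead, which costs a few extra percent in practice):

```python
# Usable fraction of raw capacity for a single vdev of equal-size disks:
# (data disks) / (total disks).
def usable_fraction(total_disks, parity_disks):
    return (total_disks - parity_disks) / total_disks

print(f"2-disk mirror:  {usable_fraction(2, 1):.0%}")   # 50%
print(f"3-disk raidz1:  {usable_fraction(3, 1):.0%}")   # 67%
print(f"4-disk raidz1:  {usable_fraction(4, 1):.0%}")   # 75%
print(f"10-disk raidz2: {usable_fraction(10, 2):.0%}")  # 80%
```

This is why the jump from a 3-drive to a 4-drive raidz1 mentioned above is attractive: the same single-parity cost buys a larger usable share.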

Lots of considerations can go into one’s pool design… but yeah, I may have trusted HDDs a bit too much until 8 years or so ago, and now I hope zfs will be the end-all solution I can continue with for a long, long time… I mean, drives are kinda reliable, but the fact is that a single drive cannot keep data stable long term… especially not if one starts copying stuff between drives often…

Also, I only plan on running 1, or maybe 2, storagenodes… and maybe on the same pool :smiley:
I had a pretty good zfs thread going a while back… but it seems to have stalled…

I had gotten an LSI MegaRAID controller, a really nice 16i one, but decided to replace it with low-profile HBAs (12i is enough for a 12-bay backplane) plus an external port, and go zfs instead…
Then I could get rid of the riser card in my 2u server and have all my pcie slots accessible.
Not that I use them yet, but then I can upgrade this old box to 10gbit if I want, and add a 3rd HBA, so I would have 1x 8i HBA, 1x 4i4e HBA and 1x 8e HBA; thus I could hook up a, I forget the number, something like 8080 sff connector, and get multiple disk shelves hooked up in a “SAS network” or whatever one calls one of those disk shelf / DAS loops.

SAN is nice, but a bit outside my use cases atm, though I do really like the age-old datacenter / data storage idea of removing the storage from the servers; it makes life so much easier… but I had to start somewhere, and a nearly antique 12-bay storage server seemed like the perfect test bed to abuse, so I could get my feet wet using real enterprise gear and setups.

Pretty happy with that choice… not so happy with Proxmox tho, but I will endure it for now… until I get better at it or decide on something else…

Proxmox’s ZFS functionality, as far as the UI goes, is limited.

Even for creating my 2x12TB mirror, I had to drop to the CLI. This was fine, though, because I’m more comfortable running the zpool commands myself and knowing exactly what goes into creating my pools (is compression turned on and which algorithm, is dedup turned on, what is the ashift, and which pool had block-size adjustments for VMs vs. SMB/NFS shares, etc.).

I also, at another business site, do a lot of work on HCI with Proxmox & Ceph, specifically micro-clusters of <10 nodes. There too, Proxmox does not support a whole lot of the things I need to do (custom EC profiles and CRUSH maps, pool tuning, painless OSD swaps/upgrades, etc.), but it gets the bare basics going.
