Performance: Optimal RAID Stripe Size And Read Ahead For HDDs In RAID 50?

As the title says: what is best for STORJ node performance? A stripe size between 128 KB and 256 KB is supposedly better when accesses are typically small files, and from what I see the node hosts many small files. What about read-ahead?

yeah, there are like multiple millions of files early on with nodes.
i would go with a 128KB stripe, basically the smallest possible, to reduce overhead.
read ahead i think would be a yes; really the best thing you can add is an SSD cache.
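For reference, Linux read-ahead is configured per block device in 512-byte sectors. A minimal sketch (the device name is a placeholder and the set command needs root, so those lines are shown commented out):

```shell
# /dev/sda is a placeholder -- substitute your RAID volume. Needs root:
#
#   blockdev --getra /dev/sda        # show current read-ahead (in sectors)
#   blockdev --setra 1024 /dev/sda   # set to 1024 sectors (512 KB)

# The common default of 256 sectors works out to 128 KB:
ra_sectors=256
echo "$(( ra_sectors * 512 / 1024 )) KB"
```

Note the setting does not persist across reboots; a udev rule or boot script is needed to make it permanent.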

personally i run a 256KB recordsize on my ZFS RAIDZs, but ZFS has dynamic record/stripe sizes.
the default on ZFS is 128KB, but 256KB runs a bit better for migration.
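As a sketch of the ZFS knob being discussed (the dataset name tank/storj is hypothetical — substitute your own pool/dataset):

```shell
# recordsize is an upper bound, not a fixed allocation: a 4 KB file still
# occupies a small block, so larger values mainly affect big files.
zfs get recordsize tank/storj        # ZFS default is 128K
zfs set recordsize=256K tank/storj   # applies to newly written data only
```

Existing data keeps the recordsize it was written with, which is why a migration (rewrite) is needed for the new value to take full effect.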

best thing is to test it out; most setups behave in their own way, because nobody has the exact same setup, so one can only really follow general rules.

but yeah, nodes have a ton of small files; they seem to get written, grow to size, and end up as 2MB files… so any sort of write cache does wonders for performance.

I run hardware RAID with FBWC on HP machines, so I guess the defaults are good. I use Ubuntu, and the default read-ahead is 256. I would prefer RAID 50 because of the maximum storage capacity and better redundancy than RAID 5 alone, just in case… I cannot argue that RAID 10 is better for write performance, but I believe that in this scenario, for a STORJ node, it is pointless to waste space on RAID 10, so RAID 50 is definitely my thing.

As for testing it, I don't really know how, because I have a case where a node with the same HDDs and same RAID setup, but more RAM and more, faster CPUs, performs about twice as badly as another node with less RAM, a lower CPU core count, and a lower CPU frequency. I know it may be early to say, but… has anyone actually experimented with performance optimization? I also set Static High Performance Mode in the power regulator, set performance as the scaling governor, enabled overcommit, and set transparent hugepages to madvise.
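The Linux tunables mentioned above can be set roughly like this (a sketch; everything needs root, and the sysfs paths are the standard ones on mainline kernels, so the lines are shown commented out rather than as something to run blindly):

```shell
# CPU frequency scaling governor, for every core:
#   echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

# Transparent hugepages to madvise (only THP-aware apps get hugepages):
#   echo madvise | tee /sys/kernel/mm/transparent_hugepage/enabled

# Memory overcommit (1 = always allow overcommit):
#   sysctl -w vm.overcommit_memory=1
```

None of these persist across reboots on their own; sysctl settings belong in /etc/sysctl.d/ and the sysfs writes in a boot-time unit or script.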

a storagenode will run on a toaster. as for RAM, it doesn't use too much either; a node running correctly will use less than 1GB total.

if the storage cannot keep up, RAM usage goes to the moon… even my 16TB node uses less than 512MB RAM on average, and like 3-5% of a single CPU thread.

latency (network / internet / storage) is generally what determines the performance of a storagenode.
the worse it gets, the fewer uploads and downloads succeed, meaning less profit and slower growth.

otherwise it's all just dealing with the semi-substantial IOPS a node can use, which is usually what causes the latency, on the storage side at least.
that can be tough for some single-drive SMR HDD nodes.
even so, most nodes running on a single HDD have success rates in the 97%-or-better range… usually…
it also depends a bit on network load.

but really there isn't much to improve upon, unless a node is underperforming, which is abnormal… and then it's almost always the storage IOPS that cannot keep up.

by testing i mean: run a node for a while and see how it goes on your setup… then you can try migrating it if the storage solution isn't performing like you want it to…

defaults are usually good, or defaults for high-IO / small-file / database-like workloads,
because that's sort of what a storagenode is: a massive database of customer data pieces which can be requested randomly at any time.

So I guess I am good. I have a 100 Mbps unmetered connection and SAS-2 6Gbps HDDs in RAID 50 with the controller's default hardware RAID settings. Should be good enough, with nothing to optimize. I was wondering whether it would then be a good idea to set the machines to low power, to reduce electricity costs.

the only power management that matters is on the HDDs themselves, and for them it should be disabled: there will always be activity, and drives falling asleep only to spin up again can cause excessive load cycles…
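On bare drives this can be checked and disabled with hdparm (a sketch; /dev/sdX is a placeholder, everything needs root, and drives behind a hardware RAID controller usually aren't reachable this way — use the controller's own power-management settings there):

```shell
# /dev/sdX is a placeholder; may be ignored behind hardware RAID.
#   hdparm -S 0 /dev/sdX      # disable the standby (spin-down) timer
#   hdparm -B 254 /dev/sdX    # high APM level, avoids aggressive head parking

# Watch the SMART load-cycle count over time to confirm it stopped climbing:
#   smartctl -A /dev/sdX | grep -i load_cycle
```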

usually that's not an issue, and i would think your hardware RAID deals with it.

but yeah, if you are just running a storagenode, there isn't any point in pushing performance.
golem is pretty neat for running on servers if they are powered anyway… it will put your compute to work for earnings.

OK. Got it.

As to Golem - I know it and run some nodes, too. Thanks! But we discuss STORJ here, anyway.

16 TB + RAID 50 = 32 TB storage capacity?
What does your business case look like?
Just curious.

i only run this ZFS beasty :smiley:

svet0slav was the one with the RAID 50

97.5 allocated and 99.8 free? :thinking:

Btw 100 TB ??!!


Stick to the node performance topic, please. :slight_smile:


I prefer hardware RAID, and RAID 50 seems fail-safe enough, with relatively high capacity compared to other configurations.


i wouldn't worry about it; storage nodes take literally a year to pass the 1TB range.
for new nodes anything will do; as nodes grow older, migrating and making sure they don't die becomes more important.


Sure. Thing is, I do not plan to migrate them at all. I'm even wondering what to set the RAID controller cache to. 50% read / 50% write? They seem nearly identical.

there are a ton of high-IO writes; i might go 20% read / 80% write.
especially if you can change it later… if you cannot change it later, i might go 50/50, just because one cannot go wrong with that.

One can change it later, but you have to reboot the machine, go into the advanced storage administrator, update it, and reboot again… The node will be down for like 20 minutes because of that change. LOL
Set and forget is better. I was wondering what is best; I'm checking other nodes' stats to determine it by egress/ingress.

the megabytes / kilobytes can be deceiving; you will want to look at the IO.
in my case at least, my write IO is many multiples of my read IO most of the time.

Not sure how to check IO on hardware RAID. Maybe do some extra math with iotop.

i use zpool iostat -v, because that's my ZFS option for such things.
never really tried using linux without it; i'm sure iotop is fine, though you want to be able to see it over time…
i usually run zpool iostat -v 3600, which gives a 3600-second (i.e. hourly) average.
it's almost always several multiples or even orders of magnitude higher for writes compared to reads.
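For a non-ZFS box, iostat from the sysstat package gives a comparable per-device view over time (a sketch; the device name and interval are placeholders):

```shell
# Extended per-device stats, averaged over 3600 s intervals, mirroring the
# hourly zpool iostat habit above; the r/s and w/s columns are read/write IOPS:
#   iostat -dx 3600 /dev/sda

# One-shot averages since boot, for a quick read-vs-write comparison:
#   iostat -d /dev/sda
```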


I don't know… I'm not fond of software RAID. Still wondering whether to use RAID 60 instead of RAID 50, though. Less space, but since there are no backups - more secure.