RAID controller or HDD slow?

Does anyone know how to tell whether it’s the RAID controller or the HDDs that are slow?
I have an RS2WC080 controller with 8 × 4 TB purple HDDs.
They are at 100% workload all the time; each HDD is one node.
I see ingress is only 30-60 Mbit/s for all of them together, so the traffic itself is not very big for the HDDs.

The controller appears to be only PCIe 2.0, though.

The 100% activity counter means that the drive was active and did work for the entirety of the measurement period. As far as I remember, that period is the last 10 seconds on most systems. The throughput of Storj is not very impressive, but the IO load certainly is.
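
On Windows you can sample the underlying counters yourself with the built-in typeperf tool - a quick sketch (counter names assume an English-language system; -si is the sample interval in seconds, -sc the sample count):

  typeperf "\PhysicalDisk(_Total)\% Disk Time" -si 1 -sc 10
  typeperf "\PhysicalDisk(_Total)\Avg. Disk Queue Length" -si 1 -sc 10

A queue length that stays high while throughput is low is the classic signature of a disk that is IO-bound rather than bandwidth-bound.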

Let’s take the worst case: your RAID card is PCIe 2.0 in x8 mode. That still gives it a theoretical maximum of 4 GB/s of throughput - a figure you would not get near even if all eight drives did a constant 300 MB/s. My money is on IO performance, which HDDs are very limited in because of the physically moving actuator heads.
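
The back-of-the-envelope numbers, for reference (PCIe 2.0 runs at 5 GT/s per lane, which after 8b/10b encoding leaves roughly 500 MB/s usable per lane):

  8 lanes  × 500 MB/s = 4,000 MB/s bus ceiling
  8 drives × 300 MB/s = 2,400 MB/s worst-case sequential load
  30-60 Mbit/s ingress ≈ 4-8 MB/s of actual traffic

The bus is nowhere near saturated; the drives are spending their time seeking, not transferring.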

To improve performance, there are all the “normal” things to do (a config sketch for the first two follows the list), which includes:

  • Moving databases to SSDs
  • Changing log type to warn
  • Adding additional RAM to your system
  • Putting a cache in front of your disk arrays
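
For the first two items, a minimal sketch of the relevant storagenode config.yaml entries (the SSD path is just a placeholder - move the database files there while the node is stopped, then restart):

  # config.yaml (one per node)
  log.level: warn
  storage2.database-dir: D:\ssd\node1-db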
  • Moving databases to SSDs - done from the beginning
  • Changing log type to warn - done from the beginning
  • Adding additional RAM to your system - 16 GB in the system, using 8-9 GB

What is strange is that the response time is very high; on another 4 TB HDD of the same model in another PC I get 3-4 ms.

That’s probably normal: the used-space filewalker on a 5400 RPM HDD would likely take a day per TB to run.

If the same drive works better on plain SATA instead of the controller, then the controller/PCIe is to blame. Stop all nodes and run only one; see if it makes any difference.

Thank you for the idea. It didn’t really help, so it’s the HDD that is slow.


If the drive is doing any sort of filewalker, it tends to peg the drive at 100%. By default the used-space filewalker starts when you run a node, and it can take hours (or days if the disk is large and/or fragmented). An 8 TB ext4 drive took me about 8 days to complete.
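
If the startup scan itself is what hurts, it can be turned off in the node’s config.yaml (a sketch - note the used-space stats will then only refresh on later scans):

  # skip the used-space filewalker on node start
  storage2.piece-scan-on-startup: false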

The controller, while ancient, is probably okay. The indication of a bottleneck there would be if activity on one or two drives caused others to slow down as well.


Not necessarily the HDD’s fault.

Also, tools (like HD Tune Pro) don’t show us, SNOs,
whether Windows is making good use of the HDD.
An HDD might look good in speed tests, but what Windows will do with it is a whole different story:

From my various tests I’ve seen a 16 TB disk, full of Storj’s files, 53 months old, never defragmented, behave slowly with small files on one PC, while on another PC it was suddenly fast (speed tests and access times under HD Tune Pro were always equally good - they indicate nothing). And that fast result was achieved with a connection via a cheap PCIe 1x → 6 SATA adapter! All those tests brought me to the conclusion that Windows can mess things up on its own, out of thin air, if it likes.

Do a blob folder properties test under Windows. If it can count files at something like 10,000/s, then it’s fast, and such a setup has no problem with the filewalker in Storj’s app. If it’s 100/s, it’s slow as hell and filewalking will take days or even two weeks.

Perform “Analyze” from Windows’ Drive Optimization tool. That can unleash the disk’s speed like a wizard’s magic wand (no need to press “Optimize”, just “Analyze”, and sometimes it doesn’t even need to finish - I’ve noticed it sometimes helps from around ~67% - but it’s a short operation anyway, so just let it finish).

Best results come with 8dot3name creation turned off and other goodies, like last-accessed time turned off - You know all that. On bare metal the magical “Analyze” works until the next disk dismount (by restart, or by attaching the disk to a VM). It works best if the disk was formatted under the Windows version on which it runs. Just do the same Storj blobs folder “right click → Properties” test: if it counts files just as slowly, restart the PC and do the Analyze again. If the disk wasn’t formatted in this very Windows instance (version-wise, like Win10 vs. Win7), it may not help much. So far, it has always worked when the disk was formatted in the very Windows instance You want it to stay and work in.
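
For reference, the same test and tweaks from an elevated command prompt - a sketch, with D: and the blobs path as examples:

  :: rough file-count speed test over the blobs folder
  dir /s /b D:\storagenode\storage\blobs | find /c /v ""
  :: analyze only - no defragmentation pass
  defrag D: /A
  :: disable 8.3 short-name creation on this volume
  fsutil 8dot3name set D: 1
  :: stop updating last-access timestamps (system-wide)
  fsutil behavior set disablelastaccess 1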

If You are curious, You can check the disk with CMD:
fsutil fsinfo ntfsinfo d: (or whatever letter Your drive is)

In the output, check the LFS version.

It should be 2.0 - that’s the newest it can be.
If the LFS version is 1.1, it means it’s been downgraded for compatibility with Windows 7 and older.
You can also enable the “NtfsDisableLfsDowngrade” option in the Windows registry to prevent that; I just don’t know how much it will listen to You.
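
If You want to try it, the change would look something like this from an elevated prompt (a reboot is needed afterwards; verify the LFS version with fsutil before relying on it):

  reg add "HKLM\SYSTEM\CurrentControlSet\Control\FileSystem" /v NtfsDisableLfsDowngrade /t REG_DWORD /d 1 /f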
@Vadim taggin’, because I edited so much.


I do have a PCIe gen 2 x8 controller in my box, and it’s not a bottleneck for HDDs. (It is for SSDs though.)

This describes the cache for metadata. When you run “Analyze”, it caches the MFT. So, if the disk is used right after that, it won’t allow the OS to invalidate the cache.

Not sure - the MFT is in the GBs (80 GB for a 16 TB disk), and I haven’t noticed any gains in RAM usage after that.

This is the new normal: TTL deletes, filewalkers, GC.

I have never seen my 20 TB disk go lower than 95%, and it’s read/write cached via PrimoCache: a 500 GB NVMe read cache and 1 GB RAM write cache.

You have a terabyte of RAM?

No, it’s GB. My bad.
Edited - it was written in a hurry.
