Node tanking Unraid parity check speeds

fromnack · January 4, 2023, 8:26pm

Hi there

I’ve been running my node on Unraid for almost two years now, and parity checks have run with no issues during that time. But over the last few days, the current parity check has been running at under 10MB/s. As part of the usual troubleshooting steps I stopped all Docker containers and the speed went back to the usual 130+MB/s. I then started my containers back up to see when the speed dropped again. Unfortunately it is the node that drags it down: I tested taking it down and bringing it back up, and every time the check speeds up and slows down with it. I can’t see much IO happening on the array, certainly not enough to account for a drop of over 120MB/s in the parity check speed.

Does anyone have experience with this, or any ideas for what I can try? I have an 18TB parity drive which takes over 36 hours to check, so taking the node offline until the parity check completes isn’t viable if I want to keep the node healthy.
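For a sense of scale, here is a back-of-the-envelope sketch of how long a full pass over the parity drive takes at a given speed. It assumes a constant speed across the whole drive and decimal units (1 TB = 1,000,000 MB), which a real check won't match exactly; the function name is just for illustration.

```python
def parity_check_hours(parity_size_tb: float, speed_mb_s: float) -> float:
    """Hours needed to read a parity drive end to end at a constant speed."""
    size_mb = parity_size_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    seconds = size_mb / speed_mb_s
    return seconds / 3600


# Compare the usual speed with the degraded one for an 18 TB parity drive.
for speed in (130, 10):
    print(f"{speed} MB/s: {parity_check_hours(18, speed):.1f} hours")
```

At 130MB/s that works out to roughly 38 hours, but at 10MB/s it is about 500 hours, so waiting it out isn’t realistic either.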

Thanks

Pac · January 4, 2023, 9:24pm

Not sure about Unraid, I’m not familiar with it.

But nodes do run the filewalker regularly (and every time you or the auto-updater restart them) which is very intensive IO-wise.
I’m surprised you don’t see high IO activity after restarting your node. Unless you disabled the filewalker, which is a pretty recent option?

During the filewalker process, IOPS literally hammer the drive and its performance can be highly degraded.
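If you want to check this yourself, here is a minimal sketch that samples `/proc/diskstats` twice and reports IOPS and MB/s per device. It assumes a Linux host with the standard diskstats layout (sector counts are always in 512-byte units); the function names are mine, not part of any tool. Lots of small random reads show up as high IOPS with low MB/s, which may be why the array doesn't look busy if you only watch throughput.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text):
    """Map device name -> (reads, sectors_read, writes, sectors_written)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or malformed lines
        stats[fields[2]] = (int(fields[3]), int(fields[5]),
                            int(fields[7]), int(fields[9]))
    return stats


def rates(before, after, seconds):
    """Per-device (IOPS, MB/s) between two snapshots taken `seconds` apart."""
    result = {}
    for dev, a in after.items():
        if dev not in before:
            continue
        b = before[dev]
        iops = ((a[0] - b[0]) + (a[2] - b[2])) / seconds
        mb_s = ((a[1] - b[1]) + (a[3] - b[3])) * SECTOR_BYTES / 1e6 / seconds
        result[dev] = (iops, mb_s)
    return result


def sample(interval=5.0, path="/proc/diskstats"):
    """Take two snapshots `interval` seconds apart and return the rates."""
    with open(path) as fh:
        before = parse_diskstats(fh.read())
    time.sleep(interval)
    with open(path) as fh:
        after = parse_diskstats(fh.read())
    return rates(before, after, interval)


# Example (Linux only): print(sample(5.0))
```

Run it once with the node stopped and once with it running, and compare the IOPS column for the array disks.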

fromnack · January 4, 2023, 9:35pm

I haven’t made any manual changes and I’d never heard of the filewalker, so I reckon it’s still enabled.

I’ve been doing lots of testing, stopping and starting the container, so maybe I’ll leave it running overnight to finish whatever it’s doing. Hopefully the performance returns to normal after that.

Thanks!

Morcin42 · January 5, 2023, 8:38am

This is an Unraid issue, not a Storj issue. The way Unraid works means that it’s fast when you read or write through the cache. It’s also close to single-disk performance if you are writing to a SINGLE disk. If, however, you are running multiple operations against multiple disks, performance will go off a cliff. You can mitigate this by changing a few settings. That will make it a bit better, but it will not completely fix the issue. Unraid is just not meant for sustained read/write operations, especially while running a parity check.

In Unraid, go to Settings → Disk Settings. The following three options are the important ones:

Tunable (md_num_stripes)
Tunable (md_sync_limit)
Tunable (md_write_method)

By default, md_num_stripes is optimized for 512MB of RAM. My server has 32GB of RAM, so I increased this value to 8192. If you don’t have that much RAM, you can try 4096.

md_sync_limit alone will probably give you the biggest boost to parity check speed. Raise it to 80. This means that 80% of IO is dedicated to the parity check. Your docker containers (including storj) will be slower; but will still work. Play around a bit with this setting until you have found something that suits you. I have found that 80 works for me.

md_write_method should be set to reconstruct write, also called turbo write on the unraid forum. It will spin up all drives, so your server might use a bit more energy, but it will be significantly faster in most cases.
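To sum up the three values in one place, here is a small sketch. The helper name is made up, and the 32GB cut-off is only my rule of thumb from my own server, not an official Unraid recommendation. It changes nothing on the server; you still enter the values by hand under Disk Settings.

```python
def suggested_tunables(ram_gb: float) -> dict:
    """Starting values from this post; tune them for your own hardware."""
    return {
        # 8192 worked on a 32 GB server; 4096 is the fallback for less RAM.
        "md_num_stripes": 8192 if ram_gb >= 32 else 4096,
        # Share of IO given to the parity check; 80 is what worked for me.
        "md_sync_limit": 80,
        # Also known as "turbo write"; spins up all drives.
        "md_write_method": "reconstruct write",
    }


for name, value in suggested_tunables(16).items():
    print(f"{name} = {value}")
```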

fromnack · January 5, 2023, 2:01pm

Yes!! Thank you so much for your response. I’d spent so many hours trying different things, but this worked perfectly! I have a parity check running at 140MB/s with the node online. I’m glad I decided to ask for help on this, as I was considering exiting altogether and calling it quits!

Thanks again for your help, really appreciate it!