Node raid-5 ssd cache

At the moment a software mdadm RAID-5 is being built on 4 IronWolf 8TB drives. The node is temporarily running on a single 10TB drive.
I think this is the best time to ask the following questions and count on advanced operators to share their experience :slight_smile:
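For reference, building such a 4-disk array looks roughly like this (the /dev/sd[b-e] device names are placeholders for the four drives):

```shell
# Create a 4-disk RAID-5 array (device names are examples).
mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde

# Watch the initial sync progress.
cat /proc/mdstat

# Persist the array definition so it assembles at boot.
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
```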

I have two 128GB SSDs lying idle and thought of using them as a cache. I don’t know which solution makes the most sense for a node (bcache, lvm cache, or something else), or whether it is worth the effort at all.
Maybe make a raid-0 of these 2 SSDs and put the cache on that?
I also wonder what the wear level of the 128GB SSDs will be after the initial rsync has moved almost 10TB onto the raid.

Make it a RAID-1. Yes, the SSDs will wear out fast, but Storj benefits most from read/write caching; read-only caching has limited benefits. With write caching you will want the redundancy, otherwise you’ll lose your node if a single SSD fails. Which… is kind of only a matter of time. If you don’t want to use up the drives in a few years, definitely test how it performs without a cache first. It’s a costly addition and usually not needed.
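A minimal sketch of that mirrored write-cache setup, assuming the two SSDs are /dev/sdf and /dev/sdg and the node’s data LV is storj/data (all names here are examples):

```shell
# Mirror the two SSDs so a single SSD failure doesn't lose dirty cache data.
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdf /dev/sdg

# Add the mirror to the node's volume group and attach it as a cache.
pvcreate /dev/md1
vgextend storj /dev/md1

# Writeback mode caches writes as well as reads; this is what makes the
# mirror necessary in the first place.
lvcreate --type cache --cachemode writeback -l 100%PVS -n ssdcache storj/data /dev/md1
```

Removing the cache later with `lvconvert --uncache storj/data` flushes dirty blocks back to the HDDs and leaves the origin LV intact.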

Also, I’d like to go one step further and suggest you run separate nodes on independent drives. The performance will be much better because all operations only have to hit a single drive. In that setup you definitely won’t need the SSD cache. Of course this advice only applies if you don’t intend to use these HDDs for anything else.


My default idea, certainly a good and safe option, was not to use a cache or SSDs at all. At this point data security is very important to me; the node has been running since 2020 and has accumulated almost 10TB. Hence the raid-5: simple and secure, except of course for cases like two drives dying at once.
The idea for the more than 20TB of space is to create an LVM VG and LVs on top of md and use some of the space for other things; filling an additional 10TB with the node is a matter of years anyway.
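Layering LVM on top of the md array for that is straightforward; a sketch with example VG/LV names:

```shell
# Use the whole md array as a single physical volume.
pvcreate /dev/md0
vgcreate storj /dev/md0

# Carve out an LV for the node, leaving free extents in the VG
# for other uses (or for growing the node LV later with lvextend).
lvcreate -L 20T -n node storj
mkfs.ext4 /dev/storj/node
```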

Or the very often overlooked human error, which I reckon has a much higher chance of happening than two simultaneous drive failures.
I started like you with one big node on a large array (though this was an existing multi-use array with spare space, so not entirely the same). But I’ve since expanded with separate disks, which now make up more than half of the stored data/income. It would still suck to lose my big node, for sure. But the risks are spread, and losing any of my other nodes is not a big deal. At the same time I can now share every last byte with Storj without needing redundancy, and performance on the single-disk nodes is fine without an SSD cache (which can’t be said for my large node on the large array).

Edit: I missed this part

I’d say that’s a legitimate reason to use an array. Hopefully your array will be fast enough to not need an SSD. For what it’s worth, I’m using SHR-2 (Synology’s implementation of LVM on stacked md-raid RAID6 arrays), which is a lot slower on writes than RAID5 would be.


RAID-5 works like a raid; nothing unusual.
On the other hand, I was annoyed by the noise the drive heads made, so I added a 500GB NVMe drive as an lvm cache.
The difference is huge: iowait dropped several times over, and so did the noise level.
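The iowait drop is easy to watch live with iostat from the sysstat package:

```shell
# Show CPU iowait and per-device utilisation, refreshed every 5 seconds.
# With a working cache, %iowait and the HDDs' %util should both fall.
iostat -x 5
```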
The cache has only been running for a few days, so it’s too early for final statistics; for now:

- Cache Usage: 24.9%
- Metadata Usage: 2.4%
- Read Hit Rate: 65.0%
- Write Hit Rate: 36.9%
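These numbers can be read straight from LVM’s reporting fields; one way to get them (assuming the cached LV is storj/node, an example name):

```shell
# Report cache occupancy and hit/miss counters for a cached LV.
lvs -o lv_name,data_percent,metadata_percent,cache_read_hits,cache_read_misses,cache_write_hits,cache_write_misses storj/node

# The raw dm-cache counters are also available from device-mapper.
dmsetup status storj-node
```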

For a visible difference in the storagenode’s own dashboard you will have to wait a while, of course :wink:

Huge difference also after node restarts / updates…

Node restarted at 16:44; the filewalker was active until 18:40 with a nearly constant 80% hit rate on my L2 read cache (node size 16TB).