Yeah using partitions is pretty sweet! In the special-metadata example you can have a couple SSDs… then carve off pairs of partitions and mirror them for your pools. Like if your first Storj HDD is 10TB… make a pair of 50GB partitions and mirror them as your special-metadata. Next HDD is 18TB? Slice off a pair of 90GB partitions etc.
Using the 5GB/TB guideline… a pair of 1TB SSDs should cover a 24-bay JBOD of 8TB’ish HDDs. Or you get the idea…
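A quick back-of-envelope check of that (the 24-bay / 8TB figures are just the example above; swap in your own numbers):

```
# 5GB of special-metadata space per raw TB of HDD
echo "$((24 * 8)) TB raw -> $((24 * 8 * 5)) GB of metadata space"   # 192 TB raw -> 960 GB
```

So a mirrored pair of 1TB SSDs covers it with a little headroom.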
So you're saying you could take two partitions and make a mirrored metadata vdev out of them, rather than first mirroring the whole SSDs and then taking a slice of that mirror?
BTW - enabling L2ARC for my Storj pool seems to have reduced the amount of noise from my server-grade HDDs.
You’d typically leave both SSDs separate…then use a tool like fdisk/sgdisk to make a partition on each… then use those two partition names in your “zpool add” command and specify them to be mirrored. You don’t need to mirror the entire drives together first.
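A minimal sketch of that, assuming two empty SSDs at /dev/sda and /dev/sdb and a pool named "tank" (all placeholders — on a real system prefer the stable /dev/disk/by-id names):

```
# Carve a 50GB partition off each SSD (sized per the 10TB-HDD example above)
sgdisk -n 1:0:+50G /dev/sda
sgdisk -n 1:0:+50G /dev/sdb

# Add the pair as a mirrored special (metadata) vdev to that HDD's pool
zpool add tank special mirror /dev/sda1 /dev/sdb1
```

Note that only newly written metadata lands on the special vdev; anything already on the pool stays on the HDD until it gets rewritten.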
Yep, I've done this. I had a single large SSD that wasn't really being utilized much, and now it has about 8 small partitions being used as metadata-only L2ARC for 8 nodes.
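For reference, a minimal sketch of that per-node setup (the pool name and partition label are placeholders):

```
# Attach one SSD partition as L2ARC (cache) to one node's pool
zpool add node1 cache /dev/disk/by-partlabel/l2arc-node1

# Only cache metadata in the L2ARC for that pool
zfs set secondarycache=metadata node1
```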
And so just put the databases in their own dataset with a small record size? I am not keen to store small files in the special vdevs. The old hardware I’m thinking of using consists of 8x 3 TB ~5900 RPM disks, 16 GB of memory (I might be able to swap motherboards to one with 32 GB), and I don’t want to spend a lot on SSDs.
Carving off pairs-of-SSD-partitions as special-metadata mirrors for each Storj HDD works very well. I wouldn’t worry about adding small-files support, unless you have lots of spare flash space… as the file metadata is the important part.
I'd just have one separate mount on an SSD and bind/redirect all your databases to directories inside it. No need to mirror it, since Storj databases are 100% disposable (they only hold stats).
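For a docker node, this is roughly how I remember the database relocation working (paths, wallet, address, and sizes are placeholders, and the exact config option and mount point may differ by version — check the current Storj docs before copying):

```
# "docker run" with one extra bind mount pointing the databases at the SSD
docker run -d --name storagenode \
    -p 28967:28967/tcp -p 28967:28967/udp -p 14002:14002 \
    -e WALLET="0x..." -e EMAIL="you@example.com" -e ADDRESS="your.ddns.net:28967" -e STORAGE="2TB" \
    --mount type=bind,source=/mnt/node1/identity,destination=/app/identity \
    --mount type=bind,source=/mnt/node1/storage,destination=/app/config \
    --mount type=bind,source=/mnt/ssd/node1-dbs,destination=/app/dbs \
    storjlabs/storagenode:latest

# And in the node's config.yaml, point the databases at that mount:
# storage2.database-dir: dbs
```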
Similar:
I have a single SSD that I have partitioned so I can use a separate L2ARC for each node's hard drive.
And I already relocated the Storj databases to an SSD via the built-in docker options.
Incidentally, I learned something about my own ZFS setup (8 nodes on 8 drives, and one SSD with an L2ARC for metadata).
When I had less RAM allocated to my NAS software in general, about 20GB total, I had about a 10GB ZFS ARC and all the drives suffered from consistently moderate-to-high utilization. The SSD with the L2ARC wasn't really helping out that much.
I allocated more RAM, about 30GB total, so that the ZFS ARC was about 20GB. Utilization across all the hard drives then went way down, and I could see more activity on the SSD.
If I read arc_summary correctly, just the headers for the L2ARC data were taking up about 5GB of RAM, so maybe I was constrained on RAM earlier and something bad was happening.
TL;DR: the ZFS L2ARC may indeed require "enough" RAM allocated to the regular ARC before the L2ARC itself can help much.
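If you want to check that yourself on Linux/OpenZFS, the L2ARC header overhead shows up in the kernel's ARC statistics (a sketch; the field name is from current OpenZFS):

```
# RAM consumed by L2ARC headers
awk '/^l2_hdr_size/ { printf "L2ARC headers: %.2f GiB\n", $3 / 2^30 }' /proc/spl/kstat/zfs/arcstats

# Or skim the ARC / L2ARC sections of:
arc_summary
```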
Storj databases don't use much, but it seems to be all over the place: I've seen some as high as 1.5GB per used-TB. If I were going to create a partition to handle the DBs for multiple nodes… I'd use 3x the raw TB space (as GB). So if it was something like 10x 10TB HDDs… that's 100TB… so I'd make a 300GB SSD partition to hold those 10 node databases.
It would probably get no more than 25% used… but then I would never think of it again.
Edit: If you're tight on space… well… those databases only hold stats… so they can be erased at any time and won't affect your payouts (just the numbers in the GUI, until the next month begins).
Thank you! Yes, I will be tight on SSD space. I’m mostly using old hardware (8x 3 TB spinning rust), but bought a couple used 150GB DCS3520 drives. Now I see I probably should have gone for slightly bigger drives. I might see if I can throw an NVMe adaptor into the system and find something suitable for databases, since it doesn’t matter if it dies.
Which gets more write IOPS, the special vdevs or the databases? They're both effectively databases for the same data, so I'm thinking they should be on the same order of magnitude.
If you are using the piecestore backend and have the extra space and a decent amount of RAM, I would personally recommend doing so. If you are using the new hashstore backend then no, it has no effect.
Now with the new databases being so small, I could make my life simpler by making a separate dataset for each database, then setting special_small_blocks=512 on that dataset, and it should go to the special device SSDs, correct?
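For reference, a sketch of that idea on recent OpenZFS releases (dataset name and sizes are placeholders). One caveat from the ZFS property docs: only blocks at or below the special_small_blocks threshold are written to the special vdev, so for the whole dataset to land there the threshold needs to be at least as large as the dataset's record/block size, not 512:

```
# Route an entire (small) database dataset to the special-vdev SSDs
zfs create -o recordsize=16K -o special_small_blocks=16K tank/storj-dbs
```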