ZFS dataset recordsize=? for storagenode

Hey @vanees,

“Recordsize” mostly affects disk IOPS, not your ability to store data efficiently. Since STORJ data is stored at the object level rather than the block level, piece sizes vary and there is no single maximum file size you can tune for. I’ve seen a wide range of stored file sizes on the STORJ v3 network, so truthfully I wouldn’t worry about it. Recordsize is only an upper bound anyway; ZFS will size records to fit smaller files on its own. Unless you’re pushing through LOTS of disk access requests per second, per disk (like hundreds), and know exactly what you’re optimizing for, you’re probably not going to see any benefit from adjusting this value.
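If you’re curious what your own node is actually storing, a rough sketch like the one below will bucket the piece files by size so you can see the spread for yourself. The blobs path is just an example; point it at wherever your storagenode actually keeps its storage directory.

```python
#!/usr/bin/env python3
"""Rough survey of piece-file sizes under a storagenode's blob directory."""
import os
from collections import Counter

# Assumption: adjust this to your own node's storage location.
BLOBS_DIR = "/tank/storagenode/storage/blobs"

buckets = Counter()
for root, _dirs, files in os.walk(BLOBS_DIR):
    for name in files:
        try:
            size = os.path.getsize(os.path.join(root, name))
        except OSError:
            continue  # file may have been deleted mid-walk
        # Bucket by power-of-two size class (4 KiB, 8 KiB, ...)
        bucket = 4096
        while bucket < size:
            bucket *= 2
        buckets[bucket] += 1

for bucket in sorted(buckets):
    if bucket < 1024 * 1024:
        label = f"<= {bucket // 1024} KiB"
    else:
        label = f"<= {bucket // (1024 * 1024)} MiB"
    print(f"{label:>12}: {buckets[bucket]}")
```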

Furthermore, since the data is encrypted, you won’t see much benefit from ZFS compression or deduplication. You also effectively lose about 20% of space on top of what the parity disks cost you, because ZFS performance degrades badly once a pool goes past roughly 80% capacity, so you want to keep that headroom free. So:

Config 1: 4x 2TB disks w/ raidz2
Total raw capacity: 8TB
Parity disks used: 2
Parity capacity lost: 4TB
Overhead lost @ 20%: ~800GB
Remaining usable: 3.2TB

Config 2: 5x 2TB disks w/ raidz2
Total raw capacity: 10TB
Parity disks used: 2
Parity capacity lost: 4TB
Overhead lost @ 20%: ~1.2TB
Remaining usable: 4.8TB

Doesn’t look so pretty.
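If you want to play with other disk counts or sizes, here’s the same back-of-the-envelope arithmetic as a quick sketch. The 20% headroom figure is the rule of thumb from above, not anything ZFS enforces, and it ignores real-world details like padding and metadata overhead.

```python
def usable_capacity(disks: int, disk_tb: float, parity: int = 2, headroom: float = 0.20) -> float:
    """Rough usable capacity for a raidz vdev after parity and free-space headroom."""
    raw = disks * disk_tb
    after_parity = raw - parity * disk_tb
    return after_parity * (1 - headroom)

print(f"Config 1: {usable_capacity(4, 2.0):.1f} TB usable")  # 3.2 TB
print(f"Config 2: {usable_capacity(5, 2.0):.1f} TB usable")  # 4.8 TB
```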