well it's also quite a large range… 4k to 64k is a 16x increase
that would sort of reduce your IO by that same factor.
ZFS runs a 128k recordsize, but with an 8k volblocksize actually written to disk, at least in ****ing proxmox… tho i'm told that writing anything smaller than 64k on 4k-sector drives decreases their performance.
most likely due to the modern use of larger files rather than many small ones
ofc the space wasted can grow the bigger the blocks you write at a time. i'm currently running a 256k recordsize in zfs (on an 8k volblocksize), and 256k is one of the block sizes that fits best into the max storj data file, only wasting a few %
2319872 bytes per file / 1024 = 2265.5k, meaning storj picked a weird size
but if you imagine it written out in blocks and then see how filled the last block will be… that's how much waste you get.
the bigger the blocks, the more waste can be created, and the more bandwidth is used for writing something like 1kb to the drive if every block must be 64k or 256k
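just to make that waste math concrete, here's a rough python sketch (nothing storj-specific about the helper itself, the file size just gets plugged in further down):

```python
import math

def block_waste(file_size, block_size):
    """How many blocks a file occupies and how much of the allocated space is padding."""
    blocks = math.ceil(file_size / block_size)           # the last block gets padded out
    allocated = blocks * block_size
    wasted = allocated - file_size                        # unused bytes in that last block
    return blocks, wasted, 100.0 * wasted / allocated     # waste as % of space on disk
```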
and then there are the previously mentioned io considerations… i wouldn't stay on 4k blocks… if memory serves 32k is also pretty good, but you should really check… maybe i should make a list…
that size is on almost every file, so much so that you barely need to check how much space you've used, just count the number of files and multiply by that.
the larger the blocks get, the less IO advantage you gain, because you will always be required to use a minimum amount per file, so just going up to maybe 16k could be a great idea… tho then we have to make sure that 16k doesn't line up terribly with the filesize
2265.5kb file / 16k = 141.593
the fractional part after the whole number is what's interesting… first off, each block would be less than 1% of the file, because there are 141 of them, so you would never get a deviation bigger than 1% in added space usage.
the .593 is what is written in the last block… meaning 40.7% of that block is empty
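plugging the 16k case into the block_waste sketch from above confirms those numbers (142 actual blocks on disk, since that .593 still needs a whole block):

```python
blocks, wasted, pct = block_waste(2319872, 16 * 1024)
print(blocks, wasted, f"{pct:.2f}%")
# -> 142 blocks, 6656 bytes unused in the last one (~40.6% of it), ~0.29% of the total allocation
```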
and you would have 141 parts of the file the hardware would have to figure out how to get, which can put a strain on a bad system, but most are pretty smart about it today. like say if it knows you want all those pieces, then it might just get them all at once if they are written sequentially on disk.
but smaller blocks can, and in some cases will, cause additional IO, while larger blocks waste space…
if you have a blocksize of 512k, then it writes the whole file in 4.42 blocks… so 4 are full and the 5th, or 20% of the total space used, is less than half filled, meaning something like 10% of your disk space would be wasted, but you would be down near the minimum possible IO to move the file… no matter the blocksize
ofc in this case, if your system wants to change a bit in your database on the drive… it will (i think) read the block into memory and then rewrite the entire block, meaning that 512k vs 4k could in some cases require 128 times the memory bandwidth; it would also allocate 128x the memory for such tasks even if most of it is empty, and the drive would spend 128x longer writing it back, much of it zeroes.
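a rough sketch of that read-modify-write cost, assuming the whole block really does get read in and rewritten to change a single byte (worst case, not necessarily what every system does):

```python
# data moved to change one byte, if the whole block is read in and written back
for block_size in (4 * 1024, 512 * 1024):
    moved = 2 * block_size                                # read the block + rewrite it
    print(f"{block_size // 1024:>3}k block: {moved // 1024} KiB moved for a 1-byte change")
# 512k vs 4k ends up 128x more data moved (and 128x the memory held) for the same change
```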
so it very much becomes a trade-off… stick around the recommended defaults for whatever system you are working with (within the sort of spec big databases use), but adjust the block sizes depending on whether you need more IO (working with a single disk) or want basically minimal IO at other costs
16k looks pretty good, less than 1% wasted space… that's pretty neat… 32k is also pretty solid
above 64k i wouldn't recommend it, the costs most likely outweigh the benefits… tho zfs runs a default recordsize of 128k, but i don't think that's the regular block size, because it has a volblocksize which for me is 8k, which limits my throughput…
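and since i said maybe i should make a list, here's the quick version against that 2319872-byte storj file (same arithmetic as the sketch earlier, just inlined):

```python
import math

file_size = 2319872                                       # the storj piece size from above
for kib in (4, 8, 16, 32, 64, 128, 256, 512):
    block_size = kib * 1024
    blocks = math.ceil(file_size / block_size)
    pct = 100.0 * (blocks * block_size - file_size) / (blocks * block_size)
    print(f"{kib:>3}k: {blocks:>3} blocks, {pct:5.2f}% wasted")
# comes out roughly: 4k-32k under 0.3%, 64k-256k around 1.7%, 512k around 11.5%
```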
so yeah i would recommend
16k or 32k blocksize
tho i wouldn't call myself an expert on the topic, i sure would like to get rid of the 8k volblocksize on my zfs pool… people say i should go 64k for throughput… but pretty sure i'll go with 16k or 32k so as to be able to run regular things on the pool also…