Is the vacuum necessary?

SGC · May 23, 2020, 2:21pm

@alexey i suppose this one shoud also go in the is vacuum necessary?
because i don’t understand why something is needed doesn’t mean i don’t need it…
i got an open mind which is why i actually bother starting to research the topic of vacuum since people keep talking about it in relation to the db locked issues… which i’ve also been seeing alot at times.

maybe i need to vacuum, maybe i don’t… if nothing else it’s an interesting concept i like to understand a bit more about… ignorance is a dangerous thing, but then again so is a little knowledge lol so yeah…

i duno sorry, pretty sure if i did vacuum zfs would do it in ram maybe not 100% but enough that making a ramdisk for it wouldn’t make sense.

if your vacuum is slow, don’t you think you might be running to big blocksizes and run into an actual caching issue, it’s 500mb shouldn’t be anything any modern computer couldn’t handle with ease, so long as it’s configured to use regular size blocks on drives.

in this example a workload related to something with databases, i forget what… he runs into a data amplification of x40 because of his blocksize and when caching isn’t working correctly his workload ends up being basically tons of extra disk reads…

example from the link below

The sum of read bandwidth column is (almost) exactly the physical size of the file on the disk ( du output) for the dataset with primarycache=all : 2.44GB.
For the other dataset, with primarycache=metadata , the sum of the read bandwidth column is …wait for it… 77.95GB.
A FreeBSD user has posted an interesting explanation to this puzzling behavior in FreeBSD forums:

clamscan reads a file, gets 4k (pagesize?) of data and processes it, then it reads the next 4k, etc.

ZFS, however, cannot read just 4k. It reads 128k (recordsize) by default. Since there is no cache (you’ve turned it off) the rest of the data is thrown away.

128k / 4k = 32
32 x 2.44GB = 78.08GB