i think at least two things are happening here, or might be…
the space which was considered used might be considered empty…
and
storj's files seem to grow in size over time… and thus might be rewritten.
zfs is designed for everything to happen on the fly, so in most cases one shouldn't need to take the pool down, so long as it isn't in a catastrophic state…
it's also possible that zfs might rewrite the files if it is working with them anyway…
zfs does all kinds of stuff that is very difficult to explain.
another thing is that storj data gets deleted, so in many cases these files will also contain mostly empty space that has been freed up, and when new files are written with compression on they will take much less space…
not because the file contents compress well, but because the empty space in a file isn't written to the disk, since that can be compressed away…
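a quick way to see the effect (a sketch using Python's zlib as a stand-in, since ZFS's own compressors aren't in the standard library — the exact ratio will differ, but the point is the same):

```python
import zlib

# 1 MiB of zeros, standing in for the freed-up empty space inside a file
zeros = bytes(1024 * 1024)

compressed = zlib.compress(zeros, level=1)
print(len(zeros), "->", len(compressed))  # the zeros shrink to a tiny fraction
```

even at the fastest level, a megabyte of zeros collapses to a few kilobytes — which is why "mostly empty" files cost almost nothing on disk once compression is on.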
i usually run ZLE for storj data… but there doesn't seem to be a major difference so long as one doesn't go crazy and use something like gzip-9 or whatever the parameter is.
ZLE will have the lowest CPU footprint, but LZ4 and ZSTD-1 will have the lowest memory footprint.
i think the default ZSTD level in ZFS today is ZSTD-3, but i don't have a ton of cpu to work with, so i run ZSTD-1 because the numbers make sense: it's about the same workload as LZ4 and about the same performance, aside from up to 5x better compression in some cases.
ZLE is Zero Length Encoding… basically, instead of writing out a long run like 00000000000000000000000000000000000000000000000000000000000000, it just records how many zeros there were (e.g. a token meaning "64 zeros")…
to put it plainly.
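a toy sketch of the idea in Python (not ZFS's actual on-disk format — just the principle): runs of zero bytes become a count, everything else passes through as a literal.

```python
def zle_encode(data: bytes) -> list:
    """Toy zero-length encoding: a run of zero bytes becomes ('Z', count),
    anything else is kept as a literal bytes chunk."""
    out, i = [], 0
    while i < len(data):
        j = i
        if data[i] == 0:
            while j < len(data) and data[j] == 0:
                j += 1
            out.append(('Z', j - i))   # e.g. 64 zeros -> ('Z', 64)
        else:
            while j < len(data) and data[j] != 0:
                j += 1
            out.append(data[i:j])      # literal bytes, stored as-is
        i = j
    return out

def zle_decode(encoded) -> bytes:
    out = bytearray()
    for chunk in encoded:
        if isinstance(chunk, tuple):
            out += bytes(chunk[1])     # expand ('Z', n) back into n zero bytes
        else:
            out += chunk
    return bytes(out)

data = b"header" + bytes(64) + b"footer"
encoded = zle_encode(data)
print(encoded)  # [b'header', ('Z', 64), b'footer']
assert zle_decode(encoded) == data
```

64 zero bytes get stored as a single tiny token, and the round trip gives back the original data — which is all ZLE needs to do, and why it's nearly free on the cpu.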
one can really get into the weeds on this stuff… LZ4 also has a very interesting way of operating, which is why it can compress so fast but isn't super good at it… still, it's a great trade-off between work and effective compression, and it only recently got displaced by the better ZSTD scheme.
there are quickly diminishing returns on compression.
but apparently it also works the other way: one can save so much initially that not using compression at all is a mistake.
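the diminishing returns are easy to demonstrate (again with Python's zlib standing in for LZ4/ZSTD, and made-up log-like data — the exact numbers differ per compressor, but the shape of the curve is the same):

```python
import zlib

# repetitive data, roughly like logs or file metadata
data = b"piece-id=abc123 size=2319872 status=ok\n" * 5000

sizes = {}
for level in (1, 3, 6, 9):
    sizes[level] = len(zlib.compress(data, level=level))
    print(f"level {level}: {sizes[level]} bytes")
```

the jump from "no compression" to level 1 is enormous, while cranking the level higher buys comparatively little — at a much higher cpu cost.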
It might have to do with the way transfersh works. It uploads data with an expiration date. So you might get 210 GB new data and at the same time 200 GB from a few weeks ago expires and gets deleted. Expired data is not moved into trash.