i think at least two things are happening here, or might be…
the space which was considered used might now be considered empty…
and
storj's files seem to grow in size over time… and thus might be rewritten.
zfs is designed for everything to happen on the fly, so in most cases one shouldn't need to take the pool down, so long as it isn't in a catastrophic state…
it's also possible that zfs might rewrite the files if it is working with them anyways…
zfs does all kinds of stuff that is very difficult to explain.
another thing is that storj data gets deleted, so in many cases these files will also be mostly empty space which has been freed up, and when new files are written with compression on they will take much less space…
not because the file data itself compresses well, but because the empty space in a file isn't written to the disk, since runs of zeros can be compressed away almost entirely…
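just to illustrate the general point, here is a quick Python sketch (plain zlib, nothing ZFS-specific) showing that a region that is all zeros costs almost nothing once any compressor has seen it:

```python
import zlib

# pretend this is a 1 MiB stretch of a blob file that is nothing but freed/empty space
zero_region = bytes(1024 * 1024)

compressed = zlib.compress(zero_region)

print(len(zero_region))   # 1048576
print(len(compressed))    # a tiny fraction of that - runs of zeros compress away
```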
i usually run ZLE for storj data… but there doesn't seem to be a major difference so long as one doesn't go crazy and use something like gzip-9 or whatever the parameter is.
ZLE has the lowest footprint on the cpu, while LZ4 and ZSTD-1 have the lowest memory footprint.
i think the default ZSTD level in ZFS today is 3, but i don't have a ton of cpu to work with so i run ZSTD-1 because the numbers make sense: it's about the same workload as LZ4 and about the same performance, aside from in some cases up to 5x better compression.
ZLE is Zero Length Encoding… basically, instead of writing out a long run of zero bytes one by one, it just stores a short token saying how many zeros there were…
to put it plainly.
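if it helps, here is a toy sketch of that idea in Python - my own illustration of run-length encoding zeros, not the actual ZFS ZLE implementation:

```python
def toy_zle(data: bytes):
    # toy zero-length encoding: runs of zero bytes become a (0, count) token,
    # everything else is passed through as a (1, literal_bytes) token
    out = []
    i = 0
    while i < len(data):
        if data[i] == 0:
            start = i
            while i < len(data) and data[i] == 0:
                i += 1
            out.append((0, i - start))          # "this many zeros go here"
        else:
            start = i
            while i < len(data) and data[i] != 0:
                i += 1
            out.append((1, data[start:i]))      # literal non-zero bytes
    return out

print(toy_zle(bytes(64)))            # [(0, 64)] - one tiny token instead of 64 zero bytes
print(toy_zle(b"abc" + bytes(10)))   # [(1, b'abc'), (0, 10)]
```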
one can really get into the weeds on this stuff… LZ4 also has a very interesting way of operating, which is why it can compress so fast but isn't super good at it… still, it's a great trade-off between work and effective compression, and it only recently got displaced by ZSTD as the better scheme.
there are quickly diminishing returns on compression.
but apparently it also works the other way: one can save so much initially that not using compression at all is a mistake.
It might have to do with the way transfersh works. It uploads data with an expiration date. So you might get 210 GB new data and at the same time 200 GB from a few weeks ago expires and gets deleted. Expired data is not moved into trash.
Hi, guys. I've been googling around about the same topic and noticed that the pool's default recordsize is 128KB. Many storj blob files are smaller than that, so I've changed it to 4KB and am currently observing the results.
ZFS wonāt actually take up the entire recordsize for something smaller than a single record, but it will take up a multiple of record size for anything larger than a single record. i.e. with 128k record size, a 10k file will take up 10k, but a 130k file will take up 256k. So itās correct to say that a smaller record size can help. However, you can also lift the āround up to next recordsize multipleā limitation by enabling any form of compression.
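For anyone who wants to play with those numbers, here is a rough back-of-the-envelope model of that rounding in Python. It is my own simplification (it ignores metadata, parity/copies overhead and ashift details, and as noted above, compression removes the round-up for larger files):

```python
import math

def allocated_bytes(file_size, recordsize=128 * 1024, sector=4 * 1024):
    # rough model: files up to one record get a single block sized to the file,
    # rounded up to the sector size; larger files use whole records
    if file_size <= recordsize:
        return math.ceil(file_size / sector) * sector
    return math.ceil(file_size / recordsize) * recordsize

print(allocated_bytes(10 * 1024))    # 12288   -> a 10k file stays close to 10k
print(allocated_bytes(130 * 1024))   # 262144  -> a 130k file rounds up to 256k
```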
@mattventura - that was a comprehensive explanation.
Interesting. I've had compression enabled from the very beginning (LZ4 is the default option in TrueNAS Scale 24.10) - at the moment the STORJ node shows 670GB of filled space, but the space actually taken is 777GB. So the difference is now 107GB.
Yesterday, before I decreased the recordsize from 128KB to 4KB, the difference was 110GB.
Until then the difference had been increasing, but I have no evidence to prove it.
I'm a beginner with TrueNAS, these are just my observations. If the difference in occupied space is not due to ZFS - what else could it be? What amount of difference is considered OK?
regarding the initial topic, I also enable compression on my zfs disks. the compression ratio on a storj drive is only 1.08, so the percentage isn't too exciting, but it's nice to know space isn't being wasted.
@foegra there are a variety of ways that the storj dashboard can show incorrect usage, both by design and due to bugs. a go-to option is to restart the node and give it hours or even days (if multi-terabyte) to finish running a used-space filewalker to account for all the pieces and update the dashboard.
Will see how it goes.
In any case - 1.08 is better than 1
Is there an option to force the node to go through the files and check their content, integrity, etc…?
No. But a restart will recalculate the usage and update the databases.
The satellites are auditing your node, so if something is corrupted or missing, your audit score will go down.
Does it remove unnecessary files as well?
After the restart my node size went down from 630GB to 580GB, but the actual used space on the hard drive did not become smaller.
In my case the problem probably started when I moved the existing STORJ node, while it was running, from one disk to another by replacing disks from the TrueNAS Scale menu. I recreated the node from scratch; otherwise, every time I restarted the node, the node size went back to 580GB.
All right, I see.
Then I should not have recreated the whole node from scratch. I confirm - I rebooted the node and the used space went down to the value it was at before the previous reboot.
Question - is this only a UI issue, or does it affect the payment calculation as well?
it's not lost; the usage stat becomes broken, but it doesn't affect the actual usage.
It's UI only, but if the node thinks it is full according to this stat, it will stop receiving any new data, even if it actually has free space.