Are you sure this is the right file system? It seems to have only 155k files, whereas we’d expect 14 TB worth of blobs to have tens of millions of files. Besides, it reports only 21 GB of data written. Given the “Last mounted on” field, I suspect this is a system partition, not a data partition.
I believe the mount command gave me the DSM partition, md0. I will try md1.
I managed to install dumpe2fs from the SynoCommunity packages, if anyone wonders. There is a DiskCLI disk tools package.
If I enter a wrong or non-existent device name, is there any problem? Can it crash something?
I can’t find it. I ran ls on the /dev directory and tried everything that resembled a partition: mdX, sgX, sata1pX… I get the same output from md0, sata1p1 and sata2p1. The others gave me errors. I don’t see any sda, sdb, etc. Maybe Synology hides the data partitions or something.
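Instead of guessing device names, one option is to ask the kernel which device backs the storage directory. The following is only a minimal sketch, assuming the NAS exposes a Linux-style /proc/mounts and has Python available; the helper names `parse_mounts` and `device_for_path` and the path /volume1/storj are hypothetical and used purely for illustration.

```python
import os


def parse_mounts(text):
    """Parse /proc/mounts-style text into (device, mount_point, fs_type) tuples.

    Note: the kernel escapes spaces in mount points as \\040; this sketch
    does not decode such escapes.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[0], fields[1], fields[2]))
    return entries


def device_for_path(path, mounts):
    """Return the mount entry whose mount point is the longest prefix of path."""
    path = os.path.normpath(path)
    best = None
    for device, mount_point, fs_type in mounts:
        prefix = mount_point if mount_point.endswith("/") else mount_point + "/"
        if path == mount_point or path.startswith(prefix):
            # ">=" lets a later entry win when two mounts share a mount point.
            if best is None or len(mount_point) >= len(best[1]):
                best = (device, mount_point, fs_type)
    return best


# Usage on the NAS itself (hypothetical storage path):
# with open("/proc/mounts") as f:
#     print(device_for_path("/volume1/storj", parse_mounts(f.read())))
```

The device name it returns should be the one to pass to dumpe2fs, rather than the system partition.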
(300 bytes covers an inode and a direntry). If the box has 18 GB of RAM, then I’d assume Synology itself also needs some of that RAM, so it’s crossing the limit.
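To make that arithmetic concrete, here is a rough back-of-the-envelope sketch. It uses the 300 bytes per file quoted above; the file counts are hypothetical examples, not measurements from this node, and the function name `metadata_cache_gib` is made up for illustration.

```python
BYTES_PER_FILE = 300  # figure quoted in the thread: one inode plus one direntry


def metadata_cache_gib(file_count, bytes_per_file=BYTES_PER_FILE):
    """Estimate the RAM (in GiB) needed to keep metadata for all files cached."""
    return file_count * bytes_per_file / 2**30


# Hypothetical file counts, in millions, just to show how the estimate scales.
for millions in (10, 30, 60):
    gib = metadata_cache_gib(millions * 1_000_000)
    print(f"{millions} million files -> about {gib:.1f} GiB of metadata cache")
```

Whatever the real file count is, the estimate has to fit into the RAM that is left after the operating system and other services take their share.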
I don’t see any other problematic settings there except for the lack of the dir_index feature, but after googling a bit this seems to be a Synology-specific thing. Weird, but it should not affect the used-space file walker.
It is still weird that it was this slow when the box held only 9.5 TB.
I believe I was using those two parameters, the ones that cache the incoming pieces, at 4 MiB. I saw an increase in buffers, which took about half of the RAM. Now I have put them back to default: one is 128 KiB and the other is still 4 MiB; buffers now occupy about 10% of RAM. Those tests were done a year ago.
Why doesn’t FileWalker just read the file system and inodes?
Every file (piece) has an entry there, right? If the entry is missing, then the file is lost. If the entry is there but the file is missing, corrupted, replaced by an empty file or anything else, then it will fail audits anyway.
So if we already have the audits in place, why do we need another service to check the same thing?
It’s not for audit purposes, I suppose. It’s to report the correct used space.
Please note, the satellites operate with segments, not nodes, so they need confirmation that there is free space on the network to upload a piece of each segment (but that is a separate process in node selection).
Perhaps, if it were more monolithic and had information about the whole picture, it could be accounted for… but at the expense of response time…
And this is where we want to have improvements.
Right now it’s implemented this way to offload the work of calculating the available space to the nodes. As a result, node selection is fast and simple, so the customer is not forced to wait for reports from all 110 requested nodes during the upload of each segment.
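As a rough illustration of that design (not the actual satellite code), selection can work purely from free-space figures the nodes reported earlier, without contacting any node at upload time. Everything in this sketch is an assumption made for illustration: the `NodeReport` type, the `select_nodes` function and the uniform random choice are hypothetical; only the count of 110 comes from the description above.

```python
import random
from dataclasses import dataclass


@dataclass
class NodeReport:
    node_id: str
    free_bytes: int  # self-reported by the node earlier; may be stale


def select_nodes(reports, piece_size, count=110, rng=random):
    """Pick `count` nodes whose last self-reported free space fits one piece.

    No node is contacted here: the choice relies only on cached reports,
    which keeps selection fast but trusts figures that may be out of date.
    """
    candidates = [r for r in reports if r.free_bytes >= piece_size]
    if len(candidates) < count:
        raise ValueError("not enough nodes report sufficient free space")
    return rng.sample(candidates, count)
```

The trade-off is visible in the sketch: a node whose report is stale can still be selected even if it is full or overloaded by the time the piece arrives.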
Of course there is a trade-off if the node is overloaded or doesn’t actually have free space… right now that’s accepted… But I would expect improvements in this regard.