@snorkel @jammerdan @Mad_Max
We read the header of each piece, where its metadata is stored, not only the size and modification time from the filesystem’s metadata, so they are two different kinds of metadata. We do use system functions as well, though.
We didn’t implement anything special to bypass the system cache; Windows is simply much worse at caching than Linux. The 4 kB block size is how the system function is implemented, we didn’t invent it, and there is currently no option to change it. On Linux it is likely implemented differently and depends on the filesystem.
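To illustrate the difference between the two kinds of metadata, here is a minimal Go sketch (not the actual storagenode code; the piece path and the 512-byte header size are assumptions for the example):

```go
package main

import (
	"fmt"
	"os"
)

func main() {
	path := "blobs/example.sj1" // hypothetical piece file path

	// Filesystem metadata: size and modification time come from the
	// filesystem itself, no file content has to be read.
	info, err := os.Stat(path)
	if err != nil {
		panic(err)
	}
	fmt.Println("fs size:", info.Size(), "modtime:", info.ModTime())

	// Piece metadata: stored in a header at the start of the piece file,
	// so the file has to be opened and its first bytes actually read.
	f, err := os.Open(path)
	if err != nil {
		panic(err)
	}
	defer f.Close()

	header := make([]byte, 512) // assumed header size, for illustration only
	n, err := f.Read(header)
	if err != nil {
		panic(err)
	}
	fmt.Println("read", n, "header bytes; the piece metadata is decoded from them")
}
```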
Your node will check in on the satellite every hour by default and send its allocated and used space, plus signed orders, to the satellite; the satellite will send back the used space it has accounted according to those signed orders. On some schedule each satellite will also send a bloom filter to the nodes so they can move garbage to the trash.
The node’s task is to run a filewalker and move the garbage to the trash according to this bloom filter. The retain process will then remove the expired data from the trash.
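Conceptually, the garbage collection filewalker does something like the following simplified Go sketch (not the real implementation; the BloomFilter type, the directory layout, and deriving the piece ID from the file name are assumptions):

```go
// Package gcsketch is an illustrative sketch, not the real storagenode code.
package gcsketch

import (
	"os"
	"path/filepath"
)

// BloomFilter is a stand-in for the filter the satellite sends: it answers
// "definitely no longer stored on the satellite" or "possibly still stored".
type BloomFilter interface {
	Contains(pieceID []byte) bool
}

// collectGarbage walks all piece files and moves every piece that is NOT in
// the bloom filter into the trash directory. Pieces that match the filter are
// kept (bloom filters have false positives, so some garbage survives until
// the next filter arrives).
func collectGarbage(blobsDir, trashDir string, filter BloomFilter) error {
	return filepath.Walk(blobsDir, func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err
		}
		pieceID := []byte(filepath.Base(path)) // assumption: piece ID derived from the file name
		if filter.Contains(pieceID) {
			return nil // possibly still needed, keep it
		}
		// Not in the filter: the satellite no longer knows this piece, move it to trash.
		return os.Rename(path, filepath.Join(trashDir, filepath.Base(path)))
	})
}
```

This is why an unfinished filewalker is a problem: every piece file has to be visited at least once, so the whole run is dominated by metadata reads on the disk.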
If any of the filewalkers does not finish its work, the data in the databases will be wrong.
The known reason for a filewalker failing to finish is a slow disk subsystem. Common causes of a slow disk subsystem are:
- fragmentation on NTFS,
- using metadata-hungry filesystems like BTRFS or ZFS on a single drive without an SSD tier/cache or at least a huge RAM cache,
- a slow SMR HDD,
- bad USB controllers/cables,
- using network filesystems,
- using a VM,
- etc.
So, until the slowness of the disk subsystem is fixed, the problem will remain.
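If you want a rough feel for how slow the metadata walk is on your setup, you can time a plain directory walk over the blobs folder, since that is roughly the work the filewalker has to do. A simple sketch, not an official tool, and the path is just a placeholder:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

func main() {
	blobsDir := `C:\storagenode\blobs` // adjust to your storage location

	start := time.Now()
	var files int
	err := filepath.Walk(blobsDir, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		if !info.IsDir() {
			files++ // each file costs at least one metadata read, like the filewalker does
		}
		return nil
	})
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	fmt.Printf("walked %d files in %s\n", files, time.Since(start))
}
```

If this takes many hours on a large node, the real filewalkers will struggle too.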
We used databases for this too, and when the database got corrupted (and our Node Operators are very skilled at corrupting them), the node was disqualified even if the actual data was intact. So I wouldn’t suggest returning to databases, at least not to single-file databases like SQLite.
I think you can do this:
$ storagenode setup --help | grep write-buf
--filestore.write-buffer-size memory.Size in-memory buffer for uploads (default 128.0 KiB)
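If you want to try a larger buffer (just an example value, not a tested recommendation), you could pass something like --filestore.write-buffer-size 1MiB to the run command, or set the matching filestore.write-buffer-size key in your config.yaml and restart the node.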