If I understood the concept correctly, hashstore uses container files to store the customer data, and those files can be up to 1 GB in size. Is there any mechanism to avoid fragmentation when those files grow and shrink? Wouldn’t it be better to create the 1 GB files upfront and fill them later?
Hashstore files are tracked so it is known how much data has been deleted from them, and once that percentage is high enough they get compacted/rewritten. So I believe fragmentation is managed through those compaction events, which serve as a chance to rewrite large slabs of active data sequentially.
These files are written sequentially and never reused for new data. On compaction, a new container file is created, the live data is copied over from the old one, and then the old one is removed. Fragmentation should not be a problem on any modern file system/operating system.
But you have to leave some free space on the drive so that the fs/os has room to work with those 1 GB containers and allocate contiguous space for each. I wonder how each fs manages a 90% full drive under hashstore, and whether ext4 is still the king regarding fragmentation?
The average segment size is under 4 KB, and the default allocation block size on ext4 is 4 KB, so most pieces fit in a single block and literally cannot be fragmented. On ZFS the situation is even better.
Stop worrying about fragmentation. It does not matter.