Debugging space usage discrepancies

The most interesting fields (IMHO) are num deleted (the number of deleted pieces) and Created Before.

The bloom filter can decide whether a piece should be deleted or not (it contains the essence of the satellite's knowledge :wink: ).

But this knowledge is always out of date. What happens if the bloom filter is generated in the morning but only received by the storagenode that evening? All new files (created since the morning) look unknown, because the satellite could not have known about them when it generated the filter.

Should we delete them? Certainly not.

For this reason, the bloom filter includes a validity date (the date of the newest segment at the time the bloom filter was created). The storagenode should check only files older than that date and keep all newer files.

But the clocks of storagenodes are not very precise either. Even though NTP is recommended, some nodes are way behind the current time, which makes the date-based exclusion less reliable.

For this reason, an additional 3 days (72 hours) are subtracted from the bloom filter's date.

Logged Current Date = newest segment during bloom filter creation - 72 hours
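The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the formula, not the storagenode's actual code; the names `SAFETY_MARGIN` and `logged_cutoff` are made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Safety margin from the post: 72 hours, to tolerate skewed node clocks.
SAFETY_MARGIN = timedelta(hours=72)

def logged_cutoff(newest_segment_time: datetime) -> datetime:
    """Return the date before which pieces are eligible for GC checks.

    newest_segment_time is the creation time of the newest segment
    known when the bloom filter was generated.
    """
    return newest_segment_time - SAFETY_MARGIN

# Example: the newest segment in the filter is from 10 May, 08:00 UTC.
newest = datetime(2024, 5, 10, 8, 0, tzinfo=timezone.utc)
print(logged_cutoff(newest).isoformat())  # 2024-05-07T08:00:00+00:00
```

So a piece created on 8 May would be left alone by this run, even if the filter does not contain it.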

The storagenode will delete only segments older than that. The bloom filter is generated from a backup database (to avoid any additional load on the production database). For this reason, even after a successful GC file walker run, you may still have 3-7 day old segments on the local disks…
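To make the two conditions concrete, here is a rough sketch of the per-piece decision as I understand it from the description above. It is not the real storagenode implementation: `should_delete` is a hypothetical helper, and a plain Python set stands in for the bloom filter (a real bloom filter can also return false positives, which only means some garbage is kept a bit longer).

```python
from datetime import datetime, timezone

def should_delete(piece_id, created, cutoff, filter_contains) -> bool:
    """Decide whether a piece is garbage under the rules described above.

    filter_contains is any callable that answers "might the satellite
    still know this piece?" (hypothetical stand-in for the bloom filter).
    """
    # Pieces at or after the cutoff are too new: the filter cannot
    # know about them, so they are always kept.
    if created >= cutoff:
        return False
    # Older pieces are deleted only if the filter does not contain them.
    return not filter_contains(piece_id)

# Stand-in for the bloom filter: the set of pieces the satellite knows.
known = {"piece-a", "piece-b"}
cutoff = datetime(2024, 5, 7, 8, 0, tzinfo=timezone.utc)
old = datetime(2024, 5, 1, tzinfo=timezone.utc)
new = datetime(2024, 5, 9, tzinfo=timezone.utc)

print(should_delete("piece-a", old, cutoff, known.__contains__))  # False
print(should_delete("piece-c", old, cutoff, known.__contains__))  # True
print(should_delete("piece-d", new, cutoff, known.__contains__))  # False
```

The last case is the one that explains the leftover data: `piece-d` may already be deleted on the satellite, but it is newer than the cutoff, so it stays on disk until a later bloom filter covers it.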