Disk usage discrepancy?

Alexey · December 22, 2023, 5:06am

There are at least three filewalkers:

gc-filewalker
lazyfilewalker
retain

There are also scan chores (technically not filewalkers, but they do scans too):

collector
piece:trash

Each of them performs only one task, but the results are used by the next one: gc-filewalker collects the garbage (using the bloom filter received from the satellite), retain moves the garbage to the trash, collector removes the expired data, lazyfilewalker caches and sends information about used space to the satellites, and piece:trash deletes data older than 7 days.
You need to check that each of them has started for each trusted satellite and then completed successfully, without errors.
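
If you want to automate that check, here is a minimal sketch in Python. It is not an official tool. It assumes that each relevant log line contains one of the process names listed above, a status word such as "started", "completed", "finished", "failed" or "error", and a satellite ID field. The status words and the satellite ID pattern are my assumptions, so adjust them to what your node's log actually prints.

```python
import re
from collections import Counter

# Process names from the post above. Matching them as plain substrings
# of a log line is an assumption about the log format.
PROCESSES = ("gc-filewalker", "lazyfilewalker", "retain", "collector", "piece:trash")

# Hypothetical status keywords. Failures are checked first so that a line
# mentioning a failure is never counted as a success.
STATUSES = (
    ("failed", ("failed", "error")),
    ("completed", ("completed", "finished")),
    ("started", ("started",)),
)

# Assumed shape of the satellite ID field, e.g. "satelliteID": "abc123".
SATELLITE_RE = re.compile(r'satellite ?id"?\s*[:=]\s*"?([A-Za-z0-9]+)', re.IGNORECASE)


def classify(line):
    """Return (process, satellite, status) for a log line, or None if it is unrelated."""
    lower = line.lower()
    # The first matching name wins, in the order given in PROCESSES.
    process = next((p for p in PROCESSES if p in lower), None)
    if process is None:
        return None
    status = next((s for s, words in STATUSES if any(w in lower for w in words)), None)
    if status is None:
        return None
    match = SATELLITE_RE.search(line)
    satellite = match.group(1) if match else "unknown"
    return process, satellite, status


def summarize(lines):
    """Count log lines per (process, satellite, status)."""
    counts = Counter()
    for line in lines:
        key = classify(line)
        if key is not None:
            counts[key] += 1
    return counts


def incomplete(counts):
    """List (process, satellite) pairs that failed or started more often than they completed."""
    pairs = {(p, s) for (p, s, _) in counts}
    return sorted(
        (p, s)
        for (p, s) in pairs
        if counts[(p, s, "failed")] > 0
        or counts[(p, s, "started")] > counts[(p, s, "completed")]
    )
```

Pass `summarize` any iterable of log lines (for example an open log file), then pass the result to `incomplete`. Any pair it returns is worth checking by hand in the log. An empty list only means that nothing matched the assumed keywords, not that every scan is proven to have finished.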

Each is configured differently. gc-filewalker runs at least once a week (it depends on the satellite: each sends a bloom filter at its own cadence), lazyfilewalker runs on each start and then renews the cache every hour by default, and the collector runs every hour by default; see

storagenode setup --help | grep interval

      --collector.interval duration                              how frequently expired pieces are collected (default 1h0m0s)
      --storage2.cache-sync-interval duration                    how often the space used cache is synced to persistent storage (default 1h0m0s)

Some depend on the satellite (gc-filewalker), some depend on the last run (retain runs weekly), and some are hardcoded (piece:trash runs every 24h).

They should finish before the next restart. A restart can happen when your node gets an update (roughly every 2 weeks) or when it crashes because of a FATAL error.