Thanks for the explanation, that clarifies things a little, but I’m still puzzled.
What I have understood so far (correct me if I’m wrong):
- The satellites send deletion requests and if the node is online, it likely receives them and deletes the pieces directly (so they don’t show up in trash) and adds a line to the log.
- If the node is offline, or doesn’t receive the message for some other reason, it doesn’t delete them at that time. It’s also possible that during this time some pieces will be repaired and sent to other nodes, but this node can’t know about it because it’s offline, so it won’t delete its own copies yet.
- The satellites periodically send bloom filters that show if a node should (probably) have a piece or not. So, sometime in the future, some of the pieces that weren’t deleted when the node was offline will be picked up and moved to trash, without the node logging anything (with default config). They will finally be deleted when they expire.
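To make the bloom filter step a bit more concrete, here is a minimal sketch of how I picture it. This is not the actual storagenode code: the `BloomFilter` class, the `collect_garbage` function, the filter size, and the piece IDs are all made up for illustration. The point is only that a bloom filter never gives a false "no" for something that was added, so pieces the satellite still wants are never trashed, while a deleted piece can occasionally slip through as a false positive and survive until a later filter.

```python
import hashlib


class BloomFilter:
    """Toy bloom filter (hypothetical, not the real storagenode implementation)."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, piece_id):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{piece_id}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, piece_id):
        for pos in self._positions(piece_id):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, piece_id):
        # True means "probably present"; False means "definitely not present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(piece_id)
        )


def collect_garbage(stored_pieces, keep_filter):
    """Return the pieces the node would move to trash."""
    return [p for p in stored_pieces if not keep_filter.might_contain(p)]


# Satellite side: build a filter from the pieces this node should still hold.
still_needed = ["piece-a", "piece-b", "piece-c"]
keep_filter = BloomFilter()
for piece in still_needed:
    keep_filter.add(piece)

# Node side: it also holds two pieces whose deletion it missed while offline.
stored = still_needed + ["piece-x", "piece-y"]
trash = collect_garbage(stored, keep_filter)
print("moved to trash:", trash)
```

With a filter this large relative to three entries, the two missed pieces will almost certainly end up in trash, but the only hard guarantee is that the three needed pieces stay put.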
That makes me think that trash is caused by deletions (either by the customer, or by the satellite “moving pieces away” from a node that is down when repairing). It’s just deletions that were not caught in time and were collected (much) later. It is also unlikely for the pieces in trash to be restored.
In my particular case, the node was down for ~1 day in late February (power cut) and then ~10 hours around a week ago, so having a lot of trash (instead of deletions) makes sense. My other (older) node didn’t have much downtime and so has very little trash, but should have a lot of (direct) deletions. To verify, I wrote a small script that goes through the log and calculates deletions and other stats.
Edit: link to script source (github)
The output:
## Deletes (weekly)
Date UTC | OK count + Unseen | % + % | Size GB
-----------|-------------------|--------|--------
2021-01-21 | 24636 + 11852 | 67 +32 | 6.20
2021-01-28 | 27277 + 10977 | 71 +28 | 8.75
2021-02-04 | 23415 + 8153 | 74 +25 | 7.31
2021-02-11 | 30185 + 418 | 98 + 1 | 6.72
2021-02-18 | 85063 + 2617 | 97 + 2 | 36.43
2021-02-25 | 84841 + 2124 | 97 + 2 | 33.39
2021-03-04 | 134297 + 2143 | 98 + 1 | 32.82
2021-03-11 | 186182 + 3138 | 98 + 1 | 44.56
2021-03-18 | 107672 + 1860 | 98 + 1 | 27.62
The last column shows a significant rise in deletes over the last 4-5 weeks.
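As a quick sanity check on the table, here is a small sketch that recomputes the percentage columns and compares the weekly sizes. It is not the linked script, just arithmetic on the numbers above. I'm assuming the percentages are truncated rather than rounded, since that reproduces every row, and the split into "first four weeks" and "last five weeks" is my own choice based on where the jump happens.

```python
# Rows copied from the table: (week start, OK count, unseen count, size in GB)
rows = [
    ("2021-01-21", 24636, 11852, 6.20),
    ("2021-01-28", 27277, 10977, 8.75),
    ("2021-02-04", 23415, 8153, 7.31),
    ("2021-02-11", 30185, 418, 6.72),
    ("2021-02-18", 85063, 2617, 36.43),
    ("2021-02-25", 84841, 2124, 33.39),
    ("2021-03-04", 134297, 2143, 32.82),
    ("2021-03-11", 186182, 3138, 44.56),
    ("2021-03-18", 107672, 1860, 27.62),
]

percentages = []
for date, ok, unseen, size_gb in rows:
    total = ok + unseen
    # Integer division truncates, which matches the table's % columns.
    ok_pct = 100 * ok // total
    unseen_pct = 100 * unseen // total
    percentages.append((ok_pct, unseen_pct))
    print(f"{date} | {ok_pct:3d} + {unseen_pct:2d} | {size_gb:6.2f}")

# Average weekly delete size before and after the jump on 2021-02-18.
early = [size for _, _, _, size in rows[:4]]
late = [size for _, _, _, size in rows[4:]]
early_avg = sum(early) / len(early)
late_avg = sum(late) / len(late)
print(f"early avg: {early_avg:.2f} GB, late avg: {late_avg:.2f} GB")
```

The average goes from roughly 7.2 GB per week in the first four weeks to roughly 35 GB per week in the last five, so close to a fivefold increase.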
What are your thoughts?
Does anyone have delete stats to compare? (I’ve seen some SNOs posting disk graphs in the BW comparison thread.)