Large trash folder

Hi everyone,

My trash folder seems quite large, even though deletes appear to go through normally, so GC shouldn’t have much to clean up.

du -sh of the trash folder:
659G trash/
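
For anyone who wants to cross-check a figure like this without a shell, here is a minimal Python sketch. It sums apparent file sizes, so it can differ a little from disk-usage tools that count allocated blocks. The names `dir_size_bytes` and `human` are made up for this example, and the `trash` path is simply wherever your node keeps its trash folder.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (apparent size, not disk blocks)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so a link is not counted as the file it points to.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def human(n):
    """Format a byte count with binary units (1K = 1024 bytes), rounded to whole units."""
    for unit in ("B", "K", "M", "G", "T"):
        if n < 1024 or unit == "T":
            return f"{n:.0f}{unit}"
        n /= 1024


if __name__ == "__main__":
    # Hypothetical relative path; point this at your own trash folder.
    print(human(dir_size_bytes("trash")), "trash/")
```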

Delete success rate:
========== DELETE =============
Failed: 0
Fail Rate: 0.000%
Successful: 116033
Success Rate: 100.000%

Does this seem normal?

This behavior is normal, although the amount of data hitting the trash folder recently is not. That was due to the stefan-benton satellite cleaning up a bunch of old pieces. Under normal circumstances I have found between 50 and 100 GB in the trash at any time (from a 3.3 TB share size), even with a near-perfect delete success rate.

Whether pieces go into the trash via garbage collection isn’t directly tied to delete successes. In fact, I don’t think that under normal circumstances a failed delete would actually result in a piece going to the trash, as those failures are probably due to the piece not being found.

This happened because a lot of zombie segments were cleaned up on the testing satellites, which led to this data being cleaned up by garbage collection instead of normal deletes. Some nodes are seeing up to 2TB in their trash folder right now, but this data will be cleaned up automatically. It’s normal in the sense that nothing is wrong with your node. But it is rather exceptional as this is unlikely to happen to this extent again.

It depends on the failure. Timeouts on delete operations could end up with normal deletes going to GC and trash. Additionally, if your node is offline during deletes, they get caught in GC and go to trash as well.

What I should have said was, deletes that are logged as failed wouldn’t necessarily end up in the GC/trash cycle. Correct me if I am wrong, but you wouldn’t see a delete failed in the logs if your node was offline, right?

Correct, but you could see one if there is a timeout. Failing deletes are rare to begin with. But in any case, if a delete is not processed as normal and the data stays behind, it will be caught in GC. So that would include errors we may not know about yet.
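
To illustrate the difference between a normal delete and a piece ending up in trash, here is a toy Python sketch. It is a simplification under stated assumptions, not how the storage node software is actually written: the `Node` class and its methods are hypothetical, and the satellite's view is modelled as a plain set of piece IDs.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy storage node: piece IDs it holds, plus its trash folder."""
    stored: set = field(default_factory=set)
    trash: set = field(default_factory=set)

    def delete(self, piece_id):
        """Normal delete: the piece is removed directly and never touches trash."""
        self.stored.discard(piece_id)

    def garbage_collect(self, satellite_known):
        """Move every piece the satellite no longer knows about into trash."""
        leftovers = self.stored - set(satellite_known)
        self.stored -= leftovers
        self.trash |= leftovers
        return leftovers


node = Node(stored={"p1", "p2", "p3"})
node.delete("p1")                      # delete delivered while the node was online
# The delete for "p2" never arrived (node offline or request timed out),
# so the satellite now only knows about "p3".
moved = node.garbage_collect({"p3"})   # "p2" is moved to trash
```

In this model a 100% delete success rate says nothing about trash size: only pieces that missed their delete, or that the satellite dropped by other means, are moved by `garbage_collect`.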

What are these “zombie segments”? A quick search of the forums shows them mentioned quite a lot, but without much on what they actually are beyond “segments that are no longer accessible”.
What does this actually mean?
How do they come about?
Enquiring minds want to know :wink:

I’m afraid the answer is kind of similar to how trash happens. Zombie segments are segments of data that stay behind on nodes and satellites that don’t belong to an actual file anymore. This can happen when an upload fails after successfully uploading several segments, but not all of them. But there are probably other scenarios that could cause this as well. The satellite runs a zombie reaper to find and get rid of those zombies and after that nodes clean them up using garbage collection.

This may be an incomplete or slightly incorrect explanation, but I think it covers the basics.
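
To make the failed-upload scenario concrete, here is a toy Python sketch. It is only an illustration under my own assumptions, not the satellite's real zombie reaper: `reap_zombies`, the segment IDs and the file IDs are all hypothetical.

```python
def reap_zombies(segments, committed_files):
    """Split segments into (kept, zombies).

    segments: dict mapping segment ID -> ID of the file it was uploaded for
    committed_files: set of file IDs whose upload completed
    """
    kept, zombies = {}, {}
    for seg_id, file_id in segments.items():
        # A segment whose file never finished uploading belongs to no real file.
        target = kept if file_id in committed_files else zombies
        target[seg_id] = file_id
    return kept, zombies


# fileA uploaded fully; the fileB upload failed after its first segment.
segments = {"s1": "fileA", "s2": "fileA", "s3": "fileB"}
kept, zombies = reap_zombies(segments, committed_files={"fileA"})
# zombies now holds "s3"; once the satellite forgets it, nodes would
# pick up the matching pieces through garbage collection.
```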

Good to know I’m not the only one seeing this.

I assume that SNOs still get paid for these “zombie” segments, since storage space was used?

I think so, but I’m not entirely sure. The satellite is still aware of them, otherwise they would have been caught in GC ages ago. So that probably means you’re paid for them as well, though they’ll obviously never be downloaded.

Still visible: [image]