Some ideas for improvements for trash collection

Toyoo · January 13, 2024, 5:26pm

I see that large GC runs are hitting my nodes, and roughly 5% of pieces stored went to trash. Trash file walker is run (by default) every day, and so for the next week my nodes will have to walk through millions of pieces. My setup is good enough for file walkers to not cause any issues, but knowing that not all nodes are like that, I thought some improvements might be of use.

There is a direct interaction between GC and trash: if I understand correctly, pieces land in the trash directory only because of GC, and they wait 7 days there. So maybe there’s no point in running trash file walker that often, and run it not on a daily basis, but directly before a GC run. We would still keep the one run after a node restart in case node missed a bloom filter.

Alternatively, given a single trash file walker run scans inodes of all trashed pieces, it could collect the age of oldest piece still waiting for removal, and skip a day if it’s known there are no pieces to collect. This age wouldn’t even have to be put in permanent storage due to the initial scan after node start.

What do you think?

snorkel · January 13, 2024, 6:04pm

I don’t know if it’s related, but seeing the discussions about used space discrepancies and how they’re produced by the limitations of bloom filters, isn’t there a way to impruve the deletions, along with trash service?
I don’t realy know or I forgot how the “move to trash” happens, but I recall that there are cases like:

client deletes the file on the spot;
client sets a time for the file to be deleted;
client’s account gets closed for whatever reason and files get deleted.
there are too many pieces of a file and the extra pieces get deleted.
I can’t think of other cases. When I say deleted, I mean marked for deletion, sent to trash and be removed after 7 days. All these are dependent on satellites and their databases.
And the discrepancie emergies from the fact that satellites record the pieces as sent to trash, but the nodes don’t get the command, don’t send the pieces to trash and they see them in stored data.

Isn’t a way for clients to send the delete command directly to nodes, like the uploads and downloads, instead of relaying on satellite’s database? At least for some part of the pieces like spot deletes and timer deletes?
Also, the trash service/GC or how’s it called the elf responsable, could querry directly the satellite’s db once a day and ask for what pieces should be sent to trash?

mgonzalezm · January 13, 2024, 6:20pm

expired pieces are deleted directly, those don’t go to the trash directory.

That was the way that were used to delete files before, but it caused problems (was too slow) when the client delete big buckets with lots of files, so they changed it.

Toyoo · January 13, 2024, 6:24pm

Currently GC collects it once a week. It used to be that customer would send a deletion request to the satellite, which would then identify nodes holding pieces of a file and send them deletion requests immediately, but this mechanism was turned off. I don’t know why, I am guessing of the number of requests that had to be issued was too high.

Removed by a separate storage node chore. Does not require a file walker, pieces to expire are stored in an SQLite database, so it’s enough to scan this database. Pieces are removed immediately, they are not moved to trash. This is fast. In case the node operator lost/removed the database, then GC will pick up those pieces.

Right now it would be GC. I don’t know how was this case handled in the past.

I’m assuming a scenario in which the client managed to upload too many pieces: GC. Neither the satellite nor the client knows about those pieces, so GC is the only way here.

I don’t know if this case can happen in other scenarios.

No. The client does not know which nodes store pieces, it would have to query the satellite.

So the node would send a list of millions of pieces to the satellite? Impractical.

snorkel · January 13, 2024, 6:41pm

When you go offline more than 4 hours, your queried pieces aren’t available, and they replicate them on other nodes, so when you get back online, the network has extrapieces, yours, that need to be removed. I think I got this right .

Toyoo · January 13, 2024, 8:03pm

Right! Yeah, nobody cares about remembering that your node stores these pieces, so they’ll be collected by GC.

Alexey · January 15, 2024, 4:10am

You may try to restart the node exactly before the next GC run (you can know its date from the logs).

Toyoo · January 15, 2024, 9:51am

Why?؜؜؜؜؜؜؜؜؜؜؜؜؜؜؜؜؜؜؜؜؜؜؜؜؜؜؜؜؜؜

Alexey · January 16, 2024, 7:42am

Most of filewalkers runs periodically, starting in needed time you would shift the point, when the next loop would be started, and you likely get the proposed result - the retain process will run right before the garbage collector.

Toyoo · January 16, 2024, 1:06pm

I’m talking about the trash chore, not retain.

Alexey · January 17, 2024, 12:50am

Doesn’t matter, it’s periodic as well, the pieces:trash process running every hour, you may also change it

      --collector.interval duration                              how frequently expired pieces are
collected (default 1h0m0s)

Toyoo · January 17, 2024, 4:54am

I’m taking about the trash chore, not the collector chore.

Alexey · January 17, 2024, 7:20am

It’s trash chore as far as I know

Toyoo · January 17, 2024, 8:56pm

I wish this was documented in some sort of operations manual. Besides, naming seems to be quite inconsistent even in the source code.

What I am talking about is the trash chore, started here. You can see that its running interval is not configurable. This chore periodically scans the trash directory using the file walker procedure, looks for files older than 7 days, and removes them.

What I am not talking about is the retain service, popularly named GC/bloom filter, started here. This service scans the blobs directory using the file walker procedure, looks for files that do not match the bloom filter sent by a satellite, then moves these files to the trash directory.

What I am not talking about is the collector service, also called piece expiration something, started here. This one periodically scans the pieceexpiration.db database looking for pieces which have an expiration timestamp set at upload time, and this timestamp already passed.

These are three different parts of code with three different functions.