When will "Uncollected Garbage" be deleted?

It is being deleted continuously, and this affects only the Saltlake satellite.


They are started automatically every hour, but only one is executed at a time: when the previous one completes, the next one is started, and so on. So the actual intervals may vary depending on the duration of the previous deletion job.
If there is nothing to delete, the trash filewalker will simply finish for that satellite, allowing the next one in the queue to start.

$ grep "\strash" /mnt/x/storagenode2/storagenode.log | grep -E "started|finished"
...
2024-09-03T10:35:14Z    INFO    pieces:trash    emptying trash finished {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "elapsed": "16h19m58.7272235s"}
2024-09-03T18:15:13Z    INFO    pieces:trash    emptying trash started  {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-09-03T18:15:15Z    INFO    pieces:trash    emptying trash finished {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "elapsed": "1.4851948s"}
2024-09-03T18:15:15Z    INFO    pieces:trash    emptying trash started  {"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-09-06T22:14:29Z    INFO    pieces:trash    emptying trash finished {"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "elapsed": "75h59m15.9497338s"}
2024-09-06T22:14:29Z    INFO    pieces:trash    emptying trash started  {"Process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-09-06T22:14:29Z    INFO    pieces:trash    emptying trash finished {"Process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "elapsed": "5.3162ms"}
2024-09-06T22:14:29Z    INFO    pieces:trash    emptying trash started  {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-09-06T22:14:29Z    INFO    pieces:trash    emptying trash finished {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "elapsed": "5.3348ms"}
2024-09-06T22:14:29Z    INFO    pieces:trash    emptying trash started  {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-09-06T22:14:29Z    INFO    pieces:trash    emptying trash finished {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "elapsed": "4.5125ms"}
2024-09-06T22:14:29Z    INFO    pieces:trash    emptying trash started  {"Process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-09-06T22:14:29Z    INFO    pieces:trash    emptying trash finished {"Process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "elapsed": "4.6285ms"}
2024-09-06T22:14:29Z    INFO    pieces:trash    emptying trash started  {"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}

Have you checked the usage data on the pie chart and compared it with the data reported by the OS?
If these numbers don’t match (use SI units (base 10) for the OS reports, not binary ones (base 2)), then you need to restart the node to allow the used-space-filewalker to fix the usage on the pie chart.
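A quick way to get the OS numbers in SI units is a sketch like this, assuming a Linux host and the mount point used elsewhere in this thread (adjust the path to your node):

# whole filesystem, reported in SI (base-10) units
df -H /mnt/x/storagenode
# or just the node's blobs folder (can take a long time on a large node)
du --si -s /mnt/x/storagenode/storage/blobs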
For my nodes, the used space reported by the satellites is pretty close to the used space on the pie chart:


(screenshot: dashboard disk usage)

The usage on disk and the usage provided by Storj are the same. Saltlake is just ignoring all of it


and has been for two months now. I don’t know if it’s a Saltlake issue or my node issue.

This is the ap1.storj.io satellite:

us1.storj.io

eu1.storj.io

And finally, the broken saltlake.tardigrade.io:

This has been a general node and satellite issue, which has already been (partially) fixed.
Specifically, there were two different issues:

  1. (Part of the) TTL data failed to be deleted when it expired (a node-side issue; as far as I know, a fix using a flat-file expiration store is pending).
  2. Bloom filters were being generated with the expired pieces still included (a satellite-side issue; this has been fixed).

Because both deletion methods were overwhelmed or failed to properly delete the test data (which was all uploaded with a TTL), expired, and therefore unpaid, pieces were not being deleted from the nodes and started to pile up.

Over the last 1-2 weeks, most (or all) nodes should have received new bloom filters, which have gotten rid of the piled-up unpaid data.

I notice from your screenshots that you don’t seem to have much trash at all, which I find strange.

Have you noticed any errors in the logs? Do you know when you last received a bloom filter?

I’m not sure how to check the logs or bloom filters. I just know this command
docker logs -f --tail 20 storagenode
but not how or what to search for.

Hi @Rassah,

You could try to find retain log entries for SLC like this:

docker logs storagenode 2>&1 | grep "retain" | grep "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"

You may also check the contents of the retain folder to see if there are any pending BFs (bloom filters), as follows:

ls storagenode/retain
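The timestamps on the retain log entries also show roughly when the last bloom filters were processed; to see just the most recent ones (same container name assumed as above):

docker logs storagenode 2>&1 | grep "Prepared to run a Retain request" | tail -n 5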

The retain folder is empty.

docker logs storagenode 2>&1 | grep "retain" returns

2024-09-09T04:30:42Z INFO retain Prepared to run a Retain request. {“Process”: “storagenode”, “cachePath”: “config/retain”, “Created Before”: “2024-09-03T17:59:59Z”, “Filter Size”: 103138, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”}
2024-09-09T04:34:52Z INFO retain Moved pieces to trash during retain {“Process”: “storagenode”, “cachePath”: “config/retain”, “Deleted pieces”: 94, “Failed to delete”: 0, “Pieces failed to read”: 0, “Pieces count”: 172576, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Duration”: “4m9.964958489s”, “Retain Status”: “enabled”}
2024-09-09T23:23:43Z INFO retain Prepared to run a Retain request. {“Process”: “storagenode”, “cachePath”: “config/retain”, “Created Before”: “2024-09-05T17:59:59Z”, “Filter Size”: 7683, “Satellite ID”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”}
2024-09-09T23:23:52Z INFO retain Moved pieces to trash during retain {“Process”: “storagenode”, “cachePath”: “config/retain”, “Deleted pieces”: 300, “Failed to delete”: 0, “Pieces failed to read”: 0, “Pieces count”: 13190, “Satellite ID”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “Duration”: “9.318479143s”, “Retain Status”: “enabled”}
2024-09-10T10:05:25Z INFO retain Prepared to run a Retain request. {“Process”: “storagenode”, “cachePath”: “config/retain”, “Created Before”: “2024-09-04T17:59:59Z”, “Filter Size”: 103113, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”}
2024-09-10T10:07:39Z INFO retain Moved pieces to trash during retain {“Process”: “storagenode”, “cachePath”: “config/retain”, “Deleted pieces”: 34, “Failed to delete”: 0, “Pieces failed to read”: 0, “Pieces count”: 172482, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Duration”: “2m13.418812901s”, “Retain Status”: “enabled”}

docker logs storagenode 2>&1 | grep "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE" returns

2024-09-09T02:37:16Z INFO pieces:trash emptying trash started {“Process”: “storagenode”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”}
2024-09-09T02:37:16Z INFO lazyfilewalker.trash-cleanup-filewalker starting subprocess {“Process”: “storagenode”, “satelliteID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”}
2024-09-09T02:37:16Z INFO lazyfilewalker.trash-cleanup-filewalker subprocess started {“Process”: “storagenode”, “satelliteID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”}
2024-09-09T02:37:16Z INFO lazyfilewalker.trash-cleanup-filewalker.subprocess trash-filewalker started {“Process”: “storagenode”, “satelliteID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Process”: “storagenode”, “dateBefore”: “2024-09-02T02:37:16Z”}
2024-09-09T02:37:16Z INFO lazyfilewalker.trash-cleanup-filewalker.subprocess Database started {“Process”: “storagenode”, “satelliteID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Process”: “storagenode”}
2024-09-09T02:37:16Z INFO lazyfilewalker.trash-cleanup-filewalker.subprocess trash-filewalker completed {“Process”: “storagenode”, “satelliteID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Process”: “storagenode”, “bytesDeleted”: 0, “numKeysDeleted”: 0}
2024-09-09T02:37:16Z INFO lazyfilewalker.trash-cleanup-filewalker subprocess finished successfully {“Process”: “storagenode”, “satelliteID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”}
2024-09-09T02:37:16Z INFO pieces:trash emptying trash finished {“Process”: “storagenode”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “elapsed”: “44.805906ms”}
2024-09-09T02:40:13Z INFO reputation:service node scores updated {“Process”: “storagenode”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Total Audits”: 5812, “Successful Audits”: 5812, “Audit Score”: 1, “Online Score”: 1, “Suspension Score”: 1, “Audit Score Delta”: 0, “Online Score Delta”: 0, “Suspension Score Delta”: 0}
2024-09-09T02:44:02Z INFO pieces used-space-filewalker started {“Process”: “storagenode”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”}
2024-09-09T02:44:02Z INFO lazyfilewalker.used-space-filewalker starting subprocess {“Process”: “storagenode”, “satelliteID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”}
2024-09-09T02:44:02Z INFO lazyfilewalker.used-space-filewalker subprocess started {“Process”: “storagenode”, “satelliteID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”}
2024-09-09T02:44:02Z INFO lazyfilewalker.used-space-filewalker.subprocess Database started {“Process”: “storagenode”, “satelliteID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Process”: “storagenode”}
2024-09-09T05:57:36Z INFO lazyfilewalker.used-space-filewalker subprocess finished successfully {“Process”: “storagenode”, “satelliteID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”}
2024-09-09T05:57:36Z INFO pieces used-space-filewalker completed {“Process”: “storagenode”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Lazy File Walker”: true, “Total Pieces Size”: 2982779847168, “Total Pieces Content Size”: 2971357336576}
2024-09-09T06:40:43Z INFO reputation:service node scores updated {“Process”: “storagenode”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Total Audits”: 5812, “Successful Audits”: 5812, “Audit Score”: 1, “Online Score”: 1, “Suspension Score”: 1, “Audit Score Delta”: 0, “Online Score Delta”: 0, “Suspension Score Delta”: 0}
2024-09-09T10:38:13Z INFO reputation:service node scores updated {“Process”: “storagenode”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Total Audits”: 5812, “Successful Audits”: 5812, “Audit Score”: 1, “Online Score”: 1, “Suspension Score”: 1, “Audit Score Delta”: 0, “Online Score Delta”: 0, “Suspension Score Delta”: 0}
2024-09-09T14:42:22Z INFO reputation:service node scores updated {“Process”: “storagenode”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Total Audits”: 5812, “Successful Audits”: 5812, “Audit Score”: 1, “Online Score”: 1, “Suspension Score”: 1, “Audit Score Delta”: 0, “Online Score Delta”: 0, “Suspension Score Delta”: 0}
2024-09-09T18:40:08Z INFO reputation:service node scores updated {“Process”: “storagenode”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Total Audits”: 5812, “Successful Audits”: 5812, “Audit Score”: 1, “Online Score”: 1, “Suspension Score”: 1, “Audit Score Delta”: 0, “Online Score Delta”: 0, “Suspension Score Delta”: 0}
2024-09-09T22:39:00Z INFO reputation:service node scores updated {“Process”: “storagenode”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Total Audits”: 5812, “Successful Audits”: 5812, “Audit Score”: 1, “Online Score”: 1, “Suspension Score”: 1, “Audit Score Delta”: 0, “Online Score Delta”: 0, “Suspension Score Delta”: 0}
2024-09-10T02:37:12Z INFO pieces:trash emptying trash started {“Process”: “storagenode”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”}
2024-09-10T02:37:12Z INFO lazyfilewalker.trash-cleanup-filewalker starting subprocess {“Process”: “storagenode”, “satelliteID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”}
2024-09-10T02:37:12Z INFO lazyfilewalker.trash-cleanup-filewalker subprocess started {“Process”: “storagenode”, “satelliteID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”}
2024-09-10T02:37:12Z INFO lazyfilewalker.trash-cleanup-filewalker.subprocess trash-filewalker started {“Process”: “storagenode”, “satelliteID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Process”: “storagenode”, “dateBefore”: “2024-09-03T02:37:12Z”}
2024-09-10T02:37:12Z INFO lazyfilewalker.trash-cleanup-filewalker.subprocess Database started {“Process”: “storagenode”, “satelliteID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Process”: “storagenode”}
2024-09-10T02:37:12Z INFO lazyfilewalker.trash-cleanup-filewalker.subprocess trash-filewalker completed {“Process”: “storagenode”, “satelliteID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Process”: “storagenode”, “bytesDeleted”: 0, “numKeysDeleted”: 0}
2024-09-10T02:37:12Z INFO lazyfilewalker.trash-cleanup-filewalker subprocess finished successfully {“Process”: “storagenode”, “satelliteID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”}
2024-09-10T02:37:12Z INFO pieces:trash emptying trash finished {“Process”: “storagenode”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “elapsed”: “38.458666ms”}
2024-09-10T02:41:32Z INFO reputation:service node scores updated {“Process”: “storagenode”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Total Audits”: 5812, “Successful Audits”: 5812, “Audit Score”: 1, “Online Score”: 1, “Suspension Score”: 1, “Audit Score Delta”: 0, “Online Score Delta”: 0, “Suspension Score Delta”: 0}
2024-09-10T06:40:44Z INFO reputation:service node scores updated {“Process”: “storagenode”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Total Audits”: 5812, “Successful Audits”: 5812, “Audit Score”: 1, “Online Score”: 1, “Suspension Score”: 1, “Audit Score Delta”: 0, “Online Score Delta”: 0, “Suspension Score Delta”: 0}
2024-09-10T10:41:18Z INFO reputation:service node scores updated {“Process”: “storagenode”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Total Audits”: 5812, “Successful Audits”: 5812, “Audit Score”: 1, “Online Score”: 1, “Suspension Score”: 1, “Audit Score Delta”: 0, “Online Score Delta”: 0, “Suspension Score Delta”: 0}
2024-09-10T14:37:51Z INFO reputation:service node scores updated {“Process”: “storagenode”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Total Audits”: 5812, “Successful Audits”: 5812, “Audit Score”: 1, “Online Score”: 1, “Suspension Score”: 1, “Audit Score Delta”: 0, “Online Score Delta”: 0, “Suspension Score Delta”: 0}
2024-09-10T18:37:49Z INFO reputation:service node scores updated {“Process”: “storagenode”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Total Audits”: 5812, “Successful Audits”: 5812, “Audit Score”: 1, “Online Score”: 1, “Suspension Score”: 1, “Audit Score Delta”: 0, “Online Score Delta”: 0, “Suspension Score Delta”: 0}

Please also remove the data of untrusted satellites: How To Forget Untrusted Satellites
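The exact command is in the linked guide; from memory it looks roughly like this for a docker node (flags may differ between versions, so please verify against the guide before running it):

docker exec -it storagenode ./storagenode forget-satellite --all-untrusted --config-dir config --identity-dir identity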

It seems your node didn’t receive any usage report from the Saltlake satellite.
This is for Saltlake on my nodes:

This suggests that your node may be disqualified or suspended on that satellite, because my nodes have something:

Does your node store anything for Saltlake? You can calculate the usage:

du --si -s /mnt/x/storagenode/storage/blobs/pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa

My trash hasn’t been collected for more than 15 days. Should I do anything, or just wait?

I guess it’s still in the process of emptying it.

Look for “emptying trash” in your logs. If the last line contains “started”, then it’s still running.
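For example (container name assumed to be storagenode, as in the commands above):

docker logs storagenode 2>&1 | grep "emptying trash" | tail -n 10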

And if there’s a lot of trash, it can take forever. I have one node that is very slow and has a lot of trash, and an “emptying trash” run for just the Saltlake satellite took 42h to complete.

It may be faster to manually delete the trash by removing old day folders from the trash directory directly in the operating system. But you would want to restart your node after doing it, because the used-space calculations will be off (and a restart will re-run the used-space filewalker).
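A sketch of that approach, assuming the usual layout trash/<satellite folder>/<YYYY-MM-DD>/... and the paths used earlier in this thread (the folder shown is Saltlake’s; double-check the date folder before deleting anything):

# delete the files of one old day folder, then remove the now-empty directories
find /mnt/x/storagenode/storage/trash/pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa/2024-08-25 -type f -delete
find /mnt/x/storagenode/storage/trash/pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa/2024-08-25 -type d -empty -delete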

Can you check the timestamps of the folders to get an idea whether pieces are getting deleted or not?
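For example, to list the per-day trash folders for Saltlake with their timestamps (path assumed from the du example above; adjust to your node):

ls -l /mnt/x/storagenode/storage/trash/pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa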

September has been all about garbage collection for me: 11 days of trying to sort out my 13.5 TB node. In the last few days I’ve been trying to speed things up, because the load has cut my expected earnings in half. Now I’m trying to reduce the shared space in the config.
The question is: how normal is it that the used-space-filewalker, retain, and pieces:trash processes run concurrently for the same satellite?

42h is not “forever” :stuck_out_tongue:

I had 1/4 of my trash emptied in 7 days. Then I changed to the method with find ... -delete, as described a few posts above.

Yes, they are separate processes, and they may happen to start at the same time. If you also use the lazy mode, the used-space-filewalker can be spawned for all satellites at the same time.
Moreover, if you have specified a retain concurrency greater than 1, retain may run for more than one satellite at the same time too.
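If you want them to run one satellite at a time instead, the relevant options look roughly like this (option names as I recall them; please verify against your node’s config.yaml or the --help output):

# config.yaml
retain.concurrency: 1                # process bloom filters for one satellite at a time
pieces.enable-lazy-filewalker: true  # keep filewalkers as low-priority subprocesses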

I have a folder from 30 August.

I also have a lot of trash processes running, so I think I’m going to wait.


True, but those 42 hours were only to delete about 300 GB of data. I still have other trash being emptied as we speak.

The 2024-08-25 trash directory; I’m still here:
drwx------ 2 root root 3416064 Sep 12 07:56 nz
drwx------ 2 root root 3391488 Aug 27 19:28 o2

6/7 TB of trash

PS: this is only on the ext4 nodes. ZFS deleted it in less than a day.