you may check for walker and retain in your logs
@thepaul @Alexey It actually managed to finish on one of the nodes where it was taking quite long; it took about four days to complete that one satellite with no significant I/O load, but I assume that is to be expected with the lazy filewalker.
I don't have strace installed at the moment, but should I experience something out of the ordinary in the future, I now know where to look.
Thank you both.
Any update on the patch? My node updated to v1.91.2 last week, but that did not fix the issue.
The fix is in v1.92. Your node should get it soon!
Do you have errors related to gc-filewalker, lazyfilewalker, or retain in your logs?
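For example, on a Docker-based node you could filter the logs with something like the following (assuming the container is named storagenode; adjust the command to your own setup):

docker logs storagenode 2>&1 | grep -E "gc-filewalker|lazyfilewalker|retain" | grep -iE "error|failed"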
I have finally received the update to v1.92 and a retain request from the US satellite. This has improved the situation, and the retain request has now finished:
storagenode |2023-12-06T12:50:04Z INFO retain Prepared to run a Retain request. {"process": "storagenode", "Created Before": "2023-11-29T17:59:59Z", "Filter Size": 2097155, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
storagenode |2023-12-06T17:16:25Z WARN retain failed to delete piece {"process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "675OZBUCDUSRRA3XMAME62ARV4QX4UJXRMILIGWYEYGZDOFUR5NA", "error": "pieces error: filestore error: file does not exist", "errorVerbose": "pieces error: filestore error: file does not exist\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).Stat:110\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).pieceSizes:245\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).Trash:290\n\tstorj.io/storj/storagenode/pieces.(*Store).Trash:404\n\tstorj.io/storj/storagenode/retain.(*Service).trash:364\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces:341\n\tstorj.io/storj/storagenode/retain.(*Service).Run.func2:221\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
[… many more failed to delete piece …]
storagenode |2023-12-06T22:50:10Z INFO retain Moved pieces to trash during retain {"process": "storagenode", "num deleted": 1862873, "Retain Status": "enabled"}
However, only ~235 GB of files were moved to the trash. That means I still have about 1.5 TB of garbage files from the US satellite on my disk:
│ SATELLITE │ HELD AMOUNT │ REPUTATION │ PAYOUT THIS MONTH │
│ Joined Month │ Total │ Disq Susp Down │ Storage Egress Repair/Aud Held Payout │
│ us1.storj.io:7777 (OK) │ │ │ $ 1.49/TBm $ 6.00/TB $ 6.00/TB 0% 100% │
│ 2019-05-14 56 │ $ 0.03 │ 0.00% 0.00% 0.56% │ $ 0.9507 $ 0.4243 $ 0.0712 -$ 0.0000 $ 1.4461 │
Disk Average So Far (debug): 2.94 TB >> 66% of expected 4.44 TB <<
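As a rough cross-check of the ~235 GB that went to trash, the per-satellite trash folders can be measured directly with something like this (the mount path is only a placeholder for the node's actual storage location):

du -sh /mnt/storagenode/storage/trash/*

Each subdirectory under trash should correspond to one satellite, so the entry for the US satellite should roughly match the amount moved by the retain run.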
Any chance that these are also going to be deleted with a future retain request? Are there maybe limits to how long a retain request can run or how much it can delete?
The bloom filter can capture no more than about 90% of the garbage in a single pass, and each satellite sends its own filter. So it takes several garbage collection cycles to remove almost all of the garbage.
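Taking that ~90% figure at face value, here is a quick back-of-the-envelope estimate (a sketch only, assuming each pass trashes ~90% of whatever garbage remains, starting from the ~1.5 TB above):

# assumes ~90% of the remaining garbage is trashed per pass, starting from ~1.5 TB
awk 'BEGIN { g = 1.5; for (i = 1; i <= 3; i++) { g *= 0.1; printf "after %d more pass(es): ~%.0f GB of garbage left\n", i, g * 1000 } }'

So the next retain request should leave roughly 150 GB of the current garbage, the one after that ~15 GB, and so on.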