I think I have a trash accounting problem

jammerdan · November 28, 2024, 6:44am

I can imagine that this solution might even work in some cases. I may have several different issues on different nodes, though.

Not really, but maybe as a last resort. First I need to understand better where the problems are. A closer look has revealed that the nodes might have several different issues.

Ok so a normal complete startup scan will also scan the trash dirs? I wasn’t sure of that.

I found some issues so far:

On one node the trash deleter was not able to delete a piece file with a seemingly corrupted name, so the subfolder was not empty and the date folder did not get deleted. I have manually deleted the corrupted file, and maybe that will get the node back on track. Could it be that the deletion process only deletes piece files and nothing else? And if retain moves a weird piece into a subfolder, will the deletion then no longer work?
The nodes might simply be slow. I have seen some context canceled errors for the used-space filewalker. The question is how this error is handled: will it still resume from where it stopped, or will it start from scratch in that case?
Badger and the startup scan are enabled on all of these nodes.
Log lines for used-space filewalker look normal. I have a starting and a completing line for each satellite.
Now on one node I have this: 2024-11-26T18:54:03Z INFO pieces:trash emptying trash started {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"} which is the last "emptying trash started" line. There is no finished line yet. Does this mean it has already been running for almost 2 days, and can I expect the trash space to be updated once I see the line saying it has finished?
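To check the last point across all my nodes without reading the logs by hand, something like the following could help. It is only a rough sketch: it assumes every relevant line looks like the one quoted above, and that the matching completion line reads "emptying trash finished", which is my guess at the wording and not something I have confirmed. The function name `open_trash_runs` is made up for this example.

```python
import json
import re
from datetime import datetime

# Assumed line shape, modelled on the single log line quoted above:
#   <timestamp> <level> pieces:trash emptying trash started {json fields}
# The "finished" wording is an assumption; adjust it to what the log really prints.
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+\S+\s+pieces:trash\s+"
    r"emptying trash (?P<event>started|finished)\s+(?P<fields>\{.*\})"
)

def open_trash_runs(lines):
    """Return {satellite_id: start_time} for runs with a start line but no later finish line."""
    open_runs = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        try:
            fields = json.loads(m.group("fields"))
            ts = datetime.fromisoformat(m.group("ts").replace("Z", "+00:00"))
        except ValueError:
            # Unparseable JSON or timestamp: skip the line rather than guess.
            continue
        sat = fields.get("Satellite ID")
        if sat is None:
            continue
        if m.group("event") == "started":
            open_runs[sat] = ts          # remember the most recent start
        else:
            open_runs.pop(sat, None)     # a finish line closes the run
    return open_runs
```

Feeding it the lines of a node log (for example `open_trash_runs(open("node.log"))`, where `node.log` is a placeholder path) gives the satellites whose trash emptying is still open and when it started. A run that shows up here may still be in progress or may have been interrupted by a restart; the log alone cannot tell these apart.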

It could still be an effect of the bug that was posted. However, the trash has not come down to an acceptable level for quite some months now, and I was hoping it would fix itself once there is less retaining and trash deletion going on.
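To see whether other nodes have the same kind of leftover file that blocked the date folder on the first node, a read-only scan of the trash directory might be a start. This is a sketch under assumptions: I assume piece files in trash end in `.sj1`, and the helper name `find_unexpected_files` is made up. It would not catch a corrupted name that still carries that suffix, and some non-piece files in the trash tree may be legitimate, so anything it lists is only a candidate to look at, not something to delete blindly.

```python
import os

def find_unexpected_files(trash_root, piece_suffix=".sj1"):
    """List files under trash_root whose names do not end in piece_suffix.

    Read-only: nothing is moved or deleted. The suffix is an assumption
    about how piece files are named, so pass a different one if needed.
    """
    unexpected = []
    for dirpath, _dirnames, filenames in os.walk(trash_root):
        for name in filenames:
            if not name.endswith(piece_suffix):
                unexpected.append(os.path.join(dirpath, name))
    return sorted(unexpected)
```

Calling `find_unexpected_files("/path/to/storage/trash")` (the path is a placeholder) returns the full paths, so the date folder each file sits in is visible right away.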