Yes, but that was on the 13th. Since then it has not started again. For the other satellite I have:
2024-07-14T22:54:25Z INFO lazyfilewalker.trash-cleanup-filewalker.subprocess trash-filewalker started {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Process": "storagenode", "dateBefore": "2024-07-07T22:54:25Z"}
2024-07-14T22:54:25Z INFO lazyfilewalker.trash-cleanup-filewalker.subprocess Database started {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Process": "storagenode"}
So why is there no trash cleanup on the 14th deleting everything before the 7th?
And as I said, I have another node where the date folder from the 5th is still there:
ls /storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/
2024-07-05 2024-07-09 2024-07-10 2024-07-12 2024-07-13 2024-07-14 2024-07-15
There was one big Bloom filter on the node and moving to trash is still ongoing. I thought old trash folders would get deleted even while retain is running.
So maybe the other satellites start when this one has finished, which is currently the last in my log:
2024-07-14T22:54:25Z INFO lazyfilewalker.trash-cleanup-filewalker.subprocess trash-filewalker started {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Process": "storagenode", "dateBefore": "2024-07-07T22:54:25Z"}
2024-07-14T22:54:25Z INFO lazyfilewalker.trash-cleanup-filewalker.subprocess Database started {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Process": "storagenode"}
I don’t see any progress in the date folder for this satellite. Still 1024 subfolders, although it has been running for quite some time now.
ls /storage/trash/v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa/* | wc -l
1024
So if this takes long all other deletions will be delayed as well? So you may have 7 days for the first satellite but 10 or 12 days when deletions for the last satellite start?
Oh I see progress here. I thought the subfolders do get deleted when they are empty but it appears they are not.
I see most of them empty and only some at the end still being full. So it seems folder contents are being deleted.
Ok let’s see if then the other satellites will follow.
I checked the other node that still has the folder from the 5th and it appears to be the same. Many subfolders empty already some are not.
But this shows that deletions can take ages and files stay in trash much, much longer than the expected 7 days.
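To see which date folders are already past the cutoff, a small shell sketch like the following can help. It assumes the date folders are named YYYY-MM-DD as in the listing above, and it takes the cutoff from the date part of the `dateBefore` value in the log line. `overdue_date_folders` is a hypothetical helper name, not part of storagenode.

```shell
# overdue_date_folders SAT_TRASH_DIR CUTOFF_DATE
# Prints every date folder whose name sorts before CUTOFF_DATE, oldest first.
# Hypothetical helper for illustration only; it just reads directory names.
overdue_date_folders() {
    # The shell expands the glob in sorted order, and YYYY-MM-DD names
    # sort chronologically, so the output is oldest first.
    for dir in "$1"/*/; do
        [ -d "$dir" ] || continue   # the glob matched nothing
        basename "$dir"
    done | awk -v cutoff="$2" '
        # keep names that look like a date and compare as strings
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ && $0 < cutoff'
}
```

For the folder list shown earlier, `overdue_date_folders /storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa 2024-07-07` would print only `2024-07-05`.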
I am seeing the same. My HDDs have a hard time keeping up with all the load from huge ingress, a lot of data in the trash that has to be deleted, deleting the TTL data and also processing new bloom filters at the same time…
Why don’t the subfolders get deleted when they are empty?
It would be much easier to monitor deletion progress. Instead of checking which folders are empty you would only have to keep track of the number of subfolders.
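Until then, progress can be checked by counting how many subfolders of a date folder are already empty. The following is only a sketch: `trash_progress` is a made-up name, and it assumes a `find` that supports `-mindepth`, `-maxdepth` and `-empty` (GNU and BSD find do).

```shell
# trash_progress DATE_DIR
# Prints "<empty> of <total> subfolders emptied" for one trash date folder.
# Hypothetical helper for illustration; it only reads, it deletes nothing.
trash_progress() (
    # The body runs in a subshell, so the variables below stay local.
    total=$(find "$1" -mindepth 1 -maxdepth 1 -type d | wc -l)
    empty=$(find "$1" -mindepth 1 -maxdepth 1 -type d -empty | wc -l)
    # $(( )) strips the padding some wc implementations add
    echo "$((empty)) of $((total)) subfolders emptied"
)
```

It takes the path of a single date folder as its argument. Running it a few times in a row shows whether the cleanup is still moving.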
I do not know. But it seems they are deleted only when they have had files in them.
At least my nodes do not have empty subfolders (except for the Stefan satellite for April, when the node got this update, I suppose). And I didn’t delete anything manually there.
On retain it is easy, as this creates the subfolders one after another.
For deletion it would be great, if after all files in a subfolder have been deleted, the corresponding subfolder gets deleted as well.
They are deleted if there was at least one file. Otherwise they will not be deleted, I believe.
I have no idea. Do they contain any files? My nodes don’t have these older folders for any trusted satellites, so it is likely a local issue, i.e. a restart before the process finished (when the files had been deleted, but the folders had not).
Since you do not have log redirection (I assume that because you use the docker logs command), these lines were likely deleted with one of the container’s recreations.
If you had older logs, we could troubleshoot.
They are deleted, but as a part of the trash-filewalker process. If it was interrupted (when it had been able to delete the files, but not the folders yet), they will remain forever. Maybe they would be deleted with the next run, but I’m really not sure, because I do not know how to reproduce it.
I would repeat: my nodes don’t have empty trash folders in the folders of the trusted satellites.
I can assure you that when I take this path for example from a node:
It has the following subfolders:
ls /trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-07-10
hs hu hw hy ia ic ie ig ii ik im
ht hv hx hz ib id if ih ij il
The trash cleanup is running on folder hw while I am writing.
Folders up to hw are empty and not (yet?) deleted. hs, ht, hu and hv are all empty but still present. So to me this looks like they do not get deleted once they have been emptied.
One more thing on the same node: As said, the trash cleanup is currently running on the date folder above, 10th of July. But this is not the oldest of the date folders.
These are the date folders:
The oldest date folder is the 5th. The trash cleanup should work on that folder, not on the 10th.
When I enter the 5th, I can see that some subfolders are empty, some are still full of files.
ls /trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-07-05
a2 a5 aa ad ag aj am ap as av ay bb be bh bk bn bq bt
a3 a6 ab ae ah ak an aq at aw az bc bf bi bl bo br bu
a4 a7 ac af ai al ao ar au ax ba bd bg bj bm bp bs bv
So for example: Folder az is already emptied, folder ba still has files in it.
It might be that the cleanup was interrupted. But why did it not resume with the 5th July folder? Why is the cleanup now running on the 10th July date folder instead? This does not make sense to me. I cannot see any logic here as this now leaves me with half emptied folders the trash cleanup is not working on and therefore they don’t yet get deleted.
I don’t even know if they will ever get deleted.
Please don’t tell me that I have to cleanup manually after the trash cleanup. This does not make any sense.
I have confirmation now that after being interrupted the cleanup does not resume with the correct date folder but picks the next one.
Exactly the node I was talking about before got interrupted. Now instead of resuming with the 10th of July date folder it is working in the 11th of July date folder. So the subfolders in the 10th are partially empty while some are not.
I don’t think this is how it should be. The cleanup should start or re-start with the oldest date folder and not with some date folder in between.
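To find the date folders that were left half emptied, something like this sketch lists every date folder with the number of files still in it, oldest first. `remaining_by_date` is a hypothetical name, and the sketch assumes the YYYY-MM-DD folder naming seen in the listings above. It only reads the directory tree and says nothing about the order the storagenode itself uses.

```shell
# remaining_by_date SAT_TRASH_DIR
# Prints "<date folder> <files left>" per line, oldest date folder first.
# Hypothetical helper for illustration; it only reads, it deletes nothing.
remaining_by_date() (
    # The body runs in a subshell, so the variables below stay local.
    # The glob expands in sorted order, which is chronological for
    # YYYY-MM-DD names.
    for dir in "$1"/*/; do
        [ -d "$dir" ] || continue   # the glob matched nothing
        count=$(find "$dir" -type f | wc -l)
        echo "$(basename "$dir") $((count))"
    done
)
```

An older date folder with a non-zero count next to a newer one that has already gone would be exactly the situation described here. Keep in mind that walking the trash adds read load to a disk that is already busy.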
Cleanup of 10th is finished. While it was running no subfolder was deleted, only emptied.
After all subfolders have been emptied, the date folder got deleted.
Now after this it seems that the cleanup for this satellite is considered finished, even though numerous subfolders in earlier date folders have not been emptied and therefore not been deleted.
Currently it is working on the next satellite. Which probably means that now all other satellites get their cleanup in a row until it is the turn of the US1 satellite again. It may then start again with the oldest date folder, which is the 5th of July.
I don’t know if such an implementation makes sense. Why does it not resume on the same date folder when interrupted?
It would be good if each xx folder would get deleted immediately after it got emptied.
Also, the trash size in the dashboard should get updated after the deletion of each xx folder. What happens now if the trash cleanup gets interrupted halfway through? I assume that the trash size doesn’t get updated in this case. At the next run it will delete the remaining half of the trash. Does the trash cleanup then reduce the trash size in the dashboard correctly, or does it forget about the previously interrupted run and only lower it by the amount of the 2nd pass?