Hello.
Let’s hope that is the problem and it is solved quickly.
[quote=“Mitsos, post:41, topic:25936”]
Update on the trash saga: turns out that prefix-named folders aren’t getting deleted.
[/quote]
The node is running version 1.102.3
I have exactly the same problem on 40 nodes! When will the trash bin be emptied? The trash has only been increasing for 3 weeks now. 102.4
Can anyone with this problem let us know which of the following log messages they can find? I’d like to see the full log lines (there is a grep sketch after the list).
emptying trash started
emptying trash failed
emptying trash finished
trash-filewalker started
trash-filewalker completed
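If it helps, something like this should pull those lines out of a plain log file (the path is a placeholder; adjust it, or use journalctl instead if your node logs to the systemd journal):
grep -E 'emptying trash (started|failed|finished)|trash-filewalker (started|completed)' /path/to/storagenode.log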
Edited to add: If the undeleted trash directories are causing problems, you can rm -rf them yourself. If the timestamp in the directory name is more than 7 days ago, this is 100% safe to do. Obviously the node software should be doing it for you, but it’s a workaround you can use until we figure out what’s wrong and fix it.
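For example, a rough bash sketch of that workaround, assuming the usual storage/trash/<satellite>/<YYYY-MM-DD> layout (the path is a placeholder, date -d needs GNU date, and the echo makes it a dry run; remove the echo once the listed directories look right):
cutoff=$(date -d '7 days ago' +%F)
for d in /path/to/storage/trash/*/20??-??-??; do
  [ -d "$d" ] || continue
  [[ "$(basename "$d")" < "$cutoff" ]] && echo rm -rf "$d"
done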
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker starting subprocess {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker subprocess started {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker.subprocess trash-filewalker started {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Process": "storagenode", "dateBefore": "2024-04-22T00:48:56+03:00"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker.subprocess Database started {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Process": "storagenode"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker.subprocess trash-filewalker completed {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Process": "storagenode", "bytesDeleted": 0, "numKeysDeleted": 0}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker subprocess finished successfully {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker starting subprocess {"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker subprocess started {"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker.subprocess trash-filewalker started {"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Process": "storagenode", "dateBefore": "2024-04-22T00:48:56+03:00"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker.subprocess Database started {"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Process": "storagenode"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker.subprocess trash-filewalker completed {"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Process": "storagenode", "bytesDeleted": 0, "numKeysDeleted": 0}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker subprocess finished successfully {"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker starting subprocess {"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker subprocess started {"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker.subprocess trash-filewalker started {"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Process": "storagenode", "dateBefore": "2024-04-22T00:48:56+03:00"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker.subprocess Database started {"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Process": "storagenode"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker.subprocess trash-filewalker completed {"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Process": "storagenode", "bytesDeleted": 0, "numKeysDeleted": 0}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker subprocess finished successfully {"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker starting subprocess {"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker subprocess started {"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker.subprocess trash-filewalker started {"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Process": "storagenode", "dateBefore": "2024-04-22T00:48:56+03:00"}
Apr 29 00:48:56 server1 node8[935537]: 2024-04-29T00:48:56+03:00 INFO lazyfilewalker.trash-cleanup-filewalker.subprocess Database started {"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Process": "storagenode"}
Apr 29 00:48:57 server1 node8[935537]: 2024-04-29T00:48:57+03:00 INFO lazyfilewalker.trash-cleanup-filewalker.subprocess trash-filewalker completed {"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Process": "storagenode", "bytesDeleted": 0, "numKeysDeleted": 0}
Apr 29 00:48:57 server1 node8[935537]: 2024-04-29T00:48:57+03:00 INFO lazyfilewalker.trash-cleanup-filewalker subprocess finished successfully {"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
Thank you. Does it perhaps work when you disable the lazy filewalkers?
I think that would look like pieces.enable-lazy-filewalker: false in the config.yaml, or STORJ_PIECES_ENABLE_LAZY_FILEWALKER=false as an environment variable.
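If the node runs under systemd (as the journal lines above suggest), one way to set the environment variable, assuming a unit name like storagenode8 (hypothetical; use whatever your unit is actually called):
sudo systemctl edit storagenode8
# in the editor, add:
#   [Service]
#   Environment=STORJ_PIECES_ENABLE_LAZY_FILEWALKER=false
sudo systemctl restart storagenode8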
No difference; the prefix-named folders aren’t being deleted on restart.
Apr 29 01:09:20 server1 node8[938354]: 2024-04-29T01:09:20+03:00 INFO pieces:trash emptying trash started {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
Apr 29 01:09:20 server1 node8[938354]: 2024-04-29T01:09:20+03:00 INFO pieces:trash emptying trash finished {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "elapsed": "13.276831ms"}
Apr 29 01:09:20 server1 node8[938354]: 2024-04-29T01:09:20+03:00 INFO pieces:trash emptying trash started {"Process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
Apr 29 01:09:20 server1 node8[938354]: 2024-04-29T01:09:20+03:00 INFO pieces:trash emptying trash finished {"Process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "elapsed": "31.778091ms"}
Apr 29 01:09:20 server1 node8[938354]: 2024-04-29T01:09:20+03:00 INFO pieces:trash emptying trash started {"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
Apr 29 01:09:20 server1 node8[938354]: 2024-04-29T01:09:20+03:00 INFO pieces:trash emptying trash finished {"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "elapsed": "459.802µs"}
Apr 29 01:09:20 server1 node8[938354]: 2024-04-29T01:09:20+03:00 INFO pieces:trash emptying trash started {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
Apr 29 01:09:20 server1 node8[938354]: 2024-04-29T01:09:20+03:00 INFO pieces:trash emptying trash finished {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "elapsed": "9.318982ms"}
Thank you again. That narrows down the problem.
My next ask: can someone run with debug logging enabled? There aren’t a lot of debug log lines in this code, but there are a few that might explain what’s going on. In particular, I’m looking for a line reading “directory not found”.
Let me add these:
2024-04-28T19:22:28Z INFO pieces:trash emptying trash started {"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-04-28T19:22:28Z ERROR pieces:trash emptying trash failed {"Process": "storagenode", "error": "pieces error: filestore error: remove storage/trash/pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa/2024-04-13: directory not empty", "errorVerbose": "pieces error: filestore error: remove storage/trash/pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa/2024-04-13: directory not empty\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).EmptyTrash:189\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).EmptyTrash:319\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:431\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1.1:84\n\tstorj.io/common/sync2.(*Workplace).Start.func1:89"}
2024-04-28T19:22:28Z INFO pieces:trash emptying trash started {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-04-28T19:22:28Z ERROR pieces:trash emptying trash failed {"Process": "storagenode", "error": "pieces error: filestore error: remove storage/trash/v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa/2024-04-13: directory not empty", "errorVerbose": "pieces error: filestore error: remove storage/trash/v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa/2024-04-13: directory not empty\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).EmptyTrash:189\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).EmptyTrash:319\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:431\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1.1:84\n\tstorj.io/common/sync2.(*Workplace).Start.func1:89"}
2024-04-28T19:22:28Z INFO pieces:trash emptying trash started {"Process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-04-28T19:22:28Z ERROR pieces:trash emptying trash failed {"Process": "storagenode", "error": "pieces error: filestore error: remove storage/trash/qstuylguhrn2ozjv4h2c6xpxykd622gtgurhql2k7k75wqaaaaaa/2024-04-13: directory not empty", "errorVerbose": "pieces error: filestore error: remove storage/trash/qstuylguhrn2ozjv4h2c6xpxykd622gtgurhql2k7k75wqaaaaaa/2024-04-13: directory not empty\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).EmptyTrash:189\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).EmptyTrash:319\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:431\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1.1:84\n\tstorj.io/common/sync2.(*Workplace).Start.func1:89"}
2024-04-28T19:22:28Z INFO pieces:trash emptying trash started {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-04-28T19:22:28Z ERROR pieces:trash emptying trash failed {"Process": "storagenode", "error": "pieces error: filestore error: remove storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-04-13: directory not empty", "errorVerbose": "pieces error: filestore error: remove storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-04-13: directory not empty\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).EmptyTrash:189\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).EmptyTrash:319\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:431\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1.1:84\n\tstorj.io/common/sync2.(*Workplace).Start.func1:89"}
That is one node that has the nested directories in trash like: /storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-04-13/2024-04-13/2024-04-13/2a/
Obviously they do not get deleted, because they are not empty: the nested directories are still inside them. If it is 100% safe to delete those folders after 7 days, there should be no need to check whether they are empty or not.
Version was 1.99.3, and it seems it is now 1.102.3 since today or yesterday. If that version already contains a fix for this, maybe the nested folders will get removed on the next run of the trash chore. If it has not been fixed yet, then this is the current state of things.
Are you sure about this? Using the command ‘curl -sL http://ip:port/api/sno/satellites’, I get the averageUsageBytes value as 4043689674570.715, and on the left graph, it’s represented as 4.04TB. So, I guess this bug was already fixed.
Debug logging doesn’t show any “directory not found” line. Other than the previous log lines (start, finish), these are there:
Apr 29 10:47:58 server1 node8[1012532]: 2024-04-29T10:47:58+03:00 DEBUG blobscache deleted piece {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "disk space freed in bytes": 768}
Apr 29 10:47:58 server1 node8[1012532]: 2024-04-29T10:47:58+03:00 DEBUG pieces deleted piece {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "M4ZLCYCHI4A6IHNOAFHVSHYZO2X7IZNNHTV3K3Y5CCVYYOH3SQWA"}
Apr 29 10:47:58 server1 node8[1012532]: 2024-04-29T10:47:58+03:00 INFO collector deleted expired piece {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "M4ZLCYCHI4A6IHNOAFHVSHYZO2X7IZNNHTV3K3Y5CCVYYOH3SQWA"}
Apr 29 10:47:58 server1 node8[1012532]: 2024-04-29T10:47:58+03:00 DEBUG blobscache deleted piece {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "disk space freed in bytes": 145408}
Apr 29 10:47:58 server1 node8[1012532]: 2024-04-29T10:47:58+03:00 DEBUG pieces deleted piece {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Piece ID": "ARSUG3EUGDWTUTEFBNSKBI45TU7TIINBIBQ7UXQ5F5ULLEJIBRSQ"}
Apr 29 10:47:58 server1 node8[1012532]: 2024-04-29T10:47:58+03:00 INFO collector deleted expired piece {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Piece ID": "ARSUG3EUGDWTUTEFBNSKBI45TU7TIINBIBQ7UXQ5F5ULLEJIBRSQ"}
Huh, I wasn’t aware of those. That would definitely break trash collection. That means the per-day dir transition ran more than once, but it’s not clear how that could happen. If you haven’t deleted them yet, could you show the timestamps on:
/storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-04-13
/storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-04-13/2024-04-13
/storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-04-13/2024-04-13/2024-04-13
/storage/trash/.trash-uses-day-dirs-indicator
(preferably with stat(1) or something like it, to show subsecond precision)?
If the higher layers and the indicator file all have birth times within a few microseconds of each other, then perhaps there are multiple things “opening” the blobstore layer at the same time for some reason.
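Something like this would print them all with full precision (prefix the paths with your actual storage location):
stat /storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-04-13 \
     /storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-04-13/2024-04-13 \
     /storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-04-13/2024-04-13/2024-04-13 \
     /storage/trash/.trash-uses-day-dirs-indicator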
It’s 100% safe for you, the operator, as long as you know things like: you haven’t put anything in there yourself, there aren’t any bind mounts under that point, etc. The storagenode software, on the other hand, has to avoid deleting things that aren’t expected, in case a configuration mistake or a bug causes confusion about where important files are.
Then you are probably also not aware of my other trash issue here:
Great.
I have it on more than one node and others have it as well.
Unfortunately the birth time does not get recorded; there are only access, modify, and change times.
/trash/.trash-uses-day-dirs-indicator
Access: 2024-04-13 06:23:09.153704449 +0300
Modify: 2024-04-13 06:23:09.153704449 +0300
Change: 2024-04-13 06:23:09.153704449 +0300
/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-04-13
Access: 2024-04-13 05:12:47.112192010 +0300
Modify: 2024-04-13 05:20:39.930751195 +0300
Change: 2024-04-13 06:23:05.453527813 +0300
trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-04-13/2024-04-13/
Access: 2024-04-13 03:18:03.539695399 +0300
Modify: 2024-04-13 03:18:03.539695399 +0300
Change: 2024-04-13 05:20:39.930751195 +0300
/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-04-13/2024-04-13/2024-04-13/
Access: 2022-11-18 16:12:12.827925254 +0200
Modify: 2022-07-17 12:28:09.453219325 +0300
Change: 2024-04-13 03:18:03.539695399 +0300
I don’t know if it is helpful without birth date.
Fine then. But I don’t understand why you would expect empty trash date folders. According to @elek the idea is to simply delete the folders that are old enough:
And was there a bug that did not even delete empty date folders:
But well, then you need to find a way to tell the node software that it is fine to delete these nested folders, so that trash collection works again on the affected nodes. As I have these on more than one node, and I only checked a few when the update that performed the migration was applied, I cannot check every node and perform manual cleanup of the erroneous folder structures.
Why are the pieces deleted individually inside the dated trash folders, instead of just removing the entire folder? That way you wouldn’t end up with undeleted empty folders. Just check the name of the folder and delete it if it’s too old; no need for another piece scan in trash either.
Here is another one:
2024-05-01T06:40:43Z DEBUG pieces:trash starting to empty trash {"Process": "storagenode"}
2024-05-01T06:40:43Z INFO pieces:trash emptying trash started {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-05-01T06:40:43Z INFO lazyfilewalker.trash-cleanup-filewalker starting subprocess {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-05-01T06:40:43Z INFO lazyfilewalker.trash-cleanup-filewalker subprocess started {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-05-01T06:40:43Z INFO lazyfilewalker.trash-cleanup-filewalker.subprocess trash-filewalker started {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "dateBefore": "2024-04-24T06:40:43Z", "Process": "storagenode"}
2024-05-01T06:40:43Z INFO lazyfilewalker.trash-cleanup-filewalker.subprocess Database started {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Process": "storagenode"}
2024-05-01T06:40:43Z INFO lazyfilewalker.trash-cleanup-filewalker subprocess exited with status {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "status": 1, "error": "exit status 1"}
2024-05-01T06:40:43Z ERROR pieces:trash emptying trash failed {"Process": "storagenode", "error": "pieces error: lazyfilewalker: exit status 1", "errorVerbose": "pieces error: lazyfilewalker: exit status 1\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:85\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkCleanupTrash:185\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:423\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1.1:84\n\tstorj.io/common/sync2.(*Workplace).Start.func1:89"}
The exit code and error message are different here, even though this node also has the nested folders:
/storage/trash/v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa/2024-04-20/2024-04-20/2024-04-20/2024-04-20/2024-04-20/2a/
The first step in removing an entire folder is to clean up its contents. At a programming level, operating systems do not provide an operation like “remove the entire folder”; they only provide “remove this single file” and “remove an empty directory”.
Maybe storage nodes just forgot about this last step…
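To illustrate, with GNU find a recursive delete roughly expands to the two steps below (placeholder path; rm -r does the same thing internally):
find /path/to/trash/<satellite>/2024-04-13 -type f -delete          # unlink every file first
find /path/to/trash/<satellite>/2024-04-13 -depth -type d -delete   # then remove the now-empty directories, deepest first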
Do you happen to have the logs from around the time the .trash-uses-day-dirs-indicator file was created? It must have hit some error trying to migrate that satellite.
Fine then. But I don’t understand why you would expect empty trash date folders. According to @elek the idea is to simply delete the folders that are old enough:
That is the idea, but “deleting folders” still involves iterating over all files, and deleting still involves reading from and writing to each file’s inode (or the MFT, on NTFS). The performance benefit is just that we don’t have to iterate over all trashed files, only the ones in the day-folder we are deleting. If files are trashed at a mostly constant rate, and we keep 7 days of trash, then deleting each day-folder should take about 1/7 the time it would have taken to iterate over all trash files and check the creation time for each one.
But yes, there are clearly bugs with trash cleanup. Multiple ones. We’re going to be working on it.
Do you happen to have the logs from around the time the .trash-uses-day-dirs-indicator file was created?
Unfortunately not. Logs are gone.
but “deleting folders” still involves iterating over all files, and deleting still involves reading from and writing to each file’s inode (or the MFT, on NTFS).
I see. Well, the plan sounded like performing something like rm -rf on the date folder would do.
The only reason directories aren’t being deleted, and we are currently in this mess, is that the filewalker seems to need to run over every trashed file and update the database so that a value can be displayed on a pie chart.
What I propose we do to finally fix this situation:
Migrating to this should get rid of the logic that the current trash cleanup uses, which, to my non-programmer understanding, is to go over every single piece within every date-named (or otherwise) directory for every satellite and update the databases with usage/piece ID, so we can have a nice pie chart.
Now, in the case that SHTF and trash is restored: it could not be handled any more simply, even if you tried to come up with a simpler solution. There is a forum announcement saying “SHTF, we needed to restore pieces from trash. There is nothing you need to do, since we have migrated to a new method of calculating space, and the dashboard will be updated the next time you refresh it”.
My $0.02. Some will not like them, some will. Some will disagree, some will agree. That’s life, tomorrow is another day.
It uses the same API as the dashboard…
It should be fixed, sure; I just do not have information on which version.