Hello,
I am running the node on an RPi; is there a suitable command on Linux that I could run? The disk discrepancy still exists.
Regards, Ron
[quote="Ring_Zero, post:1161, topic:24715, full:true"]
I am running the node on an RPi; is there a suitable command on Linux that I could run? The disk discrepancy still exists.
[/quote]
So, at a high level, you need to restart the node and let it finish the used-space filewalker for each of the four satellites, which could take many days on a slow node.
Are you running straight docker, or docker compose, or something else?
I'm a docker compose person. To look at the logs, try "docker logs storagenode | grep used".
If the most recent matching line is "used-space-filewalker started", then it's still running.
If the most recent line is "used-space-filewalker completed", and you have four such entries (one for each satellite), then it's finished. (You can compare the timestamps to figure out how long it ran.)
If it has finished and the reported usage is still inaccurate, then something is weird. It's probably fixable by bringing the node down and back up to kick off another round of filewalkers.
If it's still running, let it finish.
You could also try looking in your /storage/temp and /storage/garbage directories. There shouldn't be anything in them any more (they were used in older versions of storj). I had one node with something like 10,000 small files in the temp folder, probably left over after some sort of crash.
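A quick way to check those folders from the host is to count the files in each. This is a minimal sketch; the storage path below is a placeholder, so adjust it to wherever your node's storage directory is actually mounted:

```shell
# Count leftover files in the node's temp and garbage folders.
# STORAGE is a hypothetical path - set it to your real storage directory.
STORAGE=/mnt/storj/storage
for d in temp garbage; do
  count=$(find "$STORAGE/$d" -type f 2>/dev/null | wc -l | tr -d ' ')
  echo "$d: $count files"
done
```

If either count is large, the folder is probably holding crash leftovers that can be cleaned up while the node is stopped.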
Then perhaps the lazy filewalker is failing; please check:
sudo docker logs storagenode 2>&1 | grep error | grep filewalker | tail
If so, then you need to disable the lazy mode, enable the badger cache, and restart the node.
You may add these options after the image name in your docker run command:
You need to stop and remove the container, then run it again using your docker run command with all your parameters, including the changed ones, to apply the change.
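For reference, a sketch of what that might look like. The two flags shown (`--pieces.enable-lazy-filewalker` and `--pieces.file-stat-cache`) are the ones commonly suggested for this on the forum, but please verify the option names against the config.yaml of your node version; everything else in the command is a placeholder for your own setup:

```shell
# Hypothetical example - substitute your own names, ports, mounts,
# and environment variables where the "..." is. The flags after the
# image name disable the lazy filewalker and enable the badger cache.
docker run -d --name storagenode \
  ... \
  storjlabs/storagenode:latest \
  --pieces.enable-lazy-filewalker=false \
  --pieces.file-stat-cache=badger
```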
Then monitor until it finishes:
sudo docker logs storagenode -f 2>&1 | grep "\sused-space" | grep -E "started|finished"
You also should not have any untrusted satellites: How To Forget Untrusted Satellites
So, finally, after the automatic update the logs now go to the file, and here they are. From what I can see, it failed once (earlier, going by the timestamps), but then it looks as if it completed nicely.
What do you think?
grep "\used-space" /volume1/Storj2/node2new.log | grep -E "started|completed" | tail
2024-09-05T16:18:34Z INFO lazyfilewalker.used-space-filewalker.subprocess Database started {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Process": "storagenode"}
2024-09-05T16:37:46Z INFO pieces used-space-filewalker completed {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Lazy File Walker": true, "Total Pieces Size": 2113405133568, "Total Pieces Content Size": 2111172414208}
2024-09-05T16:37:46Z INFO pieces used-space-filewalker started {"Process": "storagenode", "Satellite ID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW"}
2024-09-05T16:37:46Z INFO lazyfilewalker.used-space-filewalker subprocess started {"Process": "storagenode", "satelliteID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW"}
2024-09-05T16:37:46Z INFO lazyfilewalker.used-space-filewalker.subprocess Database started {"Process": "storagenode", "satelliteID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "Process": "storagenode"}
2024-09-05T16:37:46Z INFO pieces used-space-filewalker completed {"Process": "storagenode", "Satellite ID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "Lazy File Walker": true, "Total Pieces Size": 224811008, "Total Pieces Content Size": 224761344}
2024-09-05T16:37:46Z INFO pieces used-space-filewalker started {"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-09-05T16:37:46Z INFO lazyfilewalker.used-space-filewalker subprocess started {"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-09-05T16:37:46Z INFO lazyfilewalker.used-space-filewalker.subprocess Database started {"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Process": "storagenode"}
2024-09-05T16:38:28Z INFO pieces used-space-filewalker completed {"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Lazy File Walker": true, "Total Pieces Size": 1262812728832, "Total Pieces Content Size": 1262605037056}
And for errors:
grep error /volume1/Storj2/node2new.log | grep "used-space" | tail
2024-09-05T13:36:41Z ERROR pieces used-space-filewalker failed {"Process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Lazy File Walker": false, "error": "filewalker: context canceled", "errorVerbose": "filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:731\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
2024-09-05T13:36:41Z ERROR lazyfilewalker.used-space-filewalker failed to start subprocess {"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "error": "context canceled"}
2024-09-05T13:36:41Z ERROR pieces used-space-filewalker failed {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Lazy File Walker": true, "error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:73\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:133\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:722\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
2024-09-05T13:36:41Z ERROR pieces used-space-filewalker failed {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Lazy File Walker": false, "error": "filewalker: context canceled", "errorVerbose": "filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:731\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
2024-09-05T13:36:41Z ERROR lazyfilewalker.used-space-filewalker failed to start subprocess {"Process": "storagenode", "satelliteID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "error": "context canceled"}
2024-09-05T13:36:41Z ERROR pieces used-space-filewalker failed {"Process": "storagenode", "Satellite ID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "Lazy File Walker": true, "error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:73\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:133\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:722\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
2024-09-05T13:36:41Z ERROR pieces used-space-filewalker failed {"Process": "storagenode", "Satellite ID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "Lazy File Walker": false, "error": "filewalker: context canceled", "errorVerbose": "filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:731\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
2024-09-05T13:36:41Z ERROR lazyfilewalker.used-space-filewalker failed to start subprocess {"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "error": "context canceled"}
2024-09-05T13:36:41Z ERROR pieces used-space-filewalker failed {"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Lazy File Walker": true, "error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:73\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:133\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:722\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
2024-09-05T13:36:41Z ERROR pieces used-space-filewalker failed {"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Lazy File Walker": false, "error": "filewalker: context canceled", "errorVerbose": "filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:731\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
It should be
grep "\sused-space" /volume1/Storj2/node2new.log | grep -E "started|completed" | tail
The leading \s is important: it filters to only the useful messages, not all the subprocess ones.
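A tiny demonstration of the difference, using two made-up log lines: `\s` requires whitespace immediately before "used-space", so subprocess lines like "lazyfilewalker.used-space-..." (where a dot precedes it) are excluded. Note that `\s` is a GNU grep extension for whitespace:

```shell
# Only the "pieces used-space-filewalker" line matches; the subprocess
# line has a "." (not whitespace) before "used-space", so it is filtered out.
printf '%s\n' \
  'INFO pieces used-space-filewalker started' \
  'INFO lazyfilewalker.used-space-filewalker subprocess started' \
  | grep "\sused-space"
```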
When you see errors like these, it means that the filewalker was unable to calculate the used space due to a timeout waiting for a response from the disk.
However, since they completed 3 hours later, I would assume that you restarted the node and it is now finishing the calculations. As far as I can see, there are two more satellites to go.
Perfect, thanks for the clarification. From my side this all looks perfect now:
grep "\sused-space" /volume1/Storj2/node2new.log | grep -E "started|completed" | tail
2024-09-05T13:36:47Z INFO pieces used-space-filewalker started {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-09-05T16:14:22Z INFO pieces used-space-filewalker completed {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Lazy File Walker": true, "Total Pieces Size": 5990491520170, "Total Pieces Content Size": 5973353694890}
2024-09-05T16:14:22Z INFO pieces used-space-filewalker started {"Process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-09-05T16:18:34Z INFO pieces used-space-filewalker completed {"Process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Lazy File Walker": true, "Total Pieces Size": 274433362944, "Total Pieces Content Size": 273879636480}
2024-09-05T16:18:34Z INFO pieces used-space-filewalker started {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-09-05T16:37:46Z INFO pieces used-space-filewalker completed {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Lazy File Walker": true, "Total Pieces Size": 2113405133568, "Total Pieces Content Size": 2111172414208}
2024-09-05T16:37:46Z INFO pieces used-space-filewalker started {"Process": "storagenode", "Satellite ID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW"}
2024-09-05T16:37:46Z INFO pieces used-space-filewalker completed {"Process": "storagenode", "Satellite ID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "Lazy File Walker": true, "Total Pieces Size": 224811008, "Total Pieces Content Size": 224761344}
2024-09-05T16:37:46Z INFO pieces used-space-filewalker started {"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-09-05T16:38:28Z INFO pieces used-space-filewalker completed {"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Lazy File Walker": true, "Total Pieces Size": 1262812728832, "Total Pieces Content Size": 1262605037056}
I think this node somehow fixed itself over time with updates etc. For the Synology, I think the trick was really to 'Pin all Btrfs metadata to SSD cache'. After that, it was just a lot of catching up / deleting / etc., and now we're fine. Also, the test traffic isn't happening any more, which certainly helps as well.
Thanks for the great support @Alexey
Why do you have 5 satellites?
The 118… satellite (Stefan Benten) has been decommissioned, as far as I know.
https://forum.storj.io/t/how-to-forget-untrusted-satellites/23821?u=snorkel
https://forum.storj.io/t/satellite-info-address-id-blobs-folder-hex/17183?u=snorkel
Yeah, good question. I also ran the forget-satellite command some weeks ago, but it's a very old node. Maybe it just keeps running through even though there has been no data on it for a long time?
You need to explicitly add this 118… satellite to the forget-satellite command with a --force flag.
Yeah I’ve done exactly that before, when you pointed me into that direction.
See logfiles here a bit further up:
Now that I read it again, I used his command:
docker exec -it storagenode ./storagenode forget-satellite --force 12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB 12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo --config-dir config --identity-dir identity
So there were just two satellites in there to forget?
You missed the 118 satellite again; you just repeated the command for (I would assume) already removed satellites. The 118 one is much older and is not included in that example command.
You need to execute:
docker exec -it storagenode ./storagenode forget-satellite --force 118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW --config-dir config --identity-dir identity