As the title suggests - is this an ERROR I should ignore or investigate further?
Could you please post the whole error?
Please post it between two lines with three backticks, like this:
```
error line here
```
```
2024-06-06T17:40:53-04:00 ERROR filewalker failed to get progress from database {"error": "gc_filewalker_progress_db: context canceled", "errorVerbose": "gc_filewalker_progress_db: context canceled\n\tstorj.io/storj/storagenode/storagenodedb.(*gcFilewalkerProgressDB).Get:47\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePiecesToTrash:154\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkSatellitePiecesToTrash:565\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces:369\n\tstorj.io/storj/storagenode/retain.(*Service).Run.func2:258\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
```
I’m also noticing that my filewalker isn’t running after this error, even after restarts
That command is just returning blank - the filewalker just isn't kicking off. So I double-checked, and my nodes' reported usage is wildly inaccurate:
So it claims to be using 6TB, but it's closer to 10TB. I manually set a limit of 5TB to prevent my node from overfilling and to let the filewalkers finish ASAP.
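For reference, this is roughly the kind of check I mean - a minimal sketch, assuming a Windows GUI node with the default log path (adjust the path for your own setup):

```
# Show the most recent used-space filewalker entries, if any
sls "used-space-filewalker" "C:\Program Files\Storj\Storage Node\storagenode.log" | select -Last 10
```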
I have the exact same issue on a different 8TB node (same device).
It seems like the last filewalker to run successfully was on May 30th:
After this my log was rotated, hence there are no new lines, even after a restart.
EDIT:
I found the issue: it seems that if you disable the lazy filewalker, the above command just returns nothing, i.e. the filewalker doesn't start? I re-enabled lazy mode for both problematic drives, and I can see via cat that the filewalkers have started.
So this brings me back to the original question: can the `failed to get progress from database` error be safely ignored? Sorry for the long post.
Yes, it’s a known issue:
Likely not. Did you check the databases?
I checked my logs, but there are no error messages for my DBs - or do you mean I should go through the steps on my DBs anyway?
Okay, got it. I'll let lazy finish to see if it fixes things (it probably will) and then leave lazy off for future runs.
I would suggest doing the reverse. To fix the discrepancy, I would suggest disabling lazy mode, setting the allocation below the current usage, and enabling the scan on startup. Once the stats on the dashboard have updated (you shouldn't see 100% disk activity any more - unfortunately, without log messages I wouldn't be able to suggest anything better), enable lazy mode again and disable the scan (unless you also have third-party usage on the same disk), set the correct allocation, save the config, and restart the node. The relevant options are sketched below.
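For illustration, the config.yaml entries for that first step could look like this (a sketch - the 4 TB allocation is only an example value below your real usage):

```
# step 1: non-lazy scan on startup to rebuild the usage stats
pieces.enable-lazy-filewalker: false
storage2.piece-scan-on-startup: true
storage.allocated-disk-space: 4.00 TB
```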
So after two days of it not finishing, I dug into the logs and realised I'd missed some errors. Every time it tries to start, I see the following:
```
2024-06-18T08:06:18-04:00 INFO lazyfilewalker.used-space-filewalker subprocess exited with status {"satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "status": 1, "error": "exit status 1"}
2024-06-18T08:06:18-04:00 ERROR pieces failed to lazywalk space used by satellite {"error": "lazyfilewalker: exit status 1", "errorVerbose": "lazyfilewalker: exit status 1\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:85\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:130\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:707\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:58\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-06-18T08:06:18-04:00 INFO lazyfilewalker.gc-filewalker subprocess exited with status {"satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "status": 1, "error": "exit status 1"}
2024-06-18T08:06:18-04:00 ERROR pieces lazyfilewalker failed {"error": "lazyfilewalker: exit status 1", "errorVerbose": "lazyfilewalker: exit status 1\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:85\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkSatellitePiecesToTrash:160\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkSatellitePiecesToTrash:561\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces:373\n\tstorj.io/storj/storagenode/retain.(*Service).Run.func2:259\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
2024-06-18T08:06:18-04:00 ERROR filewalker failed to get progress from database
2024-06-18T08:06:18-04:00 ERROR lazyfilewalker.used-space-filewalker failed to start subprocess {"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "error": "context canceled"}
2024-06-18T08:06:18-04:00 ERROR pieces failed to lazywalk space used by satellite {"error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:73\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:130\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:707\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:58\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-06-18T08:06:18-04:00 ERROR retain retain pieces failed {"cachePath": "C:\\Program Files\\Storj2\\Storage Node/retain", "error": "retain: filewalker: context canceled", "errorVerbose": "retain: filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePiecesToTrash:181\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkSatellitePiecesToTrash:568\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces:373\n\tstorj.io/storj/storagenode/retain.(*Service).Run.func2:259\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
2024-06-18T08:06:18-04:00 ERROR lazyfilewalker.used-space-filewalker failed to start subprocess {"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "error": "context canceled"}
2024-06-18T08:06:18-04:00 ERROR pieces failed to lazywalk space used by satellite {"error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:73\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:130\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:707\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:58\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-06-18T08:06:18-04:00 ERROR lazyfilewalker.used-space-filewalker failed to start subprocess {"satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "error": "context canceled"}
2024-06-18T08:06:18-04:00 ERROR pieces failed to lazywalk space used by satellite {"error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:73\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:130\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:707\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:58\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-06-18T08:06:18-04:00 ERROR piecestore:cache error getting current used space: {"error": "filewalker: context canceled; filewalker: context canceled; filewalker: context canceled; filewalker: context canceled", "errorVerbose": "group:\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:716\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:58\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:716\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:58\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:716\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:58\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:716\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:58\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
```

It seems like something is broken and the filewalker can't start. I will check the DBs in the meantime.
Not sure what I’m doing wrong:
And trying to run it manually I get:
P.S.
While doing this, I realised that none of my DBs are actually on my SSD; instead they're on the same drive as the node data. So now I want to move them out… but I can't find a good Windows guide for it? @Alexey
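For anyone finding this later: the option that controls the database location is storage2.database-dir - a sketch (the SSD path is just an example; stop the node, move the .db files to that folder first, then restart):

```
# config.yaml - the path is an example, use your own SSD folder
storage2.database-dir: D:\storj-dbs
```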
Thanks for that, the DBs got moved and everything is running smoothly. HOWEVER - my 2 big nodes still cannot start the filewalker (so that remains unchanged) (error pasted above).
Checking the DBs individually, all of them seem OK - but I didn't do all of them, as that's crazy tedious.
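To make that less tedious, something like this batch check should work - a sketch assuming sqlite3.exe is on the PATH, the node is stopped, and D:\storj-dbs is your database folder:

```
# Run an integrity check against every .db file; healthy databases print "ok"
Get-ChildItem "D:\storj-dbs\*.db" | ForEach-Object {
    Write-Host $_.Name
    sqlite3 $_.FullName "PRAGMA integrity_check;"
}
```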
`context canceled` usually denotes slower disks. Can you confirm that the disks are not SMR? You could check their model numbers.
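To pull the model numbers on Windows, a query like this should do it (standard WMI, nothing Storj-specific assumed):

```
# List each physical disk's model string to compare against vendor CMR/SMR tables
Get-CimInstance Win32_DiskDrive | Select-Object Model, Size
```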
Maybe a smarter person can double-check for me please?
20TB: ST16000NM001G-2KK103 (I'm 99% sure it's CMR)
8TB: ST8000VN004-2M2101 (Not 100% sure which one it is)
@nerdatwork
I am not sure about this, as all Exos drives are CMR. You can check the list here
As per the above list, the IronWolf series is CMR too
Where did you get the SMR part from?
Google’s generative AI search
Err I don’t trust AI for this stuff at all
You duplicated the command on one line. Try pasting it only once.
My filewalker is not even starting on some nodes with the following options set:
```
pieces.enable-lazy-filewalker: false
storage2.piece-scan-on-startup: true
```
I have over 50TB that's shown as "trash" but has actually been deleted; since the lazy filewalker isn't able to update the trash value and the normal filewalker doesn't start, I'm losing a lot of storage. What should I do? And when will you fix these issues?