Corrupt node after power outage

I searched 430MB of Logs and did not find a single Unrecoverable or FATAL error.
The only thing I found are some errors:

	Zeile   96657: 2024-09-12T17:45:02+02:00	ERROR	pieces	used-space-filewalker failed	{"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Lazy File Walker": true, "error": "lazyfilewalker: signal: killed", "errorVerbose": "lazyfilewalker: signal: killed\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:85\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:133\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:722\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
	Zeile   96658: 2024-09-12T17:45:02+02:00	ERROR	pieces	used-space-filewalker failed	{"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Lazy File Walker": false, "error": "filewalker: context canceled", "errorVerbose": "filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:731\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
	Zeile   96660: 2024-09-12T17:45:02+02:00	INFO	pieces	used-space-filewalker started	{"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
	Zeile   96661: 2024-09-12T17:45:02+02:00	INFO	lazyfilewalker.used-space-filewalker	starting subprocess	{"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
	Zeile   96662: 2024-09-12T17:45:02+02:00	ERROR	lazyfilewalker.used-space-filewalker	failed to start subprocess	{"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "error": "context canceled"}
	Zeile   96663: 2024-09-12T17:45:02+02:00	ERROR	pieces	used-space-filewalker failed	{"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Lazy File Walker": true, "error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:73\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:133\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:722\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
	Zeile   96665: 2024-09-12T17:45:03+02:00	ERROR	pieces	used-space-filewalker failed	{"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Lazy File Walker": false, "error": "filewalker: context canceled", "errorVerbose": "filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:731\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
	Zeile   96667: 2024-09-12T17:45:03+02:00	INFO	pieces	used-space-filewalker started	{"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
	Zeile   96668: 2024-09-12T17:45:03+02:00	INFO	lazyfilewalker.used-space-filewalker	starting subprocess	{"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
	Zeile   96669: 2024-09-12T17:45:03+02:00	ERROR	lazyfilewalker.used-space-filewalker	failed to start subprocess	{"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "error": "context canceled"}
	Zeile   96670: 2024-09-12T17:45:03+02:00	ERROR	pieces	used-space-filewalker failed	{"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Lazy File Walker": true, "error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:73\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:133\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:722\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
	Zeile   96671: 2024-09-12T17:45:03+02:00	ERROR	pieces	used-space-filewalker failed	{"Process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Lazy File Walker": false, "error": "filewalker: context canceled", "errorVerbose": "filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkAndComputeSpaceUsedBySatellite:731\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run.func1:81\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
	Zeile   96695: 2024-09-12T17:45:20+02:00	INFO	pieces	used-space-filewalker started	{"Process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
	Zeile   96696: 2024-09-12T17:45:20+02:00	INFO	lazyfilewalker.used-space-filewalker	starting subprocess	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
	Zeile   96699: 2024-09-12T17:45:20+02:00	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
	Zeile   96728: 2024-09-12T17:45:20+02:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Process": "storagenode"}
	Zeile   99865: 2024-09-12T17:50:21+02:00	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
	Zeile   99866: 2024-09-12T17:50:21+02:00	INFO	pieces	used-space-filewalker completed	{"Process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Lazy File Walker": true, "Total Pieces Size": 88874275584, "Total Pieces Content Size": 88717464832}
	Zeile   99867: 2024-09-12T17:50:21+02:00	INFO	pieces	used-space-filewalker started	{"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
	Zeile   99868: 2024-09-12T17:50:21+02:00	INFO	lazyfilewalker.used-space-filewalker	starting subprocess	{"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
	Zeile   99869: 2024-09-12T17:50:21+02:00	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
	Zeile   99872: 2024-09-12T17:50:21+02:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Process": "storagenode"}
	Zeile  128924: 2024-09-12T19:58:38+02:00	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
	Zeile  128925: 2024-09-12T19:58:38+02:00	INFO	pieces	used-space-filewalker completed	{"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Lazy File Walker": true, "Total Pieces Size": 663375932416, "Total Pieces Content Size": 662568493568}
	Zeile  128926: 2024-09-12T19:58:38+02:00	INFO	pieces	used-space-filewalker started	{"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
	Zeile  128927: 2024-09-12T19:58:38+02:00	INFO	lazyfilewalker.used-space-filewalker	starting subprocess	{"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
	Zeile  128928: 2024-09-12T19:58:38+02:00	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
	Zeile  128930: 2024-09-12T19:58:39+02:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Process": "storagenode"}

I forgot to send my message days ago.

I think I just kill this node. Its not worth the effort for me. I dont think I can get it back running normally…
As I see TBs of data were deleted on all my other nodes. So there should be also no longer a lot of data on it.

this is mean that something external called to kill the node. All errors after that just a consequence of the killing processes.

I can see that some satellites are calculated already without errors, so maybe you can just let it finish. However, this would correct only the used space on the pie chart.
I would recommend to disable a lazy mode and enable the badger cache for this node. Even if the node is restarting, it continues calculate the used space and cache what’s already calculated, so with each attempt it should move forward.
You may add these options to your docker run command after the image name:

The next filewalker which is needed to clean an uncollected garbage is a retain process, it should move the garbage to the trash and it should be removed 7 days later automatically.

grep retain ~/Downloads/"Node (4).log" | grep -E "Prepar|Move"

PowerShell:

sls retain ~/Downloads/"Node (4).log" | sls "Prepar|Move"
1 Like

I think the kill was me manually restarting the node.
Thank you I will give those 2 parameters a try :slight_smile:

Then it changes everything. Please, do not kill it then and let it finish its work :slight_smile:

1 Like

There’s an old joke circling within my bubble:

A sysadmin goes to a restaurant and places an order. A few minutes later, some mafia guys sit down at the table next to him, discussing shady business.

In the middle of it, the sysadmin gets a call from a friend trying to debug a stuck process. After listening for a bit, he whispers into the phone, “You need to kill it.”

The mafia guys suddenly stop talking, look at him with respect, and one says, “Our guy!”

2 Likes