Why is my node unable to rebuild its databases?

Hello Forum. I have three Windows VMs, each running a GUI node, with around 10TB between them. The VMs lost connection to their Storj storage, and their databases were corrupted. They could not be saved with this trick, and have all been deleted.

The VMs’ C: drives (where the databases are also located) live on a local NVMe drive. The Storj data folder is on an iSCSI share on an external Synology box, with underlying ext4 storage. The Synology box has never experienced power outages, and has large SSDs presented as r/w cache, with average response times below 1ms and peaks of at most 10ms.

I’ve just finished running chkdsk /r /f /x on the Windows VM, which showed no additional errors.

The nodes seem unable to recreate their databases. They crash roughly every 15 minutes and thus have the filewalker running at all times. This pushes the storage array hard, and I can see my cache filling up more each hour. The nodes keep ingesting data and do not know how large they are. I’ve limited each of the three nodes to 1TB, but because they don’t know their size, they keep growing.

If it helps, I have no problem moving each of the nodes to a standalone disk, either by moving the .vmdks or by mounting disks to the VMs and robocopying the node data over.

Just prior to writing this post, I stopped the storagenode service on one of the nodes, deleted the log file and restarted the node. Below is some of the storagenode.log data from its first ~30 minutes after starting with a clean log file. No hits on “FATAL”.

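# Read the node log, keep lines that mention lazy, restart or ERROR,
# then drop routine ping/upload/download entries.
$content = Get-Content "C:\Program Files\Storj\Storage Node\storagenode.log"
$content | Select-String -Pattern "lazy|restart|ERROR" | Select-String -NotMatch "ping|upload|download"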

2023-12-14T20:39:19+01:00	INFO	lazyfilewalker.used-space-filewalker	starting subprocess	{"satelliteID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo"}
2023-12-14T20:39:19+01:00	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"satelliteID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo"}
2023-12-14T20:39:19+01:00	ERROR	blobscache	piecesTotal < 0	{"piecesTotal": -277760}
2023-12-14T20:39:19+01:00	ERROR	blobscache	piecesContentSize < 0	{"piecesContentSize": -277248}
2023-12-14T20:39:19+01:00	ERROR	blobscache	satPiecesTotal < 0	{"satPiecesTotal": -277760}
2023-12-14T20:39:19+01:00	ERROR	blobscache	satPiecesContentSize < 0	{"satPiecesContentSize": -277248}
2023-12-14T20:39:19+01:00	ERROR	blobscache	piecesTotal < 0	{"piecesTotal": -253184}
2023-12-14T20:39:19+01:00	ERROR	blobscache	piecesContentSize < 0	{"piecesContentSize": -253184}
2023-12-14T20:39:19+01:00	ERROR	blobscache	satPiecesTotal < 0	{"satPiecesTotal": -253184}
2023-12-14T20:39:19+01:00	ERROR	blobscache	satPiecesContentSize < 0	{"satPiecesContentSize": -253184}
2023-12-14T20:39:19+01:00	ERROR	blobscache	piecesTotal < 0	{"piecesTotal": -539648}
2023-12-14T20:39:19+01:00	ERROR	blobscache	piecesContentSize < 0	{"piecesContentSize": -539136}
2023-12-14T20:39:19+01:00	ERROR	blobscache	satPiecesTotal < 0	{"satPiecesTotal": -539648}
2023-12-14T20:39:19+01:00	ERROR	blobscache	satPiecesContentSize < 0	{"satPiecesContentSize": -539136}
2023-12-14T20:39:19+01:00	ERROR	blobscache	piecesTotal < 0	{"piecesTotal": -663296}
2023-12-14T20:39:19+01:00	ERROR	blobscache	piecesContentSize < 0	{"piecesContentSize": -662784}
2023-12-14T20:39:19+01:00	ERROR	blobscache	satPiecesTotal < 0	{"satPiecesTotal": -663296}
2023-12-14T20:39:19+01:00	ERROR	blobscache	satPiecesContentSize < 0	{"satPiecesContentSize": -662784}
2023-12-14T20:39:19+01:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"satelliteID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo", "process": "storagenode"}
2023-12-14T20:39:19+01:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker started	{"satelliteID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo", "process": 
"storagenode"}
2023-12-14T20:39:21+01:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker completed	{"satelliteID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo", "process": 
"storagenode", "piecesTotal": 0, "piecesContentSize": 0}
2023-12-14T20:39:21+01:00	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"satelliteID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo"}
2023-12-14T20:39:21+01:00	INFO	lazyfilewalker.used-space-filewalker	starting subprocess	{"satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2023-12-14T20:39:21+01:00	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2023-12-14T20:39:21+01:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "process": "storagenode"}
2023-12-14T20:39:21+01:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker started	{"satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "process": 
"storagenode"}
2023-12-14T20:42:32+01:00	ERROR	nodestats:cache	Get stats query failed	{"error": "nodestats: rpc: tcp connector failed: rpc: dial tcp 34.94.153.46:7777: connectex: A connection attempt failed because 
the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.; nodestats: rpc: tcp connector failed: rpc: dial tcp: 
lookup ap1.storj.io: no such host; nodestats: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: no such host", "errorVerbose": "group:\n--- nodestats: rpc: tcp connector failed: rpc: dial 
tcp 34.94.153.46:7777: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has 
failed to respond.\n\tstorj.io/common/rpc.HybridConnector.DialContext.func1:190\n--- nodestats: rpc: tcp connector failed: rpc: dial tcp: lookup ap1.storj.io: no such 
host\n\tstorj.io/common/rpc.HybridConnector.DialContext.func1:190\n--- nodestats: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: no such 
host\n\tstorj.io/common/rpc.HybridConnector.DialContext.func1:190"}
2023-12-14T20:42:32+01:00	ERROR	nodestats:cache	Get disk space usage query failed	{"error": "nodestats: rpc: tcp connector failed: rpc: dial tcp: lookup ap1.storj.io: no such host; nodestats: rpc: tcp 
connector failed: rpc: dial tcp: lookup us1.storj.io: no such host", "errorVerbose": "group:\n--- nodestats: rpc: tcp connector failed: rpc: dial tcp: lookup ap1.storj.io: no such 
host\n\tstorj.io/common/rpc.HybridConnector.DialContext.func1:190\n--- nodestats: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: no such 
host\n\tstorj.io/common/rpc.HybridConnector.DialContext.func1:190"}
2023-12-14T20:46:20+01:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker completed	{"satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "process": 
"storagenode", "piecesTotal": 114118100736, "piecesContentSize": 114089065728}
2023-12-14T20:46:20+01:00	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2023-12-14T20:46:20+01:00	INFO	lazyfilewalker.used-space-filewalker	starting subprocess	{"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2023-12-14T20:46:20+01:00	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2023-12-14T20:46:20+01:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "process": "storagenode"}
2023-12-14T20:46:20+01:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker started	{"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "process": 
"storagenode"}
2023-12-14T20:52:46+01:00	INFO	lazyfilewalker.used-space-filewalker	subprocess exited with status	{"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "status": 1, "error": "exit 
status 1"}
2023-12-14T20:52:46+01:00	ERROR	pieces	failed to lazywalk space used by satellite	{"error": "lazyfilewalker: exit status 1", "errorVerbose": "lazyfilewalker: exit status 1\n\tstorj.io/storj/storagenode/p
ieces/lazyfilewalker.(*process).run:83\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:105\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTota
lAndBySatellite:717\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Gr
oup).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2023-12-14T20:52:46+01:00	INFO	lazyfilewalker.used-space-filewalker	starting subprocess	{"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2023-12-14T20:52:46+01:00	ERROR	lazyfilewalker.used-space-filewalker	failed to start subprocess	{"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "error": "context canceled"}
2023-12-14T20:52:46+01:00	ERROR	pieces	failed to lazywalk space used by satellite	{"error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storage
node/pieces/lazyfilewalker.(*process).run:71\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:105\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUs
edTotalAndBySatellite:717\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycl
e.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2023-12-14T20:52:46+01:00	INFO	lazyfilewalker.used-space-filewalker	starting subprocess	{"satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2023-12-14T20:52:46+01:00	ERROR	lazyfilewalker.used-space-filewalker	failed to start subprocess	{"satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "error": "context canceled"}
2023-12-14T20:52:46+01:00	ERROR	pieces	failed to lazywalk space used by satellite	{"error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storage
node/pieces/lazyfilewalker.(*process).run:71\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:105\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUs
edTotalAndBySatellite:717\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycl
e.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2023-12-14T20:52:46+01:00	ERROR	piecestore:cache	error getting current used space: 	{"error": "filewalker: context canceled; filewalker: context canceled; filewalker: context canceled", "errorVerbose": 
"group:\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:7
4\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\
truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*
FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:74\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\
n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\
tgolang.org/x/sync/errgroup.(*Group).Go.func1:75\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWal
ker).WalkAndComputeSpaceUsedBySatellite:74\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/pri
vate/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
2023-12-14T20:52:48+01:00	INFO	lazyfilewalker.used-space-filewalker	starting subprocess	{"satelliteID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo"}
2023-12-14T20:52:48+01:00	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"satelliteID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo"}
2023-12-14T20:52:48+01:00	ERROR	blobscache	piecesTotal < 0	{"piecesTotal": -1792}
2023-12-14T20:52:48+01:00	ERROR	blobscache	piecesContentSize < 0	{"piecesContentSize": -1280}
2023-12-14T20:52:48+01:00	ERROR	blobscache	satPiecesTotal < 0	{"satPiecesTotal": -1792}
2023-12-14T20:52:48+01:00	ERROR	blobscache	satPiecesContentSize < 0	{"satPiecesContentSize": -1280}
2023-12-14T20:52:48+01:00	ERROR	blobscache	piecesTotal < 0	{"piecesTotal": -3328}
2023-12-14T20:52:48+01:00	ERROR	blobscache	piecesContentSize < 0	{"piecesContentSize": -2816}
2023-12-14T20:52:48+01:00	ERROR	blobscache	satPiecesTotal < 0	{"satPiecesTotal": -3328}
2023-12-14T20:52:48+01:00	ERROR	blobscache	satPiecesContentSize < 0	{"satPiecesContentSize": -2816}
2023-12-14T20:52:48+01:00	ERROR	blobscache	piecesTotal < 0	{"piecesTotal": -675584}
2023-12-14T20:52:48+01:00	ERROR	blobscache	piecesContentSize < 0	{"piecesContentSize": -675072}
2023-12-14T20:52:48+01:00	ERROR	blobscache	satPiecesTotal < 0	{"satPiecesTotal": -675584}
2023-12-14T20:52:48+01:00	ERROR	blobscache	satPiecesContentSize < 0	{"satPiecesContentSize": -675072}
2023-12-14T20:52:48+01:00	ERROR	blobscache	piecesTotal < 0	{"piecesTotal": -5120}
2023-12-14T20:52:48+01:00	ERROR	blobscache	piecesContentSize < 0	{"piecesContentSize": -4608}
2023-12-14T20:52:48+01:00	ERROR	blobscache	satPiecesTotal < 0	{"satPiecesTotal": -5120}
2023-12-14T20:52:48+01:00	ERROR	blobscache	satPiecesContentSize < 0	{"satPiecesContentSize": -4608}
2023-12-14T20:52:48+01:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"satelliteID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo", "process": "storagenode"}
2023-12-14T20:52:48+01:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker started	{"satelliteID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo", "process": 
"storagenode"}
2023-12-14T20:52:48+01:00	ERROR	blobscache	piecesTotal < 0	{"piecesTotal": -209408}
2023-12-14T20:52:48+01:00	ERROR	blobscache	piecesContentSize < 0	{"piecesContentSize": -208896}
2023-12-14T20:52:48+01:00	ERROR	blobscache	satPiecesTotal < 0	{"satPiecesTotal": -209408}
2023-12-14T20:52:48+01:00	ERROR	blobscache	satPiecesContentSize < 0	{"satPiecesContentSize": -208896}
2023-12-14T20:52:48+01:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker completed	{"satelliteID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo", "process": 
"storagenode", "piecesTotal": 0, "piecesContentSize": 0}
2023-12-14T20:52:48+01:00	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"satelliteID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo"}
2023-12-14T20:52:48+01:00	INFO	lazyfilewalker.used-space-filewalker	starting subprocess	{"satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2023-12-14T20:52:48+01:00	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2023-12-14T20:52:48+01:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "process": "storagenode"}
2023-12-14T20:52:48+01:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker started	{"satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "process": 
"storagenode"}
2023-12-14T20:52:48+01:00	ERROR	blobscache	piecesTotal < 0	{"piecesTotal": -1024}
2023-12-14T20:52:48+01:00	ERROR	blobscache	piecesContentSize < 0	{"piecesContentSize": -512}
2023-12-14T20:52:48+01:00	ERROR	blobscache	satPiecesTotal < 0	{"satPiecesTotal": -1024}
2023-12-14T20:52:48+01:00	ERROR	blobscache	satPiecesContentSize < 0	{"satPiecesContentSize": -512}
2023-12-14T20:52:48+01:00	ERROR	blobscache	piecesTotal < 0	{"piecesTotal": -372480}
2023-12-14T20:52:48+01:00	ERROR	blobscache	piecesContentSize < 0	{"piecesContentSize": -371968}
2023-12-14T20:52:48+01:00	ERROR	blobscache	satPiecesTotal < 0	{"satPiecesTotal": -372480}
2023-12-14T20:52:48+01:00	ERROR	blobscache	satPiecesContentSize < 0	{"satPiecesContentSize": -371968}
2023-12-14T20:52:48+01:00	ERROR	blobscache	piecesTotal < 0	{"piecesTotal": -13568}
{"satPiecesContentSize": -111104}
2023-12-14T20:52:50+01:00	ERROR	blobscache	piecesTotal < 0	{"piecesTotal": -167424}
2023-12-14T20:52:50+01:00	ERROR	blobscache	piecesContentSize < 0	{"piecesContentSize": -166912}
2023-12-14T20:52:50+01:00	ERROR	blobscache	satPiecesTotal < 0	{"satPiecesTotal": -167424}
2023-12-14T20:52:50+01:00	ERROR	blobscache	satPiecesContentSize < 0	{"satPiecesContentSize": -166912}
2023-12-14T20:52:51+01:00	ERROR	blobscache	piecesTotal < 0	{"piecesTotal": -158720}
2023-12-14T20:52:51+01:00	ERROR	blobscache	piecesContentSize < 0	{"piecesContentSize": -158720}
2023-12-14T20:52:51+01:00	ERROR	blobscache	satPiecesTotal < 0	{"satPiecesTotal": -158720}
2023-12-14T20:52:51+01:00	ERROR	blobscache	satPiecesContentSize < 0	{"satPiecesContentSize": -158720}
2023-12-14T20:52:51+01:00	ERROR	blobscache	piecesTotal < 0	{"piecesTotal": -182272}
2023-12-14T20:52:51+01:00	ERROR	blobscache	piecesContentSize < 0	{"piecesContentSize": -181760}
2023-12-14T20:52:51+01:00	ERROR	blobscache	satPiecesTotal < 0	{"satPiecesTotal": -182272}
2023-12-14T20:52:51+01:00	ERROR	blobscache	satPiecesContentSize < 0	{"satPiecesContentSize": -181760}
2023-12-14T20:58:39+01:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker completed	{"satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "process": 
"storagenode", "piecesTotal": 114118100736, "piecesContentSize": 114089065728}
2023-12-14T20:58:40+01:00	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2023-12-14T20:58:40+01:00	INFO	lazyfilewalker.used-space-filewalker	starting subprocess	{"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2023-12-14T20:58:40+01:00	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2023-12-14T20:58:40+01:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "process": "storagenode"}
2023-12-14T20:58:40+01:00	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker started	{"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "process": 
"storagenode"}
2023-12-14T21:07:46+01:00	INFO	lazyfilewalker.used-space-filewalker	subprocess exited with status	{"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "status": 1, "error": "exit 
status 1"}
2023-12-14T21:07:46+01:00	ERROR	pieces	failed to lazywalk space used by satellite	{"error": "lazyfilewalker: exit status 1", "errorVerbose": "lazyfilewalker: exit status 1\n\tstorj.io/storj/storagenode/p
ieces/lazyfilewalker.(*process).run:83\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:105\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTota
lAndBySatellite:717\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Gr
oup).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2023-12-14T21:07:46+01:00	INFO	lazyfilewalker.used-space-filewalker	starting subprocess	{"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2023-12-14T21:07:46+01:00	ERROR	lazyfilewalker.used-space-filewalker	failed to start subprocess	{"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "error": "context canceled"}
2023-12-14T21:07:46+01:00	ERROR	pieces	failed to lazywalk space used by satellite	{"error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storage
node/pieces/lazyfilewalker.(*process).run:71\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:105\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUs
edTotalAndBySatellite:717\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycl
e.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2023-12-14T21:07:46+01:00	INFO	lazyfilewalker.used-space-filewalker	starting subprocess	{"satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2023-12-14T21:07:46+01:00	ERROR	lazyfilewalker.used-space-filewalker	failed to start subprocess	{"satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "error": "context canceled"}
2023-12-14T21:07:46+01:00	ERROR	pieces	failed to lazywalk space used by satellite	{"error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storage
node/pieces/lazyfilewalker.(*process).run:71\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:105\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUs
edTotalAndBySatellite:717\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycl
e.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2023-12-14T21:07:46+01:00	ERROR	piecestore:cache	error getting current used space: 	{"error": "filewalker: context canceled; filewalker: context canceled; filewalker: context canceled", "errorVerbose": 
"group:\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:7
4\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\
truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*
FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:74\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\
n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\
tgolang.org/x/sync/errgroup.(*Group).Go.func1:75\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWal
ker).WalkAndComputeSpaceUsedBySatellite:74\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/pri
vate/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
2023-12-14T21:07:48+01:00	INFO	lazyfilewalker.used-space-filewalker	starting subprocess	{"satelliteID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo"}
2023-12-14T21:07:48+01:00	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"satelliteID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo"}

There are no indications of a problem with the databases. All of the errors provided are about the slowness of this VM and its storage.

This is a problem with the network configuration. You need to modify the storagenode service to wait until the network is ready, since your VM is so slow:

The filewalker process is unable to finish its work, because

so your storage is too slow to finish it in time.
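To see how long each used-space filewalker run takes before it finishes or gets killed, the log can be summarised per satellite. The following is only a minimal Python sketch: the function name `filewalker_runs` is made up for illustration, and the line layout (timestamp, level, logger name, message, JSON payload beginning with the satellite ID) is assumed from the excerpt above rather than taken from any storagenode specification.

```python
import re
from datetime import datetime

# Assumed layout, taken from the excerpt above: timestamp, level, logger name,
# message, then a JSON payload that starts with the satellite ID.
LINE_RE = re.compile(
    r'^(?P<ts>\S+)\s+[A-Z]+\s+lazyfilewalker\.used-space-filewalker\s+'
    r'(?P<msg>subprocess started|subprocess finished successfully'
    r'|subprocess exited with status)\s+\{"satelliteID": "(?P<sat>[^"]+)"'
)


def filewalker_runs(lines):
    """Return (satellite_id, seconds, outcome) for every finished or failed run."""
    started = {}  # satellite ID -> time its subprocess started
    runs = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # other loggers, wrapped continuation lines, etc.
        when = datetime.fromisoformat(match["ts"])
        sat = match["sat"]
        if match["msg"] == "subprocess started":
            started[sat] = when
        elif sat in started:
            outcome = "ok" if match["msg"].endswith("successfully") else "failed"
            elapsed = (when - started.pop(sat)).total_seconds()
            runs.append((sat, elapsed, outcome))
    return runs


if __name__ == "__main__":
    # Two lines copied from the log above (payload shortened to the satellite ID).
    sample = [
        '2023-12-14T20:46:20+01:00\tINFO\tlazyfilewalker.used-space-filewalker'
        '\tsubprocess started\t{"satelliteID": '
        '"121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}',
        '2023-12-14T20:52:46+01:00\tINFO\tlazyfilewalker.used-space-filewalker'
        '\tsubprocess exited with status\t{"satelliteID": '
        '"121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}',
    ]
    for sat, seconds, outcome in filewalker_runs(sample):
        print(f"{sat[:8]}  {seconds:>6.0f}s  {outcome}")
```

In the excerpt, the run for satellite 121RTS… starts at 20:46:20 and exits with status 1 at 20:52:46, so it is reported as a failed run of 386 seconds. Comparing these durations before and after moving a node to its own disk would show whether the storage is keeping up.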
However, since there are no FATAL errors, it shouldn’t stop. But I doubt that they are absent, because

You need to search for FATAL errors, since a failed lazyfilewalker alone will not stop the service.
So I would suggest fixing the first problem and then monitoring the node.
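One way to double-check for FATAL entries while monitoring is to count log lines per level and print any FATAL ones in full. This is a rough Python sketch, not an official tool: the helper names `level_counts` and `fatal_lines` are hypothetical, and it assumes the level is the second whitespace-separated field of each line, as in the excerpt above.

```python
from collections import Counter

# Level names assumed from the excerpt above (INFO, ERROR) plus FATAL,
# with the usual neighbouring levels added as a guess.
KNOWN_LEVELS = {"DEBUG", "INFO", "WARN", "ERROR", "FATAL"}


def _level(line):
    """Return the second whitespace-separated field, or None if there is none."""
    fields = line.split(None, 2)
    return fields[1] if len(fields) >= 2 else None


def level_counts(lines):
    """Count log lines per level; wrapped continuation lines are ignored."""
    counts = Counter()
    for line in lines:
        level = _level(line)
        if level in KNOWN_LEVELS:
            counts[level] += 1
    return counts


def fatal_lines(lines):
    """Return every line whose level field is exactly FATAL."""
    return [line.rstrip("\n") for line in lines if _level(line) == "FATAL"]


# Example use, with the log path given in the first post:
# path = r"C:\Program Files\Storj\Storage Node\storagenode.log"
# with open(path, encoding="utf-8", errors="replace") as log:
#     for line in fatal_lines(log):
#         print(line)
```

Matching on the level field rather than searching for the word anywhere in the line avoids false hits from payloads that merely contain the text "FATAL".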


Hiya friend,

Thank you so much for your detailed explanation, as always. I’ll move the affected nodes to their own separate disks and let them sit for some time. Then I’ll report back.

Kind regards :)

Do you run multiple nodes on the same disk? That’s bad and, I believe, against the TOS. You won’t get more traffic this way, maybe the opposite.
The best and recommended way is 1 node per disk.
