ERROR lazyfilewalker.gc-filewalker.subprocess failed to save progress in the database

No obvious issue so far, but the error message is annoying:

2024-05-29T13:19:26Z    ERROR   lazyfilewalker.gc-filewalker.subprocess failed to save progress in the database {"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "errorVerbose": "gc_filewalker_progress_db: database is locked
\tstorj.io/storj/storagenode/storagenodedb.(*gcFilewalkerProgressDB).Store:33
\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePiecesToTrash.func2:188
\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces.func1:66
\tstorj.io/storj/storagenode/blobstore/filestore.walkNamespaceWithPrefix:999
\tstorj.io/storj/storagenode/blobstore/filestore.(*Dir).walkNamespaceUnderPath:897
\tstorj.io/storj/storagenode/blobstore/filestore.(*Dir).walkNamespaceInPath:842
\tstorj.io/storj/storagenode/blobstore/filestore.(*Dir).WalkNamespace:835
\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).WalkNamespace:315
\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:54
\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePiecesToTrash:178
\tstorj.io/storj/cmd/storagenode/internalcmd.gcCmdRun:104
\tstorj.io/storj/cmd/storagenode/internalcmd.NewGCFilewalkerCmd.func1:35
\tstorj.io/common/process.cleanup.func1.4:393
\tstorj.io/common/process.cleanup.func1:411
\tgithub.com/spf13/cobra.(*Command).execute:983
\tgithub.com/spf13/cobra.(*Command).ExecuteC:1115
\tgithub.com/spf13/cobra.(*Command).Execute:1039
\tstorj.io/common/process.ExecWithCustomOptions:112
\tmain.main:34
\truntime.main:267", "Process": "storagenode", "error": "gc_filewalker_progress_db: database is locked"}

The nodes run with --pieces.enable-lazy-filewalker="true"; without it, they would hang up.

Would it help to delete gc_filewalker_progress_db and restart the nodes, so that the database is rebuilt automatically?

Stop the node, check the filesystem, restart the node and see what happens. If the problem remains, come back.


This means that your disk cannot keep up. So please stop the node, check the disk, and fix it as suggested by @JWvdV.
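A mitigation often suggested for "database is locked" errors on a slow disk is to move the node's SQLite databases to faster storage via the storage2.database-dir option. A rough sketch, assuming a systemd-managed node and example paths (service name and paths are placeholders; adjust them to your setup):

```shell
# Sketch only -- service name and paths are examples, not your actual layout.
sudo systemctl stop storagenode              # stop the node first
mkdir -p /mnt/ssd/storj-db                   # new DB location on the SSD
cp -p /mnt/WD1001/storage/*.db /mnt/ssd/storj-db/
# Then point the node at the new directory, e.g. in config.yaml:
#   storage2.database-dir: "/mnt/ssd/storj-db"
# and start the node again:
sudo systemctl start storagenode
```

This keeps the piece data on the HDD but takes the frequent small DB writes off it, which is usually where the lock contention comes from.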

/dev/sdd1: clean, 32503357/305201152 files, 1739705657/2441608955 blocks
/dev/sde1: clean, 25707057/305201152 files, 1526770876/2441608955 blocks

Maybe a hiccup?

Now:


2024-05-30T13:19:21Z    ERROR   gracefulexit:chore      error retrieving satellites.    {"Process": "storagenode", "error": "satellitesdb: database is locked", "errorVerbose": "satellitesdb: database is locked
\tstorj.io/storj/storagenode/storagenodedb.(*satellitesDB).ListGracefulExits:197
\tstorj.io/storj/storagenode/gracefulexit.(*Service).ListPendingExits:59
\tstorj.io/storj/storagenode/gracefulexit.(*Chore).AddMissing:55
\tstorj.io/common/sync2.(*Cycle).Run:160
\tstorj.io/storj/storagenode/gracefulexit.(*Chore).Run:48
\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87
\truntime/pprof.Do:51
\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86
\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}

Exactly the same issue with a slow disk. How is the disk connected, and what filesystem is it using?

Details on the setup are here; it seems to be the same root cause (?).

I’ve changed the local SD card to an external SSD. Is there an option to increase swap to supplement the RAM?

Is there any hardware option I have, without completely replacing the RPi 4B?
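On Raspberry Pi OS, swap is normally managed by dphys-swapfile, so enlarging it is a config change rather than new hardware. A sketch, assuming the stock /etc/dphys-swapfile layout (note that CONF_MAXSWAP caps the effective size, 2048 MB by default; the 4096 value below is just an example):

```shell
sudo dphys-swapfile swapoff
# Raise the swap size, and the cap that would otherwise limit it:
sudo sed -i 's/^CONF_SWAPSIZE=.*/CONF_SWAPSIZE=4096/' /etc/dphys-swapfile
sudo sed -i 's/^#\?CONF_MAXSWAP=.*/CONF_MAXSWAP=4096/' /etc/dphys-swapfile
sudo dphys-swapfile setup
sudo dphys-swapfile swapon
```

Make sure the swap file lives on the SSD, not the SD card, or the extra swapping will wear the card out quickly.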

There’s something I do not understand:

/dev/sde1        10T  6.2T  3.3T  66% /mnt/WD1001
/dev/sdd1        10T  7.2T  2.4T  76% /mnt/WD1002

001 : 7.59TB is used (node dashboard)
002 : 8.62TB is used (node dashboard)

Why do the stats differ that much?

Yes:

The difference is between SI units (dashboard) and binary units (OS). To check the free and used space in SI units, you need to use this command:

df --si

Please note that the used space here is all used space on the disk, not only the space used within the allocation as on the dashboard.
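To see how far the unit difference alone goes, one of the df figures can be converted by hand: df's plain human-readable output is binary (TiB), while the dashboard uses decimal (TB). Using the 7.0 T value from above:

```shell
# 7.0 TiB (df -h, binary) expressed in TB (dashboard, decimal):
awk 'BEGIN { printf "%.2f TB\n", 7.0 * 1024^4 / 1e12 }'
# prints: 7.70 TB
```

So units account for roughly 10% of the gap; whatever remains comes from what each side counts, as noted above.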

/dev/sde1        10T  6.2T  3.3T  66% /mnt/WD1001
/dev/sdd1        10T  7.0T  2.5T  75% /mnt/WD1002

Same …

How is it going?
OK.
Has the used-space-filewalker finished for all satellites?

node 1:

pi@raspberrypi:~ $ less /mnt/WD1003/logs/sn1.log | grep "used-space-filewalker" | grep -E "finished"
2024-05-28T00:07:31Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-05-29T02:48:22Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-05-29T04:49:30Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-05-30T06:56:47Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-05-31T03:06:19Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-06-01T07:06:46Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}

node 2:

pi@raspberrypi:~ $ less /mnt/WD1003/logs/sn2.log | grep "used-space-filewalker" | grep -E "finished"
2024-05-26T16:51:36Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-05-27T22:07:50Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-05-28T16:24:57Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-05-28T16:58:04Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-05-30T04:06:20Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-05-30T15:53:25Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-05-30T23:15:18Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-06-01T11:58:27Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}

Doesn’t look like they have all finished.
At some point during the last couple of days I had to restart due to errors.

Have all the other filewalkers finished?

Then almost all of them will start from scratch (and the databases will remain outdated, of course).
So either wait until all of them are finished and then restart, or restart and wait until they all finish.
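To check completion per satellite at a glance, the log can be summarized with a small pipeline (log path as in the commands above; the match strings are taken from the log lines shown in this thread):

```shell
# Count "finished successfully" events per satellite in the node log:
grep 'used-space-filewalker' /mnt/WD1003/logs/sn1.log \
  | grep 'subprocess finished successfully' \
  | grep -o '"satelliteID": "[^"]*"' \
  | sort | uniq -c | sort -rn
```

A satellite missing from the output (or with far fewer entries than the others) is one whose walker has not been completing.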

I recently restarted with a much larger swap on the SSD (100 → 20480 MB).
I’ll keep you posted.

pi@raspberrypi:~ $ less /mnt/WD1003/logs/sn2.log | grep "used-space-filewalker" | grep -E "finished|killed|started"
2024-06-02T13:11:35Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-06-02T13:11:35Z	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-06-02T13:11:35Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Process": "storagenode"}
2024-06-02T13:11:35Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker started	{"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Process": "storagenode"}
2024-06-02T15:13:22Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-06-02T15:13:22Z	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-06-02T15:13:22Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Process": "storagenode"}
2024-06-02T15:13:22Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker started	{"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Process": "storagenode"}
2024-06-03T23:14:34Z	INFO	lazyfilewalker.used-space-filewalker	subprocess exited with status	{"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "status": -1, "error": "signal: killed"}
2024-06-03T23:14:54Z	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-06-03T23:14:54Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Process": "storagenode"}
2024-06-03T23:14:54Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker started	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Process": "storagenode"}
2024-06-03T23:15:10Z	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-06-03T23:15:10Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Process": "storagenode"}
2024-06-03T23:15:10Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker started	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Process": "storagenode"}
2024-06-04T02:52:45Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-06-04T02:52:45Z	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-06-04T02:52:45Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Process": "storagenode"}
2024-06-04T02:52:45Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker started	{"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Process": "storagenode"}
2024-06-05T00:29:23Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-06-05T00:29:23Z	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-06-05T00:29:24Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Process": "storagenode"}
2024-06-05T00:29:24Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker started	{"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Process": "storagenode"}
2024-06-05T15:49:08Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-06-05T15:49:08Z	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-06-05T15:49:08Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Process": "storagenode"}
2024-06-05T15:49:08Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker started	{"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Process": "storagenode"}
pi@raspberrypi:~ $ less /mnt/WD1003/logs/sn1.log | grep "used-space-filewalker" | grep -E "finished|killed|started"
2024-06-02T18:43:01Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-06-02T18:43:01Z	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-06-02T18:43:01Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Process": "storagenode"}
2024-06-02T18:43:01Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker started	{"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Process": "storagenode"}
2024-06-03T05:45:35Z	INFO	lazyfilewalker.used-space-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-06-03T05:45:35Z	INFO	lazyfilewalker.used-space-filewalker	subprocess started	{"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-06-03T05:45:35Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	Database started	{"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Process": "storagenode"}
2024-06-03T05:45:35Z	INFO	lazyfilewalker.used-space-filewalker.subprocess	used-space-filewalker started	{"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Process": "storagenode"}

Just one didn’t, but I wonder why no filewalker has been running on one node for three days (?).

It seems US1 cannot finish the used-space-filewalker:

There are only starts.

Is it due to high data load, or could there be another root cause?

Due to restarts I suppose?

Not in the last week, but today, yes, because the Pi hung up.

Out of curiosity: it is also taking me many days to run the lazy filewalker. I’m willing to run the non-lazy version of the filewalker to hopefully speed it up, but my problem with that is I can’t find any keyword to search the logs for to see what the non-lazy version is doing. Grepping for "walk" doesn’t find anything relevant, and neither does grepping for "used". Does the non-lazy version have any log messaging at all, and if so, what can I search for, please?

Thanks. Last time I really searched for this issue was around a month ago, shoulda looked again before asking.
