Error starting master database on storagenode: database: secret opening file "config/storage/secret.db" failed

Hello,
One of my nodes went offline with this error:

C:\Users\aka>docker logs --tail 50 storagenodeD1.12
        runtime.main:267
2024-02-10 17:29:43,917 INFO exited: storagenode (exit status 1; not expected)
2024-02-10 17:29:45,920 INFO spawned: 'storagenode' with pid 43
2024-02-10T17:29:45Z    INFO    Configuration loaded    {"process": "storagenode", "Location": "/app/config/config.yaml"}
2024-02-10T17:29:45Z    INFO    Anonymized tracing enabled      {"process": "storagenode"}
2024-02-10T17:29:45Z    INFO    Operator email  {"process": "storagenode", "Address": "7437493@gmail.com"}
2024-02-10T17:29:45Z    INFO    Operator wallet {"process": "storagenode", "Address": "0x8675290882f594227d9b69d1fc434bf54b2b5e6f"}
Error: Error starting master database on storagenode: database: secret opening file "config/storage/secret.db" failed: unable to open database file: no such file or directory
        storj.io/storj/storagenode/storagenodedb.(*DB).openDatabase:347
        storj.io/storj/storagenode/storagenodedb.(*DB).openExistingDatabase:324
        storj.io/storj/storagenode/storagenodedb.(*DB).openDatabases:299
        storj.io/storj/storagenode/storagenodedb.OpenExisting:266
        main.cmdRun:60
        main.newRunCmd.func1:32
        storj.io/private/process.cleanup.func1.4:393
        storj.io/private/process.cleanup.func1:411
        github.com/spf13/cobra.(*Command).execute:852
        github.com/spf13/cobra.(*Command).ExecuteC:960
        github.com/spf13/cobra.(*Command).Execute:897
        storj.io/private/process.ExecWithCustomOptions:112
        main.main:30
        runtime.main:267
2024-02-10 17:29:46,116 INFO exited: storagenode (exit status 1; not expected)
2024-02-10 17:29:49,119 INFO spawned: 'storagenode' with pid 51
2024-02-10T17:29:49Z    INFO    Configuration loaded    {"process": "storagenode", "Location": "/app/config/config.yaml"}
2024-02-10T17:29:49Z    INFO    Anonymized tracing enabled      {"process": "storagenode"}
2024-02-10T17:29:49Z    INFO    Operator email  {"process": "storagenode", "Address": "7437493@gmail.com"}
2024-02-10T17:29:49Z    INFO    Operator wallet {"process": "storagenode", "Address": "0x8675290882f594227d9b69d1fc434bf54b2b5e6f"}
Error: Error starting master database on storagenode: database: secret opening file "config/storage/secret.db" failed: unable to open database file: no such file or directory
        storj.io/storj/storagenode/storagenodedb.(*DB).openDatabase:347
        storj.io/storj/storagenode/storagenodedb.(*DB).openExistingDatabase:324
        storj.io/storj/storagenode/storagenodedb.(*DB).openDatabases:299
        storj.io/storj/storagenode/storagenodedb.OpenExisting:266
        main.cmdRun:60
        main.newRunCmd.func1:32
        storj.io/private/process.cleanup.func1.4:393
        storj.io/private/process.cleanup.func1:411
        github.com/spf13/cobra.(*Command).execute:852
        github.com/spf13/cobra.(*Command).ExecuteC:960
        github.com/spf13/cobra.(*Command).Execute:897
        storj.io/private/process.ExecWithCustomOptions:112
        main.main:30
        runtime.main:267
2024-02-10 17:29:49,327 INFO exited: storagenode (exit status 1; not expected)
2024-02-10 17:29:50,329 INFO gave up: storagenode entered FATAL state, too many start retries too quickly
2024-02-10 17:29:51,330 WARN received SIGQUIT indicating exit request
2024-02-10 17:29:51,330 INFO waiting for processes-exit-eventlistener, storagenode-updater to die
2024-02-10T17:29:51Z    INFO    Got a signal from the OS: "terminated"  {"Process": "storagenode-updater"}
2024-02-10 17:29:51,332 INFO stopped: storagenode-updater (exit status 0)
2024-02-10 17:29:52,333 INFO stopped: processes-exit-eventlistener (terminated by SIGTERM)

The other 3 nodes on the same machine work normally. A reboot and a Docker reset to factory defaults didn't help. Chkdsk found no errors. SFC found some errors and fixed them. DISM found no errors.

This is my work computer with a lot of different software installed. The node went down at night; I wasn't using the computer at that time.

What else can I do to find the reason for this error?
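One way to narrow it down is to filter the container logs to errors only. The container name below is taken from the logs above; the demo runs on two sample lines so it is self-contained, but against the real container you would pipe `docker logs` output instead.

```shell
# Real usage would be:
#   docker logs storagenodeD1.12 2>&1 | grep -E 'ERROR|FATAL'
# Demo on sample lines:
printf '%s\n' \
  '2024-02-10T17:29:45Z INFO Configuration loaded' \
  '2024-02-10T17:29:46Z ERROR services unexpected shutdown' \
  | grep -E 'ERROR|FATAL'
```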

Regards,
Alexander

UPDATE:

After SFC I started the node and it reported a malformed database. I recreated the database and the node came online and worked for a few minutes:

But it works unstably; I checked the logs and saw this:

C:\Users\aka>docker logs --tail 50 storagenodeD1.12
2024-02-10T18:37:37Z    INFO    preflight:localtime     local system clock is in sync with trusted satellites' system clock.    {"process": "storagenode"}
2024-02-10T18:37:37Z    INFO    bandwidth       Performing bandwidth usage rollups      {"process": "storagenode"}
2024-02-10T18:37:37Z    INFO    Node 12dha8yhAHeYTyCFdRm12YCQYZQwsX6BWUGksUcsdxrRUupNArF started        {"process": "storagenode"}
2024-02-10T18:37:37Z    INFO    Public server started on [::]:28967     {"process": "storagenode"}
2024-02-10T18:37:37Z    INFO    Private server started on 127.0.0.1:7778        {"process": "storagenode"}
2024-02-10T18:37:37Z    INFO    failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/quic-go/quic-go/wiki/UDP-Buffer-Sizes for details.   {"process": "storagenode"}
2024-02-10T18:37:37Z    INFO    trust   Scheduling next refresh {"process": "storagenode", "after": "9h52m3.153012752s"}
2024-02-10T18:37:37Z    INFO    pieces:trash    emptying trash started  {"process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-02-10T18:37:37Z    WARN    piecestore:monitor      Disk space is less than requested. Allocated space is   {"process": "storagenode", "bytes": 5037903827139}
2024-02-10T18:37:37Z    ERROR   services        unexpected shutdown of a runner {"process": "storagenode", "name": "piecestore:monitor", "error": "piecestore monitor: error verifying location and/or readability of storage directory: open config/storage/storage-dir-verification: no such file or directory", "errorVerbose": "piecestore monitor: error verifying location and/or readability of storage directory: open config/storage/storage-dir-verification: no such file or directory\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func1.1:158\n\tstorj.io/common/sync2.(*Cycle).Run:99\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func1:141\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
2024-02-10T18:37:37Z    ERROR   version failed to get process version info      {"process": "storagenode", "error": "version checker client: Get \"https://version.storj.io\": context canceled", "errorVerbose": "version checker client: Get \"https://version.storj.io\": context canceled\n\tstorj.io/storj/private/version/checker.(*Client).All:68\n\tstorj.io/storj/private/version/checker.(*Client).Process:108\n\tstorj.io/storj/private/version/checker.(*Service).checkVersion:101\n\tstorj.io/storj/private/version/checker.(*Service).CheckVersion:75\n\tstorj.io/storj/storagenode/version.(*Chore).Run.func1:65\n\tstorj.io/common/sync2.(*Cycle).Run:99\n\tstorj.io/storj/storagenode/version.(*Chore).Run:64\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
2024-02-10T18:37:37Z    ERROR   nodestats:cache Get pricing-model/join date failed      {"process": "storagenode", "error": "context canceled"}
2024-02-10T18:37:37Z    ERROR   contact:service ping satellite failed   {"process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "attempts": 1, "error": "ping satellite: rpc: tcp connector failed: rpc: dial tcp: lookup saltlake.tardigrade.io: operation was canceled", "errorVerbose": "ping satellite: rpc: tcp connector failed: rpc: dial tcp: lookup saltlake.tardigrade.io: operation was canceled\n\tstorj.io/common/rpc.HybridConnector.DialContext.func1:190"}
2024-02-10T18:37:37Z    INFO    contact:service context cancelled       {"process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-02-10T18:37:37Z    ERROR   contact:service ping satellite failed   {"process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "attempts": 1, "error": "ping satellite: rpc: tcp connector failed: rpc: dial tcp: lookup ap1.storj.io: operation was canceled", "errorVerbose": "ping satellite: rpc: tcp connector failed: rpc: dial tcp: lookup ap1.storj.io: operation was canceled\n\tstorj.io/common/rpc.HybridConnector.DialContext.func1:190"}
2024-02-10T18:37:37Z    INFO    contact:service context cancelled       {"process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-02-10T18:37:37Z    ERROR   contact:service ping satellite failed   {"process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "attempts": 1, "error": "ping satellite: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: operation was canceled", "errorVerbose": "ping satellite: rpc: tcp connector failed: rpc: dial tcp: lookup us1.storj.io: operation was canceled\n\tstorj.io/common/rpc.HybridConnector.DialContext.func1:190"}
2024-02-10T18:37:37Z    INFO    contact:service context cancelled       {"process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-02-10T18:37:37Z    ERROR   contact:service ping satellite failed   {"process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "attempts": 1, "error": "ping satellite: rpc: tcp connector failed: rpc: dial tcp: lookup eu1.storj.io: operation was canceled", "errorVerbose": "ping satellite: rpc: tcp connector failed: rpc: dial tcp: lookup eu1.storj.io: operation was canceled\n\tstorj.io/common/rpc.HybridConnector.DialContext.func1:190"}
2024-02-10T18:37:37Z    INFO    contact:service context cancelled       {"process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-02-10T18:37:37Z    ERROR   collector       error during collecting pieces:         {"process": "storagenode", "error": "context canceled"}
2024-02-10T18:37:37Z    INFO    lazyfilewalker.used-space-filewalker    starting subprocess     {"process": "storagenode", "satelliteID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB"}
2024-02-10T18:37:37Z    ERROR   lazyfilewalker.used-space-filewalker    failed to start subprocess      {"process": "storagenode", "satelliteID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "error": "context canceled"}
2024-02-10T18:37:37Z    ERROR   pieces  failed to lazywalk space used by satellite      {"process": "storagenode", "error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:71\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:105\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:717\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75", "Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB"}
2024-02-10T18:37:37Z    INFO    lazyfilewalker.used-space-filewalker    starting subprocess     {"process": "storagenode", "satelliteID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo"}
2024-02-10T18:37:37Z    ERROR   lazyfilewalker.used-space-filewalker    failed to start subprocess      {"process": "storagenode", "satelliteID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo", "error": "context canceled"}
2024-02-10T18:37:37Z    ERROR   pieces  failed to lazywalk space used by satellite      {"process": "storagenode", "error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:71\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:105\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:717\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75", "Satellite ID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo"}
2024-02-10T18:37:37Z    INFO    lazyfilewalker.used-space-filewalker    starting subprocess     {"process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-02-10T18:37:37Z    ERROR   lazyfilewalker.used-space-filewalker    failed to start subprocess      {"process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "error": "context canceled"}
2024-02-10T18:37:37Z    ERROR   pieces  failed to lazywalk space used by satellite      {"process": "storagenode", "error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:71\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:105\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:717\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-02-10T18:37:37Z    INFO    lazyfilewalker.used-space-filewalker    starting subprocess     {"process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-02-10T18:37:37Z    ERROR   lazyfilewalker.used-space-filewalker    failed to start subprocess      {"process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "error": "context canceled"}
2024-02-10T18:37:37Z    ERROR   pieces  failed to lazywalk space used by satellite      {"process": "storagenode", "error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:71\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:105\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:717\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-02-10T18:37:37Z    INFO    lazyfilewalker.used-space-filewalker    starting subprocess     {"process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-02-10T18:37:37Z    ERROR   lazyfilewalker.used-space-filewalker    failed to start subprocess      {"process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "error": "context canceled"}
2024-02-10T18:37:37Z    ERROR   pieces  failed to lazywalk space used by satellite      {"process": "storagenode", "error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:71\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:105\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:717\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-02-10T18:37:37Z    INFO    lazyfilewalker.used-space-filewalker    starting subprocess     {"process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-02-10T18:37:37Z    ERROR   lazyfilewalker.used-space-filewalker    failed to start subprocess      {"process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "error": "context canceled"}
2024-02-10T18:37:37Z    ERROR   pieces  failed to lazywalk space used by satellite      {"process": "storagenode", "error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:71\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:105\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:717\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-02-10T18:37:37Z    ERROR   piecestore:cache        error getting current used space:       {"process": "storagenode", "error": "filewalker: context canceled; filewalker: context canceled; filewalker: context canceled; filewalker: context canceled; filewalker: context canceled; filewalker: context canceled", "errorVerbose": "group:\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:74\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:74\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:74\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75\n--- filewalker: context 
canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:74\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:74\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:74\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
2024-02-10T18:37:37Z    ERROR   pieces:trash    emptying trash failed   {"process": "storagenode", "error": "pieces error: filestore error: context canceled", "errorVerbose": "pieces error: filestore error: context canceled\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).EmptyTrash:176\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).EmptyTrash:316\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:416\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1.1:83\n\tstorj.io/common/sync2.(*Workplace).Start.func1:89"}
Error: piecestore monitor: error verifying location and/or readability of storage directory: open config/storage/storage-dir-verification: no such file or directory
2024-02-10 18:37:37,908 INFO exited: storagenode (exit status 1; not expected)
2024-02-10 18:37:38,911 INFO spawned: 'storagenode' with pid 42
2024-02-10 18:37:38,911 WARN received SIGQUIT indicating exit request
2024-02-10 18:37:38,911 INFO waiting for storagenode, processes-exit-eventlistener, storagenode-updater to die
2024-02-10T18:37:38Z    INFO    Got a signal from the OS: "terminated"  {"Process": "storagenode-updater"}
2024-02-10 18:37:38,913 INFO stopped: storagenode-updater (exit status 0)
2024-02-10 18:37:38,913 INFO stopped: storagenode (terminated by SIGTERM)
2024-02-10 18:37:38,914 INFO stopped: processes-exit-eventlistener (terminated by SIGTERM)

If I try to run it with another IP, the node stays offline, but the logs are the same as above.

This is the only SMR HDD among all my nodes. Could this be the reason, or is it something else?

Regards,
Alexander

Check to see whether the storage-dir-verification file is there or not.
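A quick presence check can be scripted. Here STORAGE_DIR points at a temporary directory so the demo is self-contained; on a real node you would set it to the storage path that is mounted into the container.

```shell
# STORAGE_DIR is a temp dir for the demo; use your real mount path in practice.
STORAGE_DIR="$(mktemp -d)"
if [ -f "$STORAGE_DIR/storage-dir-verification" ]; then
  echo "storage-dir-verification present"
else
  echo "storage-dir-verification MISSING"
fi
```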


No, it is not:

But I didn’t delete it. Should I try to recreate it?


Is it the same action as we do when creating a new node?

Yes, follow the link given in that post.


This file cannot disappear on its own: it is opened only for reading, so the storagenode process cannot remove it accidentally. Something (or someone) removed this file. Or do you perform the setup step every time? If so, please stop doing that; this step must be performed only once for the same identity.
The other possible issue is that you accidentally changed the data location. I would recommend checking your disk with the data: do you have a second copy of the storage folder somewhere on this disk?
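Looking for a stray second copy of the storage folder can be done with `find`. ROOT is a temporary directory here so the demo is self-contained; on a real disk you would point it at the mount, e.g. `find /mnt/storj -maxdepth 3 -type d -name storage` (the path is an example).

```shell
# Build a fake disk layout with two "storage" folders, then locate them.
ROOT="$(mktemp -d)"
mkdir -p "$ROOT/node1/storage" "$ROOT/old-backup/storage"
find "$ROOT" -maxdepth 3 -type d -name storage | sort
```

If more than one path comes back, the node may be pointed at the wrong copy.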


I did, result:

File appeared:

Node offline:

Logs are:

So, is it dead?

Offline checklist:

Of course I checked all of this, also with another IP. This is not a problem of port forwarding or a firewall exclusion. This node was working with the same setup before the crash. I will bring this node to another location, just to be sure.

My main conclusion from this story: do not use your work computer, or any other heavily loaded computer, for Storj. It is not a question of whether it will crash; it is a matter of time. One or two nodes are OK to keep on a home PC, but if you want more, it is better to keep a separate machine for Storj nodes.

Regards,
Alexander

UPDATE

Alexey, you were right about the connection. I found a wrong IP; it looks like I didn’t push Save after changing it to the correct one.
Thanks to all of you for the support and patience!
I want to make it as reliable as possible, so I bought some server hardware and will move all the nodes from the work PC to the server after its assembly and setup.

Regards,
Alexander


They usually work fine if you do not use the same disk for something else, especially something heavily loaded, like storing a VM virtual disk on the same drive as the storagenode data (my case); even so, my nodes work fine even when all the VMs (with other load) are running.
So perhaps your system has some hardware issues.

The error “Error starting master database on storagenode: database: bandwidth opening file “config/storage/bandwidth.db” failed: file is not a database” appeared again. Same PC as at the beginning of the topic, but another node (10 TB).

The file storage-dir-verification exists, but it is only 1 KB.

Shall I do the setup stage in this case too?

This means that you have either hardware issues or a power loss, or some software is changing files; it could be an advanced third-party antivirus (the standard MS Defender doesn’t corrupt files).

It shouldn’t be big, so that looks OK.

You need to re-create the bandwidth.db database using this instruction:
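The rough shape of that recovery is: stop the node, move the corrupted .db file aside, then restart so the node recreates an empty one. The container name and paths below are examples, and the demo acts on a temporary directory rather than real data; the real docker steps are left as comments.

```shell
# Demo directories stand in for the real database folder and a backup folder.
DB_DIR="$(mktemp -d)"
BACKUP_DIR="$(mktemp -d)"
touch "$DB_DIR/bandwidth.db"                  # stand-in for the corrupted db
# docker stop -t 300 storagenodeD1.2          # real step 1 (example name)
mv "$DB_DIR/bandwidth.db" "$BACKUP_DIR/"      # real step 2: move it aside
# docker start storagenodeD1.2                # real step 3: node recreates it
ls "$BACKUP_DIR"
```

Keeping the moved file around (instead of deleting it) leaves the option of recovering usage history from it later.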

I did. I stopped the node and moved the databases to a separate folder, except storage-dir-verification:

I ran the node, and the .db files were recreated:

Then I started the node, but it gave a lazyfilewalker error:

C:\Users\aka>docker logs --tail 50 storagenodeD1.2
2024-03-10T20:57:38Z    ERROR   pieces  failed to lazywalk space used by satellite      {"process": "storagenode", "error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:71\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:105\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:717\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75", "Satellite ID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo"}
2024-03-10T20:57:38Z    INFO    lazyfilewalker.used-space-filewalker    starting subprocess     {"process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-03-10T20:57:38Z    ERROR   lazyfilewalker.used-space-filewalker    failed to start subprocess      {"process": "storagenode", "satelliteID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "error": "context canceled"}
2024-03-10T20:57:38Z    ERROR   pieces  failed to lazywalk space used by satellite      {"process": "storagenode", "error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:71\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:105\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:717\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2024-03-10T20:57:38Z    INFO    lazyfilewalker.used-space-filewalker    starting subprocess     {"process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-03-10T20:57:38Z    ERROR   lazyfilewalker.used-space-filewalker    failed to start subprocess      {"process": "storagenode", "satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "error": "context canceled"}
2024-03-10T20:57:38Z    ERROR   pieces  failed to lazywalk space used by satellite      {"process": "storagenode", "error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:71\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:105\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:717\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-03-10T20:57:38Z    INFO    lazyfilewalker.used-space-filewalker    starting subprocess     {"process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-03-10T20:57:38Z    ERROR   lazyfilewalker.used-space-filewalker    failed to start subprocess      {"process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "error": "context canceled"}
2024-03-10T20:57:38Z    ERROR   pieces  failed to lazywalk space used by satellite      {"process": "storagenode", "error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:71\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:105\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:717\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-03-10T20:57:38Z    INFO    lazyfilewalker.used-space-filewalker    starting subprocess     {"process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-03-10T20:57:38Z    ERROR   lazyfilewalker.used-space-filewalker    failed to start subprocess      {"process": "storagenode", "satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "error": "context canceled"}
2024-03-10T20:57:38Z    ERROR   pieces  failed to lazywalk space used by satellite      {"process": "storagenode", "error": "lazyfilewalker: context canceled", "errorVerbose": "lazyfilewalker: context canceled\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:71\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:105\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:717\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-03-10T20:57:38Z    ERROR   piecestore:cache        error getting current used space:       {"process": "storagenode", "error": "filewalker: context canceled; filewalker: context canceled; filewalker: context canceled; filewalker: context canceled; filewalker: context canceled; filewalker: context canceled", "errorVerbose": "group:\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:74\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:74\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:74\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75\n--- filewalker: context 
canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:74\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:74\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75\n--- filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:69\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:74\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:726\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:57\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
Error: piecestore monitor: disk space requirement not met
2024-03-10 20:57:38,988 INFO exited: storagenode (exit status 1; not expected)
2024-03-10 20:57:39,989 INFO spawned: 'storagenode' with pid 37
2024-03-10 20:57:39,990 WARN received SIGQUIT indicating exit request
2024-03-10 20:57:39,990 INFO waiting for storagenode, processes-exit-eventlistener, storagenode-updater to die
2024-03-10T20:57:39Z    INFO    Got a signal from the OS: "terminated"  {"Process": "storagenode-updater"}
2024-03-10 20:57:39,991 INFO stopped: storagenode-updater (exit status 0)
2024-03-10 20:57:39,992 INFO stopped: storagenode (terminated by SIGTERM)
2024-03-10 20:57:39,992 INFO stopped: processes-exit-eventlistener (terminated by SIGTERM)
2024-03-10 20:57:41,719 INFO Set uid to user 0 succeeded
2024-03-10 20:57:41,726 INFO RPC interface 'supervisor' initialized
2024-03-10 20:57:41,726 INFO supervisord started with pid 1
2024-03-10 20:57:42,728 INFO spawned: 'processes-exit-eventlistener' with pid 10
2024-03-10 20:57:42,729 INFO spawned: 'storagenode' with pid 11
2024-03-10 20:57:42,730 INFO spawned: 'storagenode-updater' with pid 12
2024-03-10T20:57:42Z    INFO    Configuration loaded    {"Process": "storagenode-updater", "Location": "/app/config/config.yaml"}
2024-03-10T20:57:42Z    INFO    Invalid configuration file key  {"Process": "storagenode-updater", "Key": "storage.allocated-bandwidth"}
2024-03-10T20:57:42Z    INFO    Invalid configuration file key  {"Process": "storagenode-updater", "Key": "healthcheck.enabled"}
2024-03-10T20:57:42Z    INFO    Invalid configuration file key  {"Process": "storagenode-updater", "Key": "server.address"}
2024-03-10T20:57:42Z    INFO    Invalid configuration file key  {"Process": "storagenode-updater", "Key": "operator.wallet"}
2024-03-10T20:57:42Z    INFO    Invalid configuration file key  {"Process": "storagenode-updater", "Key": "operator.email"}
2024-03-10T20:57:42Z    INFO    Invalid configuration file key  {"Process": "storagenode-updater", "Key": "console.address"}
2024-03-10T20:57:42Z    INFO    Invalid configuration file key  {"Process": "storagenode-updater", "Key": "healthcheck.details"}
2024-03-10T20:57:42Z    INFO    Invalid configuration file key  {"Process": "storagenode-updater", "Key": "storage.allocated-disk-space"}
2024-03-10T20:57:42Z    INFO    Invalid configuration file key  {"Process": "storagenode-updater", "Key": "operator.wallet-features"}
2024-03-10T20:57:42Z    INFO    Invalid configuration file key  {"Process": "storagenode-updater", "Key": "contact.external-address"}
2024-03-10T20:57:42Z    INFO    Invalid configuration file key  {"Process": "storagenode-updater", "Key": "server.private-address"}
2024-03-10T20:57:42Z    INFO    Invalid configuration file value for key        {"Process": "storagenode-updater", "Key": "log.level"}
2024-03-10T20:57:42Z    INFO    Configuration loaded    {"process": "storagenode", "Location": "/app/config/config.yaml"}
2024-03-10T20:57:42Z    INFO    Anonymized tracing enabled      {"Process": "storagenode-updater"}
2024-03-10T20:57:42Z    INFO    Anonymized tracing enabled      {"process": "storagenode"}
2024-03-10T20:57:42Z    INFO    Running on version      {"Process": "storagenode-updater", "Service": "storagenode-updater", "Version": "v1.95.1"}
2024-03-10T20:57:42Z    INFO    Operator email  {"process": "storagenode", "Address": "7437493@gmail.com"}
2024-03-10T20:57:42Z    INFO    Operator wallet {"process": "storagenode", "Address": "0xE158e01cDb77F9f220d8359335Fe0b75799829AA"}
2024-03-10T20:57:42Z    INFO    Downloading versions.   {"Process": "storagenode-updater", "Server Address": "https://version.storj.io"}
2024-03-10T20:57:43Z    INFO    server  kernel support for tcp fast open unknown        {"process": "storagenode"}

This is a 10 TB node, which is almost full.

I am going to be very careful with any manipulation; I hope to recover it. The HDD is quite new and has no bad sectors.

I ran sfc /scannow - no errors found.

And dism /Online /Cleanup-Image /CheckHealth - something was fixed.

There are no errors on the drive:

The drive was not optimized, so I started a defragmentation:

It is ongoing… I will update this message when it is done.

They are a consequence of this error:

and this exit signal:

You need to stop the node and move the databases back (except bandwidth.db), then start the node.
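For reference, the move step can be sketched like this. This is only an illustration, not an official procedure: the `restore_databases` helper and both paths are my own placeholders; substitute wherever you actually parked the databases and your node's real storage folder, and stop/start the container around it.

```python
from pathlib import Path
import shutil

def restore_databases(backup_dir: Path, storage_dir: Path) -> list[str]:
    """Move every *.db file from backup_dir back into storage_dir,
    leaving bandwidth.db behind, per the advice above."""
    moved = []
    for db in sorted(backup_dir.glob("*.db")):
        if db.name == "bandwidth.db":
            continue  # bandwidth.db stays out, per the advice above
        shutil.move(str(db), str(storage_dir / db.name))
        moved.append(db.name)
    return moved

# Hypothetical usage (paths are placeholders, container name from this thread):
#   docker stop -t 300 storagenodeD1.12
#   restore_databases(Path("D:/db-backup"), Path("D:/storagenode/storage"))
#   docker start storagenodeD1.12
```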

Thanks. Now it works. I have 3 more questions:

  1. What is the best defragmentation frequency? I set it up to run once a week for this 10 TB node. Is that fine? Shall I set up the same for HDDs of all sizes?

  2. In the picture below, the node shows 8.04 TB used, but in the drive properties it is 8.74. There is nothing else on this HDD except Вещко. What could be the problem?

  3. Some of my full nodes never get completely full; about 5% is always free. Some other nodes are really full. Is that normal, or do I need to change a setting?

Regards,
Alexander

The default one, which Windows configures automatically.

  1. Different measure units: the node uses SI units (base 10), while Windows uses binary units (base 2) but labels them wrong (it should say 8.74 TiB, not TB), so you actually have 9.61 TB used in SI units.
  2. Likely the filewalkers have had errors, or more than just this one database is corrupted;
    see Disk usage discrepancy?
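The unit arithmetic in point 1 can be checked directly; a quick sketch using the numbers from the screenshot:

```python
TIB = 1024 ** 4  # binary tebibyte: what Windows actually counts
TB = 1000 ** 4   # SI terabyte: what the node reports

windows_shown = 8.74              # Windows labels this "TB", but it is really TiB
bytes_on_disk = windows_shown * TIB
si_tb = bytes_on_disk / TB        # the same bytes expressed in SI units
print(round(si_tb, 2))            # → 9.61
```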

I will check. What about question 3?