It looks like the machine hosting my nodes had a power failure recently. 4/5 nodes came up fine. The last one fails to start: it logs that hashstore started up successfully for two satellites, then logs this and aborts:
failure during run {"Process": "storagenode", "error": "Failed to create storage node peer: hashstore: logSlots calculation mismatch: size=34603008 logSlots=19\n\tstorj.io/storj/storagenode/hashstore.OpenHashtbl:116\n\tstorj.io/storj/storagenode/hashstore.OpenTable:121\n\tstorj.io/storj/storagenode/hashstore.NewStore:258\n\tstorj.io/storj/storagenode/hashstore.New:93\n\tstorj.io/storj/storagenode/piecestore.(*HashStoreBackend).getDB:248\n\tstorj.io/storj/storagenode/piecestore.NewHashStoreBackend:114\n\tstorj.io/storj/storagenode.New:598\n\tmain.cmdRun:84\n\tmain.newRunCmd.func1:33\n\tstorj.io/common/process.cleanup.func1.4:392\n\tstorj.io/common/process.cleanup.func1:410\n\tgithub.com/spf13/cobra.(*Command).execute:985\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:1117\n\tgithub.com/spf13/cobra.(*Command).Execute:1041\n\tstorj.io/common/process.ExecWithCustomOptions:112\n\tmain.main:34\n\truntime.main:272", "errorVerbose": "Failed to create storage node peer: hashstore: logSlots calculation mismatch: size=34603008 logSlots=19\n\tstorj.io/storj/storagenode/hashstore.OpenHashtbl:116\n\tstorj.io/storj/storagenode/hashstore.OpenTable:121\n\tstorj.io/storj/storagenode/hashstore.NewStore:258\n\tstorj.io/storj/storagenode/hashstore.New:93\n\tstorj.io/storj/storagenode/piecestore.(*HashStoreBackend).getDB:248\n\tstorj.io/storj/storagenode/piecestore.NewHashStoreBackend:114\n\tstorj.io/storj/storagenode.New:598\n\tmain.cmdRun:84\n\tmain.newRunCmd.func1:33\n\tstorj.io/common/process.cleanup.func1.4:392\n\tstorj.io/common/process.cleanup.func1:410\n\tgithub.com/spf13/cobra.(*Command).execute:985\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:1117\n\tgithub.com/spf13/cobra.(*Command).Execute:1041\n\tstorj.io/common/process.ExecWithCustomOptions:112\n\tmain.main:34\n\truntime.main:272\n\tmain.cmdRun:86\n\tmain.newRunCmd.func1:33\n\tstorj.io/common/process.cleanup.func1.4:392\n\tstorj.io/common/process.cleanup.func1:410\n\tgithub.com/spf13/cobra.(*Command).execute:985\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:1117\n\tgithub.com/spf13/cobra.(*Command).Execute:1041\n\tstorj.io/common/process.ExecWithCustomOptions:112\n\tmain.main:34\n\truntime.main:272"}
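As a rough sanity check on the two numbers in that error, the reported size works out to exactly 33 MiB, which is not a power of two. The sketch below also compares it against 2^19 slots of 64 bytes each; the 64-byte record size is only an assumption made for illustration and is not confirmed anywhere in this thread, so treat the "expected" figure as a guess.

```shell
# Values copied from the error message.
size=34603008
log_slots=19

# ASSUMPTION (not confirmed in this thread): each hashtable slot is 64 bytes.
record_size=64

mib=1048576
size_mib=$(( size / mib ))
remainder=$(( size % mib ))
expected=$(( (1 << log_slots) * record_size ))

echo "reported size: ${size_mib} MiB, remainder ${remainder} bytes"
echo "slot area under the 64-byte assumption: $(( expected / mib )) MiB"
echo "difference: $(( size - expected )) bytes"
```

If the assumption holds, the file is a full 1 MiB larger than the slot count implies, which would fit a file that was being resized or rewritten when the power went out.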
I use ZFS as the filesystem under all of my nodes, but ironically this is the only one with redundancy. On the others I use ZFS just for consistency in system config, with a single drive each. From ZFS's perspective, there are no data errors.
Since the problem seems to affect only one satellite's data rather than all of the node's data, yet the error makes the node refuse to start at all, is there any suggestion for how to bring the node up to serve the satellites whose data isn't corrupted?
Put that one satellite on the exclusion list in the config and restart. Wait for an official response though: I don't know whether this disables that satellite permanently or temporarily.
For example, to disable the Saltlake satellite, put this in the config, or edit the existing line:
# list of trust exclusions
storage2.trust.exclusions: "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE@saltlake.tardigrade.io:7777"
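Since the choice between adding the line and editing an existing one depends on what is already in the file, a small check like the following may help. It is only a sketch: the `has_exclusions` helper and the default `./config.yaml` path are made up for illustration, so point `CONFIG` at wherever your node's config actually lives.

```shell
# Hypothetical helper: succeeds if the file has an active (uncommented)
# storage2.trust.exclusions line.
has_exclusions() {
  grep -q '^storage2\.trust\.exclusions:' "$1" 2>/dev/null
}

# ASSUMPTION: the config path is a placeholder; adjust it for your setup.
CONFIG="${CONFIG:-./config.yaml}"

if has_exclusions "$CONFIG"; then
  echo "exclusions line already present in $CONFIG, edit it in place:"
  grep -n '^storage2\.trust\.exclusions:' "$CONFIG"
else
  echo "no active exclusions line found in $CONFIG, add one"
fi
```

A commented-out line (starting with `#`) does not count as active, so in that case the check reports that a line needs to be added.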
However, a ZFS system with CoW (copy-on-write) should lower the risk quite significantly, and perhaps that is why only one node, with one satellite, is experiencing this issue. Imagine what the situation for @bryanpendleton would be if these were EXT4 partitions with Hashstore.
You can also use Docker to build a local image with the binary, so that you don't have to install all the developer tools, as described there (you need to replace the command, of course, to build this tool rather than the benchmarks):
Please note that the tool places the generated hashtables in the current directory, so you need to either move them to the proper folder afterwards or run the command from the proper directory, i.e.
cd /mnt/storj/storagenode/storage/hashstore/12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs/s0
mv meta meta.bak
mkdir meta
cd meta
~/bin/write-hashtbl /mnt/storj/storagenode/storage/hashstore/12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs/s0
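The same steps could be wrapped with a few guard checks so that an existing backup is never overwritten. This is a sketch only: `rebuild_meta` is a hypothetical helper name, it assumes both arguments are absolute paths, and it assumes the node is stopped while it runs. The tool invocation itself simply mirrors the command above.

```shell
# Hypothetical wrapper around the manual steps above.
# $1: absolute path to the store directory (the one containing meta/)
# $2: absolute path to the write-hashtbl binary
rebuild_meta() {
  store_dir="$1"
  tool="$2"

  if [ ! -d "$store_dir/meta" ]; then
    echo "no meta directory in $store_dir" >&2
    return 1
  fi
  if [ -e "$store_dir/meta.bak" ]; then
    echo "meta.bak already exists, refusing to overwrite it" >&2
    return 1
  fi
  if [ ! -x "$tool" ]; then
    echo "tool not found or not executable: $tool" >&2
    return 1
  fi

  # Run in a subshell so the caller's working directory is unchanged.
  (
    cd "$store_dir" &&
    mv meta meta.bak &&
    mkdir meta &&
    cd meta &&
    "$tool" "$store_dir"
  )
}

# Example call (paths are placeholders):
# rebuild_meta /path/to/hashstore/SATELLITE_ID/s0 "$HOME/bin/write-hashtbl"
```

If the rebuild does not help, the original files are still in `meta.bak` and can be moved back.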