Node fails to start, error hashtbl-000000000000000b

I have a node that I run under Docker that won’t start due to the following error:

-25T13:55:27Z	ERROR	failure during run	{"Process": "storagenode", "error": "Failed to create storage node peer: hashstore: read config/storage/hashstore/121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6/s1/meta/hashtbl-000000000000000b: input/output error\n\tstorj.io/storj/storagenode/hashstore.(*roBigPageCache).ReadRecord:622\n\tstorj.io/storj/storagenode/hashstore.(*HashTbl).Range:293\n\tstorj.io/storj/storagenode/hashstore.OpenHashTbl:175\n\tstorj.io/storj/storagenode/hashstore.OpenTable:117\n\tstorj.io/storj/storagenode/hashstore.NewStore:276\n\tstorj.io/storj/storagenode/hashstore.New:100\n\tstorj.io/storj/storagenode/piecestore.(*HashStoreBackend).getDB:252\n\tstorj.io/storj/storagenode/piecestore.NewHashStoreBackend:117\n\tstorj.io/storj/storagenode.New:604\n\tmain.cmdRun:84\n\tmain.newRunCmd.func1:33\n\tstorj.io/common/process.InitBeforeExecute.func1.2:389\n\tstorj.io/common/process.InitBeforeExecute.func1:407\n\tgithub.com/spf13/cobra.(*Command).execute:985\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:1117\n\tgithub.com/spf13/cobra.(*Command).Execute:1041\n\tstorj.io/common/process.ExecWithCustomOptions:115\n\tmain.main:34\n\truntime.main:283", "errorVerbose": "Failed to create storage node peer: hashstore: read config/storage/hashstore/121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6/s1/meta/hashtbl-000000000000000b: input/output 
error\n\tstorj.io/storj/storagenode/hashstore.(*roBigPageCache).ReadRecord:622\n\tstorj.io/storj/storagenode/hashstore.(*HashTbl).Range:293\n\tstorj.io/storj/storagenode/hashstore.OpenHashTbl:175\n\tstorj.io/storj/storagenode/hashstore.OpenTable:117\n\tstorj.io/storj/storagenode/hashstore.NewStore:276\n\tstorj.io/storj/storagenode/hashstore.New:100\n\tstorj.io/storj/storagenode/piecestore.(*HashStoreBackend).getDB:252\n\tstorj.io/storj/storagenode/piecestore.NewHashStoreBackend:117\n\tstorj.io/storj/storagenode.New:604\n\tmain.cmdRun:84\n\tmain.newRunCmd.func1:33\n\tstorj.io/common/process.InitBeforeExecute.func1.2:389\n\tstorj.io/common/process.InitBeforeExecute.func1:407\n\tgithub.com/spf13/cobra.(*Command).execute:985\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:1117\n\tgithub.com/spf13/cobra.(*Command).Execute:1041\n\tstorj.io/common/process.ExecWithCustomOptions:115\n\tmain.main:34\n\truntime.main:283\n\tmain.cmdRun:86\n\tmain.newRunCmd.func1:33\n\tstorj.io/common/process.InitBeforeExecute.func1.2:389\n\tstorj.io/common/process.InitBeforeExecute.func1:407\n\tgithub.com/spf13/cobra.(*Command).execute:985\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:1117\n\tgithub.com/spf13/cobra.(*Command).Execute:1041\n\tstorj.io/common/process.ExecWithCustomOptions:115\n\tmain.main:34\n\truntime.main:283"}

Error: Failed to create storage node peer: hashstore: read config/storage/hashstore/121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6/s1/meta/hashtbl-000000000000000b: input/output error

I have checked the permissions on that file and they seem correct; everything should run without problems, but something seems to have broken.

If I delete the file, it complains that it doesn’t exist, so I don’t know how to proceed to recover this node.

Can someone help me?

The hashtable file is likely damaged. Use write-hashtbl to create a new one.

It seems this should work, but I'm getting this error:

Counting /hashstore/26/log-0000000000000026-00004fb6…
Counting /hashstore/28/log-0000000000000028-00000000…
Counting /hashstore/2c/log-000000000000002c-00004fb9…
Counting /hashstore/2e/log-000000000000002e-00000000…
Counting /hashstore/2f/log-000000000000002f-00004fba…
Counting /hashstore/31/log-0000000000000031-00000000…
Counting /hashstore/32/log-0000000000000032-00004fb7…
Counting /hashstore/33/log-0000000000000033-00004fbc…
Counting /hashstore/34/log-0000000000000034-00000000…
Counting /hashstore/37/log-0000000000000037-00000000…
Counting /hashstore/38/log-0000000000000038-00004fc3…
Counting /hashstore/3a/log-000000000000003a-00004fbd…
platform: invalid argument
storj.io/storj/storagenode/hashstore/platform.mmap:30
storj.io/storj/storagenode/hashstore/platform.Mmap:16
main.openFile:37
main.(*cmdRoot).iterateRecords:148
main.(*cmdRoot).countRecords:212
main.(*cmdRoot).Execute:79
github.com/zeebo/clingy.(*Environment).dispatchDesc:129
github.com/zeebo/clingy.Environment.Run:41
main.main:29
runtime.main:285

I'm following this guide.

Wait, I tried to run the node again and now it starts fine :smile:

When recreating the hashtable, you first have to scan for zero-byte log files and delete them, then start the script. If it fails at a certain point, look at which file it happened on: you can back that file up, delete it from the corresponding folder, and then start the script again. The hashtable seems to perform faster, but I hope fixing hashtables won't become the status quo over time. It happened to me once or twice: the hashtable got corrupted and I had to fix it.
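A quick way to do that zero-byte scan is with `find`. This is a minimal sketch: the demo directory and sample files below are stand-ins, and on a real node you would point `HASHSTORE` at your actual `.../hashstore/<satellite>/s0` (or `s1`) directory instead.

```shell
# Minimal sketch: list zero-byte log files before rebuilding the hashtable.
# HASHSTORE defaults to a throwaway demo directory here; on a real node,
# set it to the store directory you are rebuilding.
HASHSTORE="${HASHSTORE:-$(mktemp -d)}"

# Demo data (assumption, for illustration): one empty log and one with content.
mkdir -p "$HASHSTORE/28" "$HASHSTORE/2e"
: > "$HASHSTORE/28/log-0000000000000028-00000000"             # zero bytes
printf 'data' > "$HASHSTORE/2e/log-000000000000002e-00004fc3" # non-empty

# List zero-byte log files -- review them, back them up, then delete.
find "$HASHSTORE" -type f -name 'log-*' -size 0 -print
```

`-size 0` matches only empty files, so non-empty logs are left alone; review the list before deleting anything.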

It may not have rebuilt all hashtables, so the node could start failing audits with "file not found" for pieces in logs that weren't scanned. I would recommend finishing the rebuild.

So, is it safe to delete files like this one?

/hashstore/28/log-0000000000000028-00000000

If its size is 0, then yes (back it up, just in case).

Now the s1 folder seems to work fine:

s1# docker run -it --rm -v ${PWD}:/hashstore -v ${PWD}/meta:/meta storj-write-hashtbl write-hashtbl /hashstore
Counting /hashstore/0f/log-000000000000000f-00000000...
Counting /hashstore/12/log-0000000000000012-00000000...
Counting /hashstore/19/log-0000000000000019-00000000...
Counting /hashstore/1b/log-000000000000001b-00000000...
Counting /hashstore/1f/log-000000000000001f-00000000...
Counting /hashstore/23/log-0000000000000023-00000000...
Counting /hashstore/25/log-0000000000000025-00000000...
Counting /hashstore/29/log-0000000000000029-00000000...
Counting /hashstore/2d/log-000000000000002d-00000000...
Counting /hashstore/2e/log-000000000000002e-00004fc3...
Counting /hashstore/2f/log-000000000000002f-00004fd3...
Counting /hashstore/30/log-0000000000000030-00004fc4...
Counting /hashstore/31/log-0000000000000031-00004fc7...
Counting /hashstore/32/log-0000000000000032-00000000...
Counting /hashstore/33/log-0000000000000033-00004fc9...
Record count=23004
Using logSlots=16
Processing /hashstore/0f/log-000000000000000f-00000000...
Processing /hashstore/12/log-0000000000000012-00000000...
Processing /hashstore/19/log-0000000000000019-00000000...
Processing /hashstore/1b/log-000000000000001b-00000000...
Processing /hashstore/1f/log-000000000000001f-00000000...
Processing /hashstore/23/log-0000000000000023-00000000...
Processing /hashstore/25/log-0000000000000025-00000000...
Processing /hashstore/29/log-0000000000000029-00000000...
Processing /hashstore/2d/log-000000000000002d-00000000...
Processing /hashstore/2e/log-000000000000002e-00004fc3...
Processing /hashstore/2f/log-000000000000002f-00004fd3...
Processing /hashstore/30/log-0000000000000030-00004fc4...
Processing /hashstore/31/log-0000000000000031-00004fc7...
Processing /hashstore/32/log-0000000000000032-00000000...
Processing /hashstore/33/log-0000000000000033-00004fc9...

but for s0 it doesn't:

docker run -it --rm -v ${PWD}:/hashstore -v ${PWD}/meta:/meta storj-write-hashtbl write-hashtbl /hashstore
Counting /hashstore/28/log-0000000000000028-00000000...
Counting /hashstore/34/log-0000000000000034-00000000...
Counting /hashstore/37/log-0000000000000037-00000000...
Counting /hashstore/38/log-0000000000000038-00004fc3...
Counting /hashstore/3a/log-000000000000003a-00004fbd...
platform: invalid argument
        storj.io/storj/storagenode/hashstore/platform.mmap:30
        storj.io/storj/storagenode/hashstore/platform.Mmap:16
        main.openFile:37
        main.(*cmdRoot).iterateRecords:148
        main.(*cmdRoot).countRecords:212
        main.(*cmdRoot).Execute:79
        github.com/zeebo/clingy.(*Environment).dispatchDesc:129
        github.com/zeebo/clingy.Environment.Run:41
        main.main:29
        runtime.main:285

Even if it's not 0 bytes, you can delete it (still back it up), since it's corrupted and won't benefit you in any case. This small data loss won't be an issue for the audit score. I deleted one or two non-zero-byte files, and my audit score did not drop.

Delete the file in the corresponding folder:

Counting /hashstore/3a/log-000000000000003a-00004fbd...

Even if it's not 0 bytes, it seems to be corrupted.
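In other words, take the file from the last "Counting ..." line before the error, back it up, and remove it from the store so write-hashtbl can run past it. A sketch with demo paths (assumptions: `BAD` and `BACKUP` default to throwaway locations here; on a real node set `BAD` to the actual failing file and `BACKUP` to somewhere outside the store):

```shell
# Sketch: back up the corrupted log file, removing it from the store
# so the write-hashtbl rebuild can be started again.
BAD="${BAD:-$(mktemp -d)/3a/log-000000000000003a-00004fbd}"  # demo path
BACKUP="${BACKUP:-$(mktemp -d)}"                             # demo path

mkdir -p "$(dirname "$BAD")"
printf 'corrupt' > "$BAD"   # demo stand-in for the corrupted log

mkdir -p "$BACKUP"
mv "$BAD" "$BACKUP/"        # moving it out of the store = backup + delete in one step
```

Moving the file instead of deleting it keeps a copy you can restore if the rebuild goes wrong.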