I just looked it up.
All products must (!) come with a two-year warranty, and that warranty gets reset should the product go through an RMA, and it will continue to reset each time the product goes through the RMA process.
Is info like this maybe out of date?
Germany. According to the StorRepair GmbH Service Portal, replacement serial numbers are still under warranty as expected. But it doesn't show when the warranty ends.
@alpharabbit
You can try the hdparm utility in Linux for the firmware upgrade. I can provide you with the command, but in 2 hours…
It works on Seagate Exos drives. I don't know about Toshiba.
On some forums they suggest using openSeaChest, which is made by Seagate. They say it works.
https://github.com/Seagate/openSeaChest
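If you want to try it, the relevant utility in that repo is openSeaChest_Firmware. A rough sketch (the drive handle and firmware filename are placeholders; check the --scan output and --help for your version):

# list attached drives and their handles first
sudo openSeaChest_Firmware --scan
# then push the firmware file to the chosen drive (handle and file are examples)
sudo openSeaChest_Firmware -d /dev/sg2 --downloadFW ./exos-firmware.bin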
I didn't manage to update with it; it used to work, but newer versions have some security requirements that interfere with the update. Maybe some Linux wizard can make it work, but for me, hdparm works pretty well for Exos.
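For reference, the hdparm route is roughly the sketch below. hdparm hides firmware download behind two explicit safety flags because a wrong file can brick the drive; the firmware filename and device here are placeholders, so double-check both against your exact model before running anything:

# EXTREMELY DANGEROUS: only proceed with a firmware file for this exact drive model
sudo hdparm --fwdownload ./exos-firmware.lod \
     --yes-i-know-what-i-am-doing --please-destroy-my-drive /dev/sdX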
I filtered out the 0-byte log files (see the sketch after the trace below), but in one case this didn't do the trick, and it still throws this error:
Counting 01/log-0000000000000001-00000000...
Counting 18/log-0000000000000018-00000000...
Counting 19/log-0000000000000019-00004f9e...
Counting 1d/log-000000000000001d-00000000...
Counting 20/log-0000000000000020-00000000...
Counting 21/log-0000000000000021-00000000...
Counting 23/log-0000000000000023-00000000...
Counting 24/log-0000000000000024-00000000...
Counting 26/log-0000000000000026-00004fa2...
Counting 27/log-0000000000000027-00004fa5...
Counting 29/log-0000000000000029-00000000...
Counting 2a/log-000000000000002a-00004fa6...
Counting 2b/log-000000000000002b-00000000...
Counting 2c/log-000000000000002c-00004f9d...
Counting 2d/log-000000000000002d-00000000...
Counting 2e/log-000000000000002e-00004fa7...
Counting 2f/log-000000000000002f-00000000...
Counting 31/log-0000000000000031-00000000...
Counting 32/log-0000000000000032-00000000...
Counting 33/log-0000000000000033-00004fa8...
Counting 34/log-0000000000000034-00004fab...
Counting 35/log-0000000000000035-00000000...
Counting 36/log-0000000000000036-00004fad...
Counting 37/log-0000000000000037-00000000...
Counting 38/log-0000000000000038-00004fae...
Counting 39/log-0000000000000039-00000000...
unexpected fault address 0x7f2dcdf33200
fatal error: fault
[signal SIGBUS: bus error code=0x2 addr=0x7f2dcdf33200 pc=0x56b8f2]
goroutine 1 gp=0xc000002380 m=0 mp=0x913420 [running]:
runtime.throw({0x6cdb73?, 0x69e9420f69e9420f?})
/usr/local/go/src/runtime/panic.go:1101 +0x48 fp=0xc00018d5f8 sp=0xc00018d5c8 pc=0x46dc68
runtime.sigpanic()
/usr/local/go/src/runtime/signal_unix.go:922 +0x10a fp=0xc00018d658 sp=0xc00018d5f8 pc=0x46f3ca
storj.io/storj/storagenode/hashstore.(*Record).ReadFrom(0x983ddc7d11b1fc04?, 0x361b104f2171e01a?)
/root/storj/storagenode/hashstore/record.go:151 +0x12 fp=0xc00018d698 sp=0xc00018d658 pc=0x56b8f2
main.(*file).Record(...)
/root/storj/cmd/write-hashtbl/file_linux.go:59
main.(*cmdRoot).iterateRecords(0xc0001344c0, {0x732f60, 0xc00012fdd0}, {0xc000148980, 0x20}, 0x1, 0xc00018d928)
/root/storj/cmd/write-hashtbl/main.go:157 +0x1bf fp=0xc00018d8e8 sp=0xc00018d698 pc=0x5f32bf
main.(*cmdRoot).countRecords(0x730d40?, {0x732f60?, 0xc00012fdd0?}, {0xc000148980?, 0xc00018db88?})
/root/storj/cmd/write-hashtbl/main.go:212 +0x4c fp=0xc00018d948 sp=0xc00018d8e8 pc=0x5f3e8c
main.(*cmdRoot).Execute(0xc0001344c0, {0x732f60, 0xc00012fdd0})
/root/storj/cmd/write-hashtbl/main.go:79 +0x29b fp=0xc00018dbd8 sp=0xc00018d948 pc=0x5f265b
github.com/zeebo/clingy.(*Environment).dispatchDesc(0xc000120de0, {0x732ef0, 0x9329e0}, 0xc000136180, {{0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}, ...})
/root/go/pkg/mod/github.com/zeebo/clingy@v0.0.0-20230602044025-906be850f10d/run.go:129 +0x6dd fp=0xc00018dce0 sp=0xc00018dbd8 pc=0x57381d
github.com/zeebo/clingy.Environment.Run({{0x731900, 0xc0001344c0}, {0x6cfc9e, 0xd}, {0xc00012c050, 0x1, 0x1}, 0x0, 0x0, 0x6e55f8, ...}, ...)
/root/go/pkg/mod/github.com/zeebo/clingy@v0.0.0-20230602044025-906be850f10d/run.go:41 +0x198 fp=0xc00018dde0 sp=0xc00018dce0 pc=0x572e18
main.main()
/root/storj/cmd/write-hashtbl/main.go:29 +0x148 fp=0xc00018df50 sp=0xc00018dde0 pc=0x5f1c08
runtime.main()
/usr/local/go/src/runtime/proc.go:283 +0x28b fp=0xc00018dfe0 sp=0xc00018df50 pc=0x43c42b
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00018dfe8 sp=0xc00018dfe0 pc=0x474a21
goroutine 2 gp=0xc0000028c0 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00005efa8 sp=0xc00005ef88 pc=0x46dd8e
runtime.goparkunlock(...)
/usr/local/go/src/runtime/proc.go:441
runtime.forcegchelper()
/usr/local/go/src/runtime/proc.go:348 +0xb3 fp=0xc00005efe0 sp=0xc00005efa8 pc=0x43c773
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00005efe8 sp=0xc00005efe0 pc=0x474a21
created by runtime.init.7 in goroutine 1
/usr/local/go/src/runtime/proc.go:336 +0x1a
goroutine 3 gp=0xc000002e00 m=nil [GC sweep wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00005f780 sp=0xc00005f760 pc=0x46dd8e
runtime.goparkunlock(...)
/usr/local/go/src/runtime/proc.go:441
runtime.bgsweep(0xc00007e000)
/usr/local/go/src/runtime/mgcsweep.go:276 +0x94 fp=0xc00005f7c8 sp=0xc00005f780 pc=0x427494
runtime.gcenable.gowrap1()
/usr/local/go/src/runtime/mgc.go:204 +0x25 fp=0xc00005f7e0 sp=0xc00005f7c8 pc=0x41b965
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00005f7e8 sp=0xc00005f7e0 pc=0x474a21
created by runtime.gcenable in goroutine 1
/usr/local/go/src/runtime/mgc.go:204 +0x66
goroutine 4 gp=0xc000002fc0 m=nil [GC scavenge wait]:
runtime.gopark(0xc00007e000?, 0x72e3e0?, 0x1?, 0x0?, 0xc000002fc0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00005ff78 sp=0xc00005ff58 pc=0x46dd8e
runtime.goparkunlock(...)
/usr/local/go/src/runtime/proc.go:441
runtime.(*scavengerState).park(0x912200)
/usr/local/go/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc00005ffa8 sp=0xc00005ff78 pc=0x424f49
runtime.bgscavenge(0xc00007e000)
/usr/local/go/src/runtime/mgcscavenge.go:653 +0x3c fp=0xc00005ffc8 sp=0xc00005ffa8 pc=0x4254bc
runtime.gcenable.gowrap2()
/usr/local/go/src/runtime/mgc.go:205 +0x25 fp=0xc00005ffe0 sp=0xc00005ffc8 pc=0x41b905
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00005ffe8 sp=0xc00005ffe0 pc=0x474a21
created by runtime.gcenable in goroutine 1
/usr/local/go/src/runtime/mgc.go:205 +0xa5
goroutine 17 gp=0xc000102380 m=nil [finalizer wait]:
runtime.gopark(0x933ce0?, 0x490013?, 0x78?, 0xe6?, 0x413c3e?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00005e630 sp=0xc00005e610 pc=0x46dd8e
runtime.runfinq()
/usr/local/go/src/runtime/mfinal.go:196 +0x107 fp=0xc00005e7e0 sp=0xc00005e630 pc=0x41a927
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00005e7e8 sp=0xc00005e7e0 pc=0x474a21
created by runtime.createfing in goroutine 1
/usr/local/go/src/runtime/mfinal.go:166 +0x3d
goroutine 18 gp=0xc000102540 m=nil [chan receive]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00005a718 sp=0xc00005a6f8 pc=0x46dd8e
runtime.chanrecv(0xc000116150, 0x0, 0x1)
/usr/local/go/src/runtime/chan.go:664 +0x445 fp=0xc00005a790 sp=0xc00005a718 pc=0x40d365
runtime.chanrecv1(0x0?, 0x0?)
/usr/local/go/src/runtime/chan.go:506 +0x12 fp=0xc00005a7b8 sp=0xc00005a790 pc=0x40cf12
runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
/usr/local/go/src/runtime/mgc.go:1797
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
/usr/local/go/src/runtime/mgc.go:1800 +0x2f fp=0xc00005a7e0 sp=0xc00005a7b8 pc=0x41eaaf
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00005a7e8 sp=0xc00005a7e0 pc=0x474a21
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
/usr/local/go/src/runtime/mgc.go:1795 +0x79
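The filtering mentioned above was along these lines (a sketch assuming GNU find; the hashstore path is an example):

# list empty hashstore log files under the storage directory
find /mnt/storagenode/storage/hashstore -name 'log-*' -size 0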
How can I fix this?
Likely the only fix is to delete the log file, whose contents cannot be recovered.
What do you mean? Which log file? Should I delete all log files that are smaller than 1 GB and live with it? I have to fix them ASAP, because my score is already at 70%.
Delete the offending log file, in this case 39/log-0000000000000039-00000000. Maybe back it up just in case, but chances are this log file is corrupt in some way that prevents the regeneration process from completing. You'd lose <= 1 GB of data, but for a sufficiently large node that shouldn't affect audits too much.
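Roughly, with the node stopped first (the path below is illustrative; the exact directory layout under your hashstore depends on your setup, but the 39/log-... part matches the trace above):

# stop the node, then move the corrupt log file out of the hashstore
mv /path/to/hashstore/39/log-0000000000000039-00000000 /backup/
# rebuild the hash table with the write-hashtbl tool, then restart the node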
Which score? If the audit score is below 96%, it means that the node is already disqualified on that satellite. My guess is that you mean the online score; in that case you just need to keep your node online for the next 30 days to fully recover it.
Just hit this issue myself. On ZFS, healthy drives, not a hardware issue. Is there a known fix yet? I tried adding all four satellites to trust exclusion temporarily, as suggested in the thread, but it still fails to start up, so that doesn't seem to help pinpoint which satellite's data went bad.
Also, is there a way to migrate back off of hashstore? Having a node be unable to communicate with any satellite due to a single corrupted file (plus no indication of which file) is a major regression.
Which file is corrupted? What do the logs say? You can delete the faulty log file and create a new hashtable with the Go script.
Do this only as a last resort. With ZFS you can copy the file, losing only the damaged blocks. Write-hashtbl can handle such files and save most of the pieces.
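A sketch of that salvage copy: dd keeps going past unreadable blocks and zero-fills them so file offsets stay aligned (the paths and block size are examples):

# copy the damaged log, skipping unreadable blocks and padding them with zeros
dd if=/path/to/hashstore/39/log-0000000000000039-00000000 \
   of=/backup/log-0000000000000039-salvaged bs=128K conv=noerror,sync

Afterwards, put the salvaged copy back in place of the original and let write-hashtbl rebuild the table from it.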
I don't know which file is corrupted, because the error really doesn't provide anything useful.
It mentions the 121RTS… satellite prior to that, but I already did the write-hashtbl on that one and it didn't seem to fix it.
This kinda stuff scares me. Piecestore lets you just toss out tiny .sj1 files when there are errors: usually so few that audits don't even notice. But errors on hashstore seem to blow large holes into nodes?
Have you tried log level DEBUG?
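If not: one way is the log.level setting in the node's config.yaml (restart the node afterwards). A minimal sketch, assuming the standard storagenode config file:

# in config.yaml:
log.level: debug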
Well, this is the cost of reinventing a filesystem/database that Storj decided is worth paying to improve certain use cases: trading decades of filesystem reliability testing on a vast array of devices for… what was it? faster deletes? with database/filesystem code that has received a year's worth of testing on what, a few thousand storage nodes?
Here we have a case of a "reliable file system that was tested on a vast array of devices" that can't reliably store large files, which is actually a simpler case to deal with from the file system's point of view. It would fail with small files as well.
I got it back up; I just brute-force rebuilt hash tables until it worked.
Still, if there is a fatal error that completely prevents the node from starting, I'd expect the error to contain enough information to actually address the problem…
It should not be a FATAL unrecoverable error in a newer version, as far as I know.