Hello, I have a node with a disk that is clearly dying. It has about 2100 reallocated sectors, and I think that count is still going up.
I recently restarted it after making some changes to the config file, and now it won’t start up. I get this error:
I don’t know how to “fix” this and get it back up and running.
I am considering just starting over completely; it’s a relatively new node. But if any of you know whether this error is fixable, saving the node would be nice.
Search the forum for “write-hashtbl” threads. It’s a Storj repair utility that can work backwards from all the hashstore data files and recreate the contents of that s0\meta directory. It will probably take hours to run, as nodes often have a lot of hashstore data now.
(Edit: I just saw your latest 10x18TB build video: are you expanding for Storj?)
Note that if the disk is dying, trying to repair its content in place may make things even worse. The usual procedure is to migrate first (using a tool specialized for data recovery, like ddrescue), and only then repair.
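A minimal ddrescue sketch of that migrate-first procedure. The device names and mapfile path are placeholders, not anything from this thread; adapt them to your setup:

```shell
# Placeholders: replace with your actual failing and replacement disks.
SRC=/dev/sdX           # dying disk
DST=/dev/sdY           # healthy replacement, equal size or larger
MAP=/root/rescue.map   # mapfile lets you stop and resume safely

# Pass 1: copy everything that reads easily, skipping slow/bad areas (-n).
ddrescue -f -n "$SRC" "$DST" "$MAP"

# Pass 2: go back for the bad areas, retrying each up to 3 times (-r3),
# with direct I/O on the input (-d) to bypass the kernel cache.
ddrescue -f -d -r3 "$SRC" "$DST" "$MAP"
```

Then run fsck/chkdsk on the clone only, never on the dying disk; the mapfile records which regions could not be recovered.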
And if the node is young, it’s a perfect opportunity to treat it as a lab case for learning, so that if it happens to more important data, you’ll be better prepared.
I don’t have logs/details to back this up: so treat this as a story…
Piecestore: when I had disk-health issues, most of the time you can image/ddrescue the disk to recover all that you can, then fsck/scrub the fresh filesystem. You lose a few .sj1 files, but your audit score barely changes and you go on with your life. Easy.
Hashstore: twice now, disk-health issues came with meta/hashtbl read errors, so on the recovered filesystem I still had to use write-hashtbl. And although that utility ran successfully, in both cases the node was disqualified within 24h of restarting (even though both started with 100% audit scores and over 90% online scores). I don’t know what the satellite is asking for, so often and so quickly, that it can kill a node in hours.
I just had a hashstore-recovered node get DQ’d ten minutes ago, and I only started it up this morning.
What does this mean? I don’t know. Maybe in the hashstore cases the HDD damage really was so severe that large swaths of data were lost, but recovery only complained about perhaps a dozen files. It feels like small disk errors can blow larger holes in hashstore data, or the audit process is somehow more sensitive in detecting them.
Healthy hashstore packs metadata closer together, so a single faulty sector impacts more pieces. Unhealthy hashstore needs to recover from log files, which are inherently sequential: it’s difficult to recover everything that lies past a single error. And write-hashtbl is not magic: it cannot report on pieces it is unable to detect past that kind of problem. These two design decisions make recovery harder in the presence of bad hardware, so yes, your intuition is correct: a small disk error can impact many more pieces.
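To make that blast radius concrete, here is a toy back-of-the-envelope calculation. All numbers are assumptions for illustration, not Storj’s actual on-disk format:

```go
package main

import "fmt"

// Toy blast-radius estimate. Assumed sizes, not the real hashstore layout:
// a 4 KiB bad sector and 64-byte packed hashtbl records, one per piece.
func piecesLostPerBadSector(sectorBytes, recordBytes int) int {
	return sectorBytes / recordBytes
}

func main() {
	// Piecestore: one .sj1 file per piece, so a bad sector in file data
	// typically costs at most one piece.
	fmt.Println("piecestore pieces at risk per bad sector:", 1)
	// Hashstore: metadata records are packed together, so a single bad
	// sector in the hash table can make dozens of pieces unreachable.
	fmt.Println("hashstore pieces at risk per bad sector:",
		piecesLostPerBadSector(4096, 64))
}
```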
That makes me fear the day Storj eventually forces hashstore migration of old data. Today I can recover from 95% of errors… hashstore may reduce that to 50%.
I’d rather lose some capacity to parity than risk losing 1-3 years of used-space.
As far as I understand the code, it scans logfiles from the end and checks every byte boundary for a valid header, so it can handle logfiles with holes. I have used it successfully several times with such files.
Again, this assumes the log file itself is intact, which requires the file metadata to be correct. A single bad write to file metadata means losing a single piece in piecestore, but potentially many of them in hashstore.
I am using ZFS with no parity, so any bad sector results in the loss of one ZFS block. All pieces covered by that block are lost, no matter whether you use piecestore or hashstore. Everything else can be restored in both cases; no difference, in my opinion.
I can only guess, but from the Windows paths in the original post I assume @HGPlays uses NTFS. I found this cute case of NTFS not recovering correctly after a crash, overwriting some random files.
Yeah, but those are easier to recover in NTFS from the MFT.
It all boils down to which characteristics you trust your file system to maintain reliably in the presence of failures. Obviously this differs from implementation to implementation, and by design hashstore has different reliability needs than piecestore. Maybe, just maybe, ZFS matches the expected reliability profile better than NTFS…
ZFS keeps redundant copies of metadata, so a single lost sector may lose nothing, or just one piece. If you lose a whole file, that may be hundreds of pieces with hashstore but only one piece with piecestore. Losing a directory makes my head hurt to think about, so let’s just say it’s bad with either.
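For ZFS specifically, you can trade a little capacity for extra protection even on a single-disk pool. These are standard ZFS properties; the pool/dataset names are placeholders:

```shell
# Dataset name is a placeholder. Store two copies of every data block,
# so one bad sector no longer guarantees data loss. Applies only to data
# written after the change, and roughly doubles space used by new writes:
zfs set copies=2 tank/storj

# Metadata redundancy is controlled separately; the default "all"
# duplicates all metadata, which is what helps avoid losing whole
# directories to a single bad sector:
zfs get redundant_metadata tank/storj

# Scrub periodically so bad sectors are found while they are still few;
# "status -v" lists any files with permanent (unrecoverable) errors:
zpool scrub tank
zpool status -v tank
```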