[Tech Preview] Hashstore backend for storage nodes

Hi,

After migrating my node to hashstore, it still has several blobs left. Shouldn’t they all have been migrated? Or are they remnants of old, removed satellites?

find blobs/* -type f  | wc -l
90322

ls blobs
6r2fgwqz3manwt4aogq343bfkh2n5vvg4ohqqgggrrunaaaaaaaa  arej6usf33ki2kukzd5v6xgry2tdr56g45pp3aao6llsaaaaaaaa

This is also after running the cleanup:

find blobs/* -type f -empty -delete
find blobs/* -type d -empty -delete

This happens on 4 nodes, and all of them have these 2 folders. Is it safe to delete these folders?

thanks

I am running just a single node on a ZFS pool where a bunch of other things are happening, such as torrents, backups, etc.

But it looks like the compaction finished this morning, as I see the last compaction message at 9:43 AM. Looking at the times, they were really diverse, from a few ms to 40 hours (with a bunch of others taking 26h, 18h, etc.), and the whole thing took about 8 days to finish.

And I am still getting these migration messages all the time in the logs. Are these normal?

2025-03-22T21:04:32+01:00       INFO    piecemigrate:chore      enqueued for migration  {"Process": "storagenode", "sat": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2025-03-22T21:04:32+01:00       INFO    piecemigrate:chore      enqueued for migration  {"Process": "storagenode", "sat": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2025-03-22T21:04:32+01:00       INFO    piecemigrate:chore      all enqueued for migration; will sleep before next pooling      {"Process": "storagenode", "active": {"12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S": true, "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs": true, "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE": true, "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6": true}, "interval": "10m0s"}
2025-03-22T21:14:30+01:00       INFO    piecemigrate:chore      couldn't migrate        {"Process": "storagenode", "error": "opening the old reader: pieces error: invalid piece file for storage format version 1: too small for header (0 < 512)", "errorVerbose": "opening the old reader: pieces error: invalid piece file for storage format version 1: too small for header (0 < 512)\n\tstorj.io/storj/storagenode/piecemigrate.(*Chore).migrateOne:318\n\tstorj.io/storj/storagenode/piecemigrate.(*Chore).processQueue:260\n\tstorj.io/storj/storagenode/piecemigrate.(*Chore).Run.func2:167\n\tstorj.io/common/errs2.(*Group).Go.func1:23", "sat": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "id": "WEOQ6T5BYKKB4AZDK23NQNTUURDOP7GHY64NLDVO37C2A4PL7RMQ"}
2025-03-22T21:14:31+01:00       INFO    piecemigrate:chore      couldn't migrate        {"Process": "storagenode", "error": "opening the old reader: pieces error: invalid piece file for storage format version 1: too small for header (0 < 512)", "errorVerbose": "opening the old reader: pieces error: invalid piece file for storage format version 1: too small for header (0 < 512)\n\tstorj.io/storj/storagenode/piecemigrate.(*Chore).migrateOne:318\n\tstorj.io/storj/storagenode/piecemigrate.(*Chore).processQueue:260\n\tstorj.io/storj/storagenode/piecemigrate.(*Chore).Run.func2:167\n\tstorj.io/common/errs2.(*Group).Go.func1:23", "sat": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "id": "7FY52IOAKIO4WGT56OPC33TQ5JFLKJSU3IGZYUQPUM2EOONJKPVQ"}
2025-03-22T21:14:31+01:00       INFO    piecemigrate:chore      enqueued for migration  {"Process": "storagenode", "sat": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2025-03-22T21:14:32+01:00       INFO    piecemigrate:chore      enqueued for migration  {"Process": "storagenode", "sat": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2025-03-22T21:14:32+01:00       INFO    piecemigrate:chore      enqueued for migration  {"Process": "storagenode", "sat": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2025-03-22T21:14:32+01:00       INFO    piecemigrate:chore      enqueued for migration  {"Process": "storagenode", "sat": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2025-03-22T21:14:32+01:00       INFO    piecemigrate:chore      all enqueued for migration; will sleep before next pooling      {"Process": "storagenode", "active": {"12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S": true, "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs": true, "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE": true, "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6": true}, "interval": "10m0s"}

I believe so.
Check those blobs folders in this post

1 Like

@Mark Thanks for the help, I didn’t know about that post on the satellites.

I just looked at them, and if they are old satellites, I guess there will be no problem deleting them.

TTL data is cheap to compact. The storage node puts all pieces with the same TTL into the same LOG file. Once the time is reached it deletes that LOG file without having to rewrite anything.

The node can’t predict which pieces might get deleted by garbage collection. That will be a bit more expensive for the compact job.
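To illustrate the TTL case with a made-up example (the file names below are invented for illustration only, not the real on-disk format): if every piece sharing the same expiry day lands in the same LOG file, expiring that data is a single file deletion rather than a rewrite.

ls hashstore/s0
# log-ttl-2025-03-25.dat   <- pieces that all expire on 2025-03-25
# log-ttl-2025-03-26.dat   <- pieces that all expire on 2025-03-26
# log-0000000000000001.dat <- pieces without a TTL ("sticky" data)
rm hashstore/s0/log-ttl-2025-03-25.dat   # once the day has passed: one delete, no rewrite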

4 Likes

Then it will not be append-only anymore, so the main advantage will be thrown away.

1 Like

I have another idea.
Put all the short-TTL pieces in the same logs and don’t do any compaction; just wait for all the pieces to expire and delete the entire log file.
The sticky data pieces will have their own log files, which will require less frequent compaction.
Another thing would be to make smaller log files, which can be compacted quickly.

Excellent idea. Let’s take it to the next level and make log files the size of the pieces themselves. One piece per log file. It’s called “files on disks”

6 Likes

You are very late with that “idea”, as you can read here: storagenode/hashstore: clump pieces by ttl · storj/storj@1617c0a · GitHub

5 Likes

I had really high hopes for the hashstore, and the initial experience was amazing, with massively decreased disk access after the migration (for comparison, a ZFS scrub on piecestore was taking over 10 days, but dropped to ~25 hours after the migration).
However, since 1.124, hashstore has been hogging my drives almost the same way piecestore was.

I am constantly getting high I/O from the storagenode process, resulting in my disk pool being slow for other processes.

I don’t get why this was not happening on pre-1.124 versions, or how I can disable this behaviour. I am starting to count down the months until I get back my held-back amount and can finally exit this.

1 Like

Previous versions were too lazy with compacting; the new version is playing catch-up, apparently compacting pretty much everything. On my meager 2TB node it took maybe a day or so, then it calmed down once again.

Just today, about 100GB of trash got deleted, and that triggered about a 1.5h compaction burst. No small time, that, but nothing like the “catch-up” disk thrashing.

FWIW, I moved the hashtables to an SSD yesterday (using symlinks); that may have sped up the most recent compaction somewhat. There seems to be quite a lot of hashtable updating going on during compaction. Anyway, judging by my own experience, trash deletion isn’t a daily thing - more like a weekly one.
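Roughly what the symlink trick looks like - a sketch only; the paths, the meta directory location, and the per-satellite layout are assumptions about my setup, so check your own node’s hashstore directory first and stop the node before touching anything:

mkdir -p /mnt/ssd/hashstore/<satellite>
mv storage/hashstore/<satellite>/s0/meta /mnt/ssd/hashstore/<satellite>/s0-meta
ln -s /mnt/ssd/hashstore/<satellite>/s0-meta storage/hashstore/<satellite>/s0/meta
# repeat for s1 and for each satellite directory, then start the node again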

Be warned: my node is experimental, I’ve migrated it several times, messed around with it quite a bit, yet it has miraculously survived for almost 2 years. Probably not wise to repeat my moves for “production” nodes… if mine dies, it dies, zero tears shed.

One more year, give or take a few months, and I will have to exit (gracefully, I hope) anyway. Then I’ll start fresh after major (about three months) renovations to my habitat. Presumably with a much speedier internet connection, upload in particular.

Having said that, I am willing to try memtables once they are released and someone more knowledgeable instructs me how to do that.

UPDATE: While waiting for the official release with memtbl, I scraped together my own “solution”: My "memtables" kludge

3 Likes

Can we calm the HDD I/O if we move the hashtable DB to an SSD?

Unlikely. The compaction operates on the logs, which are the analogue of blobs in the piecestore backend. The hash tables are relatively small and are usually cached in memory by the file system.
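As a rough back-of-the-envelope check (assuming a hashtable record on the order of 64 bytes per piece, which is my assumption, not a confirmed number): a node holding ~8 million pieces (roughly 2 TB at an average piece size of ~256 KB) would need about 8,000,000 × 64 B ≈ 0.5 GB of hashtable, which the OS can keep cached, while the log files hold the full multi-terabyte payload that compaction has to read and rewrite.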

So, is it safe to use this feature now? Is it tuned better than it was last year?

It’s not in production yet.
It seems “safe” insofar as nobody is reporting nodes dying after deployment, but I, for one, am waiting for it to mature a lot more before taking the plunge.

2 Likes

Yeah, once new nodes install with hashstore by default… then I’ll look into migration. But nodes have been mostly idle for six months now, so piecestore is working fine and there’s no rush.

6 Likes

It’s still in the beta phase, I suppose, but I don’t see a future for hashstore if it keeps behaving the way it does now. Those compactions hammer the drive non-stop. With piecestore, without the Badger cache, there was only intense reading, at startup and when moving trash, but with hashstore there is also intense writing. The performance benefits are going out the window.
They should implement an option to go back to piecestore. It was a bad move to switch to hashstore.
I’ll keep it 3 more months maybe, because it’s a new node with 1.5TB, and if nothing improves I will switch all the flags to false and wait for it to move slowly back to piecestore and Badger.

2 Likes

I will keep my 2 test nodes with hashstore (one Windows, one Linux) running, but I am not converting my other nodes until I can really see a benefit. Both hashstore and filestore are working without problems on my hardware atm.

I guess Storj is developing hashstore primarily for Select and the kind of storage they have to deal with there, which is probably not our good old one-node-per-HDD thing.

You may disable the migration to hashstore, and the node will serve both backends.

1 Like

It’s still the “one node per HDD” way. The only difference is that there is much more intensive traffic (hundreds of Gbps), so the hashstore is a must-have.