[Tech Preview] Hashstore backend for storage nodes

No, it creates those files for everybody, but in the current software version the hashstore is not used unless you enable it using the info found in the first post at the top of this thread.

2 Likes

I've performed a full migration of my 2.5 TB node (60 TB). The previous count of nearly 18,000,000 extremely small files was a major performance bottleneck. This migration has reduced the file count to approximately 2,500, dramatically improving filesystem performance. I'm hopeful for a smooth transition and the widespread adoption of the hashstore standard.
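
If anyone wants to check the same numbers on their own node, here is a minimal Python sketch that just counts files under the blobs and hashstore directories; it assumes the default storage layout (the /app/config/storage default is only the usual Docker path, adjust it for your setup):

import os
import sys

# Count the files under the blobs (piecestore) and hashstore directories.
# The default path below is only the usual Docker layout; pass your own storage dir.
STORAGE_DIR = sys.argv[1] if len(sys.argv) > 1 else "/app/config/storage"

def count_files(path):
    total = 0
    for _root, _dirs, files in os.walk(path):
        total += len(files)
    return total

for backend in ("blobs", "hashstore"):
    path = os.path.join(STORAGE_DIR, backend)
    if os.path.isdir(path):
        print(f"{backend}: {count_files(path):,} files")
    else:
        print(f"{backend}: not found at {path}")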

7 Likes

I would wait until some repair mechanism is implemented, but it looks promising. Even moving this kind of node will be easier in the future; bigger files copy much faster.

12 Likes

Is the garbage collection system fully implemented and working with hashstore? Is there any new logging I should search for to verify functionality on my node?

Yes, it's implemented; it's called compaction.

Perhaps; it depends on the version in use (not all versions have logging). I can't say exactly what you need to search for, but try "hash". It's possible that you need to enable the Debug log level.
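
For example, a rough Python sketch like this will pull the hashstore and compaction lines out of a log you have saved to a file first; it is nothing more than a keyword filter:

import sys

# Keywords to look for; "hashstore" and "compact" cover the lines shown further down.
KEYWORDS = ("hashstore", "compact")

with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
    for line in log:
        if any(keyword in line.lower() for keyword in KEYWORDS):
            print(line, end="")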

This is what I have for the storj-up node built from main:

storagenode10-1  | 2025-01-30T01:49:48Z INFO    hashstore       hashstore/store.go:608  beginning compaction    {"Process": "storagenode", "satellite": "12whfK1EDvHJtajBiAUeajQLYcWqxcQmdYQU5zX5cCf6bAxfgu4", "store": "s1", "stats": {"NumLogs":3,"LenLogs":"288.6 MiB","NumLogsTTL":2,"LenLogsTTL":"160.4 MiB","SetPercent":1,"TrashPercent":0,"Compacting":false,"Compactions":0,"TableFull":0,"Today":20118,"LastCompact":0,"LogsRewritten":0,"DataRewritten":"0 B","Table":{"NumSet":18,"LenSet":"288.6 MiB","AvgSet":16814144,"NumTrash":0,"LenTrash":"0 B","AvgTrash":0,"NumSlots":16384,"TableSize":"1.0 MiB","Load":0.0010986328125,"Created":20115},"Compaction":{"Elapsed":0,"Remaining":0,"TotalRecords":0,"ProcessedRecords":0}}}
storagenode10-1  | 2025-01-30T01:49:48Z INFO    hashstore       hashstore/store.go:707  compact once started    {"Process": "storagenode", "satellite": "12whfK1EDvHJtajBiAUeajQLYcWqxcQmdYQU5zX5cCf6bAxfgu4", "store": "s1", "today": 20118}
storagenode10-1  | 2025-01-30T01:49:48Z INFO    hashstore       hashstore/store.go:858  compaction computed details    {"Process": "storagenode", "satellite": "12whfK1EDvHJtajBiAUeajQLYcWqxcQmdYQU5zX5cCf6bAxfgu4", "store": "s1", "nset": 14, "nexist": 18, "modifications": true, "curr logSlots": 14, "next logSlots": 14, "candidates": [3], "rewrite": [3], "duration": "62.945902ms"}
storagenode10-1  | 2025-01-30T01:49:48Z INFO    hashstore       hashstore/store.go:1061 hashtbl rewritten       {"Process": "storagenode", "satellite": "12whfK1EDvHJtajBiAUeajQLYcWqxcQmdYQU5zX5cCf6bAxfgu4", "store": "s1", "total records": 14, "total bytes": "224.5 MiB", "rewritten records": 0, "rewritten bytes": "0 B", "trashed records": 3, "trashed bytes": "48.1 MiB", "restored records": 0, "restored bytes": "0 B", "expired records": 4, "expired bytes": "64.1 MiB"}
storagenode10-1  | 2025-01-30T01:49:48Z INFO    hashstore       hashstore/store.go:709  compact once finished   {"Process": "storagenode", "satellite": "12whfK1EDvHJtajBiAUeajQLYcWqxcQmdYQU5zX5cCf6bAxfgu4", "store": "s1", "duration": "646.935909ms", "completed": true}
storagenode10-1  | 2025-01-30T01:49:48Z INFO    hashstore       hashstore/store.go:610  finished compaction     {"Process": "storagenode", "satellite": "12whfK1EDvHJtajBiAUeajQLYcWqxcQmdYQU5zX5cCf6bAxfgu4", "store": "s1", "duration": "647.461705ms", "stats": {"NumLogs":2,"LenLogs":"224.5 MiB","NumLogsTTL":1,"LenLogsTTL":"96.2 MiB","SetPercent":1,"TrashPercent":0.21428571428571427,"Compacting":false,"Compactions":0,"TableFull":0,"Today":20118,"LastCompact":20118,"LogsRewritten":1,"DataRewritten":"0 B","Table":{"NumSet":14,"LenSet":"224.5 MiB","AvgSet":16814144,"NumTrash":3,"LenTrash":"48.1 MiB","AvgTrash":16814144,"NumSlots":16384,"TableSize":"1.0 MiB","Load":0.0008544921875,"Created":20118},"Compaction":{"Elapsed":0,"Remaining":0,"TotalRecords":0,"ProcessedRecords":0}}}
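
Those payloads are plain JSON at the end of each line, so instead of reading them raw you can summarize them with a small sketch like this, assuming the same log format as above and a log saved to a file:

import json
import sys

# Summarize "finished compaction" events; assumes the log format shown above,
# with one JSON payload starting at the first '{' of each line.
with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
    for line in log:
        if "finished compaction" not in line:
            continue
        fields = json.loads(line[line.index("{"):])
        stats = fields.get("stats", {})
        print(f"{fields.get('satellite', '?')}: "
              f"NumLogs={stats.get('NumLogs')} "
              f"LenLogs={stats.get('LenLogs')} "
              f"TrashPercent={stats.get('TrashPercent')}")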
1 Like

I started the hashstore migration on a node, for all satellites, by setting all those flags to true and stopping/restarting the node, and the only logging change I see is the absence of this entry, which used to appear once every 10 minutes:

 INFO    piecemigrate:chore      all enqueued for migration; will sleep before next pooling      {"Process": "storagenode", "active": {}, "interval": "10m0s"}

Once I started the migration, this entry stopped appearing.
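
If you want to confirm from a saved log when that entry last appeared, a quick sketch like this works (it assumes lines start with the timestamp, as in the usual log format):

import sys

# Report the last time the piecemigrate "all enqueued" entry was logged.
MARKER = "all enqueued for migration"

last_seen = None
with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
    for line in log:
        if MARKER in line:
            last_seen = line.split()[0]  # first token, normally the ISO timestamp

print("last seen:", last_seen or "never in this log")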

Funny thing: on the other node, where the migration didn't start, I have a lot of entries regarding compaction, but with 0 values. Both nodes are on version 121.2.

1 Like

Does your web interface show used space correctly after the migration?

I just finished a full migration on my smallest node and saw some errors in my logs. Are these from the badger cache? Perhaps I should disable and remove the badger cache now that there are no more blobs files.

2025-02-01T17:25:17Z    ERROR   blobscache      piecesTotal < 0 {"Process": "storagenode", "piecesTotal": -15360}
2025-02-01T17:25:17Z    ERROR   blobscache      piecesContentSize < 0   {"Process": "storagenode", "piecesContentSize": -14848}
2025-02-01T17:25:17Z    ERROR   blobscache      satPiecesTotal < 0      {"Process": "storagenode", "satPiecesTotal": -15360}
2025-02-01T17:25:17Z    ERROR   blobscache      satPiecesContentSize < 0        {"Process": "storagenode", "satPiecesContentSize": -14848}
2025-02-01T17:25:17Z    ERROR   blobscache      piecesTotal < 0 {"Process": "storagenode", "piecesTotal": -1792}
2025-02-01T17:25:17Z    ERROR   blobscache      piecesContentSize < 0   {"Process": "storagenode", "piecesContentSize": -1280}
2025-02-01T17:25:17Z    ERROR   blobscache      satPiecesTotal < 0      {"Process": "storagenode", "satPiecesTotal": -1792}
2025-02-01T17:25:17Z    ERROR   blobscache      satPiecesContentSize < 0        {"Process": "storagenode", "satPiecesContentSize": -1280}
2025-02-01T17:25:17Z    ERROR   blobscache      piecesTotal < 0 {"Process": "storagenode", "piecesTotal": -145408}
2025-02-01T17:25:17Z    ERROR   blobscache      piecesContentSize < 0   {"Process": "storagenode", "piecesContentSize": -144896}
2025-02-01T17:25:17Z    ERROR   blobscache      satPiecesTotal < 0      {"Process": "storagenode", "satPiecesTotal": -145408}
2025-02-01T17:25:17Z    ERROR   blobscache      satPiecesContentSize < 0        {"Process": "storagenode", "satPiecesContentSize": -144896}
2025-02-01T17:25:17Z    ERROR   blobscache      piecesTotal < 0 {"Process": "storagenode", "piecesTotal": -145408}
2025-02-01T17:25:17Z    ERROR   blobscache      piecesContentSize < 0   {"Process": "storagenode", "piecesContentSize": -144896}
2025-02-01T17:25:17Z    ERROR   blobscache      satPiecesTotal < 0      {"Process": "storagenode", "satPiecesTotal": -145408}
2025-02-01T17:25:17Z    ERROR   blobscache      satPiecesContentSize < 0        {"Process": "storagenode", "satPiecesContentSize": -144896}

I think that's because of incorrect used-space values. On nodes with several TB there are thousands of such messages, and this happens at the end of the migration for every satellite.
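
Rather than scrolling through thousands of them, you can tally how far negative each counter went with a rough sketch like this, assuming lines shaped like the excerpt above and a log saved to a file:

import json
import sys
from collections import Counter

# Count the negative blobscache values per field and sum how far negative they went.
# Assumes ERROR lines shaped like the excerpt above, with a JSON payload at the end.
counts, totals = Counter(), Counter()
with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
    for line in log:
        if "blobscache" not in line or "< 0" not in line:
            continue
        payload = json.loads(line[line.index("{"):])
        for key, value in payload.items():
            if isinstance(value, (int, float)) and value < 0:
                counts[key] += 1
                totals[key] += value

for key, num in counts.items():
    print(f"{key}: {num} messages, sum {totals[key]:,} bytes")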

1 Like

It seems I need a used-space filewalker for hashstore. My dashboard now shows zero used space.
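
Until there is a proper used-space calculation for hashstore, you can at least compare the dashboard against what is actually on disk with a sketch like this; the paths assume the default storage layout, so adjust them for your node:

import os
import sys

# Sum the on-disk size of the hashstore (and any leftover blobs) directories.
# The default path below is only the usual Docker layout; pass your own storage dir.
STORAGE_DIR = sys.argv[1] if len(sys.argv) > 1 else "/app/config/storage"

def dir_size(path):
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # a file may disappear between listing and stat
    return total

for backend in ("hashstore", "blobs"):
    path = os.path.join(STORAGE_DIR, backend)
    size = dir_size(path) if os.path.isdir(path) else 0
    print(f"{backend}: {size / 1e9:.2f} GB")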

You should. Badger + hashstore doesn't improve performance over hashstore alone, and badger takes up RAM. So get rid of it.

1 Like

Just wait for the new storagenode release; this error has already been fixed. I've compiled it for myself from git.

I uploaded a linux-x64 storagenode executable to Google Drive for those who trust executables from the internet :grin:

3 Likes

Maybe it migrated to someone else's node? :sweat_smile:
Just kidding. Enable the startup piece scan, deactivate badger and lazy mode, and restart.

I'm on a Raspberry Pi 5, so I compiled an arm64 version. You are correct: this version does work better. But I will probably go back to a standard release version before Storj sends an unmarked van to my house. If these numbers are correct, I'm storing a lot of trash on this node.

I killed the badger cache, but the blobscache errors came back after a restart. Deleting all the databases seems to have made them go away.

Edit: I was wrong. I'm still getting these errors, just fewer of them.

Perhaps it's the TTL system trying to remove expired blobs files that don't exist anymore.

After the migration ends, these errors should be gone; just relax. You restarted the node and the migration for another satellite started, which is why you think the problem is solved. Now the migration for that satellite is ending and the errors show up again.

The migration already ended. This node is tiny, only about 100 GB, and it only has one satellite, AP1.

Check that the "blobs" folder has a size of 0 bytes.

It is zero bytes. I think your errors stopped because you are running the newest software version and your node knows how much data you have. My node thinks I have zero data, and it is trying to subtract from zero when it deletes an expired piece, resulting in a negative-number error. At least that's my theory, because the errors appear right after a log line that says "expired pieces collection started".

1 Like

It's fixed in main only; it's not released yet.

1 Like