[Tech Preview] Hashstore backend for storage nodes

Thanks, but I mean: in my case, something is causing compaction on some nodes to take an extremely long time. 54 hours seems like an extreme value, and it has happened to me on multiple nodes. I don’t know if it’s due to a loop related to 0-byte files or something else. I just wanted to highlight my situation and let the team know that other nodes might face similar issues, especially nodes run by people who don’t even read the forum. I wasn’t looking for a specific fix for my case but for a fix in the node software.

The longest compaction I found on my nodes is 24 s, with 3 TB stored on that node.

… causing compaction on a FEW nodes to take an extremely long time

The plan is to start by using hashstore for new writes (WriteToNew). Active migration will be started later.

The rollout is driven by the satellites (during check-ins). Current config:

Write new data to hashstore:

SLC - 5%
AP1 - 0%
EU1 - 0%
US1 - 0%

Active migration:
ALL - 0%

It likely doesn’t make any difference, as 37% of the nodes are already on hashstore (a manual setting by the node operators).

You can monitor your current status by checking the .migrate file.

For example (WriteToNew is the important part):

cat config/storage/hashstore/meta/12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S.migrate

{
  "PassiveMigrate": false,
  "WriteToNew": true,
  "ReadNewFirst": true,
  "TTLToNew": true
}

For active migration, check the migrate_chore file
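If you run several satellites and want the status at a glance, a small shell helper can print the WriteToNew flag from every .migrate file. This is just a sketch: the `migrate_status` function name is mine, and the default meta path mirrors the layout shown above; adjust it to your node's setup.

```shell
# migrate_status: print "<satellite>: <WriteToNew flag>" for each .migrate
# file under the given meta directory.
# The default path is an assumption based on the example above.
migrate_status() {
  dir="${1:-config/storage/hashstore/meta}"
  for f in "$dir"/*.migrate; do
    [ -e "$f" ] || continue   # skip when the glob matches nothing
    sat=$(basename "$f" .migrate)
    # extract the boolean after "WriteToNew" without needing jq
    flag=$(sed -En 's/.*"WriteToNew": *(true|false).*/\1/p' "$f")
    echo "$sat: $flag"
  done
}

# Usage:
#   migrate_status                                   # default layout
#   migrate_status /mnt/node/config/storage/hashstore/meta
```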


.. of which 25% are @Vadim :wink:

But seriously, I am surprised that so many early adopters have made the manual switch. :slight_smile:


as 37% of the nodes are already on hashstore

Very interesting figure! Is that 37% of nodes that are completely migrated, that have ongoing migrations, or that have merely enabled the WriteToNew flag?


Just the WriteToNew flag. And the 37% includes the Select network as well (which was migrated months ago…)


Ah, that makes more sense although I have no idea how Select’s size compares to Public.

In all honesty, Select seems so different from Public in so many ways that quite frankly I just pretend it doesn’t even exist.

The downside of short compaction is that the amount of reclaimed space is relatively small. On my node that is not a problem: I still have a lot of free space and I don’t mind trading it for a short compaction runtime. There might be a point in the future at which my node is almost full. At that point I would change my node’s config to reclaim more space, with the downside of spending more time running compaction. But hey, at that point we would be talking about maximum payout and just optimising for some extra cents. It’s like the final challenge that I would love to get to.


I have free space. Any advice on how to reduce compaction times at the expense of a bit of wasted space? On 10 TB nodes it’s starting to become a heavy task.

      --hashstore.compaction.alive-fraction float                if the log file is not this alive, compact it (default 0.25)
      --hashstore.compaction.probability-power float             power to raise the rewrite probability to. >1 means must be closer to the alive fraction to be compacted, <1 means the opposite (default 2)

Decreasing the first one would queue up fewer log files for compaction, but it also means more dead space kept on disk. I would also increase the second value to decrease the probability that compaction picks up a log file that is close to the alive threshold.
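As an illustration of that advice, the two flags could be set like this in config.yaml. The values below are made up for the example, not a tested recommendation; the right trade-off depends on your free space.

```yaml
# Illustrative values only: compact less aggressively, keep more dead space.
hashstore.compaction.alive-fraction: 0.15     # default 0.25; lower means fewer log files qualify
hashstore.compaction.probability-power: 3     # default 2; higher means logs near the threshold are less likely to be rewritten
```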


Well, the first step is to look at system behaviour during this long compaction: disk queues, disk IO, memory pressure, wait times, compute pressure, etc. Otherwise it’s unproductive guesswork.

Repeatedly parsing zero-sized files contributes to IO and needs to be fixed regardless.

What’s the difference between “finished compaction” and “compact once finished”?

There also appears to be some shenanigans with time formatting:

storj-five# grep "compact once finished" /var/log/storagenode.log | grep -o '{.*}' | jq '[.stats.LogsRewritten, .duration] | join(" ")'  
" 426.025834ms"
" 1.759233558s"
" 13.31169607s"
" 16.526177553s"
" 14.228507974s"
" 14.692018117s"
" 19.417338659s"
" 1.840648903s"
" 19.196724614s"
" 33.832226ms"
# grep "finished compaction" /var/log/storagenode.log | grep -o '{.*}' | jq '[.stats.LogsRewritten, .duration] | join(" ")' 
"3 426.204768ms"
"8 1m0.518026162s"
"9 19.417862644s"
"6 21.037603812s"
"0 34.087853ms"

What is 1m0.518026162s?

Compact once finished
One compaction pass; many can run if the store requires it, for each satellite’s s0/s1 store.

Finished compaction
All compaction passes have completed for that satellite’s s0/s1 store.

You will likely see multiple “compact once finished” entries before a “finished compaction” is logged.


Is “compact once” a loop per satellite or a single compaction pass, and do these passes keep being scheduled until no more logs satisfy the compaction criteria?

A few days ago (when the latest deletion batch was purged) I had some fairly long compaction sessions as well. I think they took 6-8 hours, up from the normal range of milliseconds to a few minutes. They also ran many compact passes before completing.

Are you back to normal compaction times again?


It’s one pass of compaction per store per satellite (two stores for each satellite). It will continue its passes until compaction is satisfied.

I think we are actually trying to say the same thing; I’m just less fluent in English :wink:


This is JSON and supposed to be machine-readable; there is no need for acrobatics with units. Always write the duration in seconds, without units. That would remove all of these issues.
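Until the log format changes, the Go-style duration strings can be normalized on the fly. A sketch of a helper (the `to_seconds` name is mine; it only handles the h/m/s/ms units seen in these logs, not µs or ns):

```shell
# to_seconds: convert a Go duration string (e.g. "1m0.518026162s",
# "426.025834ms") to plain seconds. Handles h, m, s and ms only.
to_seconds() {
  echo "$1" | awk '{
    s = 0; d = $0
    # hours component, e.g. "2h..."
    if (match(d, /[0-9]+h/)) { s += substr(d, RSTART, RLENGTH - 1) * 3600; d = substr(d, RSTART + RLENGTH) }
    # minutes component, e.g. "1m0.5..." (digit after the m distinguishes it from "ms")
    if (match(d, /[0-9]+m[0-9]/)) { s += substr(d, RSTART, RLENGTH - 2) * 60; d = substr(d, RSTART + RLENGTH - 1) }
    # trailing milliseconds or seconds
    if (match(d, /[0-9.]+ms$/)) { s += substr(d, RSTART, RLENGTH - 2) / 1000 } else if (match(d, /[0-9.]+s$/)) { s += substr(d, RSTART, RLENGTH - 1) }
    printf "%.6f\n", s
  }'
}

# Example: to_seconds 1m0.518026162s   -> 60.518026
```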

Thank you for the insight and clarification. 37% is by no means an insignificant number of people. It could point towards the forum having many, many more lurkers than I anticipated. Reddit published numbers in the mid-2010s stating that ~95% of users did not participate in comments and ~98% did not post. With the huge push to mobile devices in the last couple of years, I can only imagine those numbers are higher now.

They are very old nodes; probably a consequence of the deletion batch. I will monitor the next compaction runs.