Hashstore Migration Guide

Hello Community,

I have seen a lot of scattered information on how to migrate to hashstore.

Is there a single document that covers all the options and settings?

When should the node be restarted? Is there an explanation of the environment variables?

I will try to collect everything in this thread and keep it updated:

For migration (passive or active), files need to be modified in your storage location:

  • /<storagelocation>/storage/hashstore/meta

Also note that the permissions of the `.migrate` and `.migrate_chore` files may be very restrictive. You might have to change the permissions to allow writing, or make the modifications as the file's owner.
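Before editing anything, it can help to see who owns the state files and whether your current user can write them. A minimal sketch, assuming a POSIX shell; `show_meta_perms` is a hypothetical helper name, not something shipped with the storagenode:

```shell
# Hypothetical helper: list owner and mode of the migration state files.
# $1 = path to your .../storage/hashstore/meta directory
show_meta_perms() {
    meta_dir=$1
    for f in "$meta_dir"/*.migrate "$meta_dir"/*.migrate_chore; do
        [ -e "$f" ] || continue          # skip patterns that matched nothing
        ls -l "$f"                       # shows mode, owner and group
        # Flag files the current user cannot write (root can always write)
        [ -w "$f" ] || echo "not writable by $(id -un): $f"
    done
}
```

Call it with your own meta directory as the only argument. If a file is flagged, either run the later commands as the owner shown by `ls -l` or adjust the permissions first.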

Enable passive migration. (requires version v1.119)

Info:

  • WriteToNew will send all incoming uploads to the hashstore

  • TTLToNew will send only uploads with a TTL to the hashstore

  • ReadNewFirst will check any new download request first in hashstore then piecestore. Use true if you start using hashstore.

  • PassiveMigrate will migrate any piece that gets hit by a download request

todo:
1) cd /<your node location>/storage/hashstore/meta
2) run 1 to 4 of the commands below (one for each satellite):
echo '{"PassiveMigrate":true,"WriteToNew":true,"ReadNewFirst":true,"TTLToNew":true}' > 121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6.migrate
echo '{"PassiveMigrate":true,"WriteToNew":true,"ReadNewFirst":true,"TTLToNew":true}' > 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S.migrate
echo '{"PassiveMigrate":true,"WriteToNew":true,"ReadNewFirst":true,"TTLToNew":true}' > 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs.migrate
echo '{"PassiveMigrate":true,"WriteToNew":true,"ReadNewFirst":true,"TTLToNew":true}' > 1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE.migrate
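The four echo lines above can also be written as a loop, which avoids copy-paste mistakes. This is only a sketch, assuming a POSIX shell; `write_migrate_files` is a hypothetical helper name, while the satellite IDs and the JSON are the ones listed above:

```shell
# Hypothetical helper: write the same passive-migration state for each satellite.
# $1 = path to your .../storage/hashstore/meta directory
write_migrate_files() {
    meta_dir=$1
    state='{"PassiveMigrate":true,"WriteToNew":true,"ReadNewFirst":true,"TTLToNew":true}'
    for sat in \
        121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6 \
        12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S \
        12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs \
        1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE
    do
        # Same content as the echo lines: the JSON followed by a newline
        printf '%s\n' "$state" > "$meta_dir/$sat.migrate" || return 1
    done
}
```

Pass your own meta directory as the only argument. To enable fewer than four satellites, remove the IDs you do not want from the loop.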

Enable active migration (requires version v1.120)

Info:

  • Will take multiple days with high CPU time.

  • You can enable it for all satellites at the same time.

  • The migration will run through them one by one.

  • Open questions: is only a single migration per satellite possible between node upgrades? Is a powerful CPU needed?

todo:
1) cd /<storagelocation>/storage/hashstore/meta
2) run 1 to 4 of the commands below (one for each satellite):
echo -n true > 121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6.migrate_chore
echo -n true > 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S.migrate_chore
echo -n true > 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs.migrate_chore
echo -n true > 1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE.migrate_chore
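The same idea as a loop. `echo -n` is not handled identically by every shell, so this sketch uses `printf`, which writes exactly the four bytes `true` with no trailing newline. `enable_migrate_chore` is a hypothetical helper name; the satellite IDs are the ones from the lines above:

```shell
# Hypothetical helper: enable the active-migration chore for each satellite.
# $1 = path to your .../storage/hashstore/meta directory
enable_migrate_chore() {
    meta_dir=$1
    for sat in \
        121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6 \
        12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S \
        12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs \
        1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE
    do
        # Equivalent to `echo -n true`: no newline is appended
        printf 'true' > "$meta_dir/$sat.migrate_chore" || return 1
    done
}
```

Afterwards `wc -c` on any of the `.migrate_chore` files should report 4 bytes.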

CONFIG-Options in Environment Variables:

(Note: these will become docker command line options in a future release after 135.5.)

  - STORJ_HASHSTORE_COMPACTION_PROBABILITY_POWER=2 (default)

  - STORJ_HASHSTORE_COMPACTION_REWRITE_MULTIPLE=10 (default)

  - STORJ_HASHSTORE_MEMTABLE_MAX_SIZE=128MiB (default ????) 

  - STORJ_HASHSTORE_TABLE_DEFAULT_KIND=hashtbl (default)

  - STORJ_HASHSTORE_MEMTBL_MMAP=false (default)
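If you do want to override one of these, one way is to keep the overrides in a small env file. A minimal sketch that writes a single override for the table kind; `write_hashstore_env` and the file name you pass are hypothetical, and the check only accepts the two kinds named in this thread (hashtbl and memtbl), which is an assumption about the valid values:

```shell
# Hypothetical helper: write an env file with one hashstore override.
# $1 = output file, $2 = table kind (hashtbl or memtbl)
write_hashstore_env() {
    case "$2" in
        hashtbl|memtbl) ;;                                   # kinds mentioned in this thread
        *) echo "unknown table kind: $2" >&2; return 1 ;;
    esac
    printf 'STORJ_HASHSTORE_TABLE_DEFAULT_KIND=%s\n' "$2" > "$1"
}
```

The resulting file could then be handed to the container with docker's `--env-file` option, or the same `NAME=value` pair passed with `-e`. Leaving the defaults alone is usually the safer choice.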


CONFIG-Options in Docker:

--hashstore.table-path (if you have SSD, use that one, but monitor degradation)

--hashstore.table-default-kind (if you have enough memory: stored TB * 1.3 = GB of RAM needed)

--hashstore.hashtbl.mmap (if you know what mmap is and have neither enough memory nor an SSD, but still want to check whether it is faster for hashtbl)

--hashstore.compaction.rewrite-multiple (if you have enough space, one compaction could rewrite more logs. You need * free space in addition to the used space)

--hashstore.compaction.alive-fraction: if you are fine with heavier IO in exchange for less stored garbage (I use 0.65 on some nodes)

Compaction

Compaction is the process of removing dead bytes from the log files. It runs once per day for each DB.

Most of the time you don’t need to touch the configuration of the compaction.

But if you are really interested in tuning it manually, the most important settings are:

--hashstore.compaction.alive-fraction float
--hashstore.compaction.probability-power float
--hashstore.compaction.rewrite-multiple float

And they are explained here:

hashtbl vs memtbl

  • Info: needs at least stored TB * 1.3 = GB of memory

  • --hashstore.table-default-kind memtbl
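The rule of thumb above is plain arithmetic. A small sketch, assuming `awk` is available; `memtbl_ram_gb` is a hypothetical helper name and the 1.3 factor is the one quoted in this thread, not a measured value:

```shell
# Rule of thumb from this thread: stored TB * 1.3 = GB of RAM for memtbl.
# $1 = stored data in TB
memtbl_ram_gb() {
    awk -v tb="$1" 'BEGIN { printf "%.1f\n", tb * 1.3 }'
}
```

For example, `memtbl_ram_gb 8` prints 10.4, so a node storing 8 TB would want roughly 10.4 GB of RAM under this estimate.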

use SSD

  • create dir on ssd: mkdir //table

  • --hashstore.table-path

How to check Logfiles

docker logs storagenode 2>&1 | grep "piecemigrate:chore"
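To get a quick count instead of scrolling through the output, the log stream can be summarised. A minimal sketch, assuming `awk`; `migration_log_summary` is a hypothetical helper name. It counts lines containing `piecemigrate:chore` and also lines containing the `failed to unmarshal migration state` warning that is quoted further down in this thread:

```shell
# Hypothetical helper: summarise migration-related lines from a log stream on stdin.
migration_log_summary() {
    awk '
        /piecemigrate:chore/                  { chore++ }   # migration chore activity
        /failed to unmarshal migration state/ { bad++ }     # broken .migrate file
        END { printf "chore_lines=%d unmarshal_warnings=%d\n", chore, bad }
    '
}
```

Usage: `docker logs storagenode 2>&1 | migration_log_summary`. A non-zero `unmarshal_warnings` count suggests one of the `.migrate` files does not contain valid JSON.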

How to check progress

How to check progress with strace

sudo strace -p your_storagenode_pid -f -e trace=file -tt 2>&1 | grep blob

To get your storagenode pid, run

sudo docker top storagenode

Find the one that has the storagenode run on it.

How to check progress with fatrace

cd /path/to/storj/node-data
fatrace -ct

Migration finished? Clean up your blobs

  • delete the files inside //storage/blobs/

    but don't delete the blobs directory itself

bash

to find all non-zero files which could remain:

find /mnt/storagenode/blobs -type f -size +0

to remove all under /mnt/storagenode/blobs:

rm -rf /mnt/storagenode/blobs/*
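If you prefer a single number to a long file listing before running the `rm`, the `find` command above can be wrapped in a counter. A minimal sketch; `count_nonempty_blobs` is a hypothetical helper name, and it assumes file names do not contain newlines:

```shell
# Hypothetical helper: count non-empty regular files left under a blobs directory.
# $1 = path to the blobs directory
count_nonempty_blobs() {
    find "$1" -type f -size +0 | wc -l | tr -d ' '
}
```

A result of 0 means only empty files and directories are left under that path.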

zsh

to check:

ls -l /mnt/storagenode/blobs/**/*(.L+0)

or

stat -f '%z %N' /mnt/storagenode/blobs/**/*(.L+0)

to remove:

rm -rf /mnt/storagenode/blobs/*

PowerShell

to check (dry run: -WhatIf only lists the empty directories that would be removed)

Get-ChildItem -Path "X:\storagenode\storage\blobs" -Recurse -Directory | Where-Object { (Get-ChildItem -LiteralPath $_.FullName).Count -eq 0 } | Remove-Item -WhatIf

to remove only empty directories

Get-ChildItem -Path "X:\storagenode\storage\blobs" -Recurse -Directory | Where-Object { (Get-ChildItem -LiteralPath $_.FullName).Count -eq 0 } | Remove-Item

Error handling

  • Dashboard shows incorrect used space
    • → stop node, delete storage/used_space_per_prefix.db, restart node

Example Docker-String

coming soon

Example Windows-String

coming soon

Greetings Michael


The formatting with the “headers” got kind of rough there.

The first bit with the passive-migrate echo lines has messed-up formatting in two ways:

  1. the outer quote is a single quote ’ and needs to be a double quote (edit: no wait, the outer single quote is fine, it just can’t be slanted)

  2. inside the echo line, the double quotes are “Windows-style” slanted quotes rather than Unix-style straight double quotes, so I get an error like this:

storj14003 | 2025-09-01T17:10:29Z WARN failed to unmarshal migration state {"Process": "storagenode", "error": "invalid character 'â' looking for beginning of object key string", "satellite": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}

at the end of the day each .migrate file should look something like this:

{"PassiveMigrate":true,"WriteToNew":true,"ReadNewFirst":true,"TTLToNew":true}
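A slanted double quote is stored as the three UTF-8 bytes E2 80 9C (or E2 80 9D), and the first byte, 0xE2, is what shows up as 'â' in the error. So one way to catch this before restarting the node is to count bytes outside the ASCII range in each `.migrate` file. A minimal sketch; `count_non_ascii_bytes` is a hypothetical helper name:

```shell
# Hypothetical helper: count bytes above 0x7F in a file.
# $1 = path to a .migrate file
count_non_ascii_bytes() {
    # Delete every ASCII byte (octal 000-177); whatever is left is non-ASCII
    LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' '
}
```

A correctly written `.migrate` file like the one above should give 0. Anything else means slanted quotes or other stray characters were pasted in.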

How can I post those links without line breaks?

Something like an additional (longer) field?

Michael

Great initiative :smiling_face_with_sunglasses:

Perhaps you should add info about where to place the files ? (/storage/hashstore/meta)

Also things to consider adding:

  • how to check logs for progress
  • how to delete obsolete folders in /storage/blobs//* after migration
  • how to solve the double-usage data-usage with used_space_per_prefix.db
  • explain what the memtbl setting does and, especially, that it is not a requirement (it will use a few gigs of RAM on larger nodes)

you can use the “preformatted text” quote option.

Important prerequisite before setting these lines.

You need to change to the storage folder that contains the hashstore meta directory. On my system it is: /storagedrive/storage/hashstore/meta

And the permissions on the meta files are really restrictive. In my case the owner was “root”, so I had to be logged in as root to issue the echo commands.

Here is what my working echo lines looked like:


echo '{"PassiveMigrate":true,"WriteToNew":true,"ReadNewFirst":true,"TTLToNew":true}' > 121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6.migrate
echo '{"PassiveMigrate":true,"WriteToNew":true,"ReadNewFirst":true,"TTLToNew":true}' > 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S.migrate
echo '{"PassiveMigrate":true,"WriteToNew":true,"ReadNewFirst":true,"TTLToNew":true}' > 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs.migrate
echo '{"PassiveMigrate":true,"WriteToNew":true,"ReadNewFirst":true,"TTLToNew":true}' > 1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE.migrate

I converted the first post to a wiki, so now you can edit it and turn it into full documentation.

There are way too many values here. I believe most of the time, it’s better to keep the defaults. Unless you really understand all of these settings, and you need them to fine-tune SN for your hardware.

It would be better to show the documentation of these, and explain when they are useful, instead of recommending as default options.

The STORJ_HASHSTORE_TABLE_DEFAULT_KIND=memtbl setting is probably what makes the difference (still, it should only be used with enough memory). Same for MMAP. And AFAIK MLOCK is true by default.


Is it possible to know what each row does, and the requirements for using it?

Could you please summarize what lines/parameters should be entered in the config file for the different options if I use Windows?

From the next release, it will also be possible to change them through configuration settings or command line flags.

Therefore the help can be printed out:

There is no requirement to use these, our goal is to make the defaults smart enough to work well for everybody. IMHO there shouldn’t be any requirement to learn internals, just to be a successful SNO.

Still we can write a Tuning Guide, but it’s a bigger effort.

My generic recommendation: set flags only if you understand the consequences :smiley:

If you really like knobs, and you MUST adjust them, I would start with these:

  • --hashstore.table-path (if you have SSD, use that one, but monitor degradation)
  • --hashstore.table-default-kind (if you have enough memory: stored TB * 1.3 = GB of RAM needed)
  • --hashstore.hashtbl.mmap (if you know what mmap is and have neither enough memory nor an SSD, but still want to check whether it is faster for hashtbl)
  • --hashstore.compaction.rewrite-multiple (if you have enough space, one compaction could rewrite more logs. You need * free space in addition to the used space)
  • --hashstore.compaction.alive-fraction: if you are fine with heavier IO in exchange for less stored garbage (I use 0.65 on some nodes)

Since it’s a wiki I updated the first post just to show the default values.

It will be nice when these are regular command line options.


Do we have to migrate to hashstore? Is it mandatory? I read somewhere that it will be the default in coming versions and that all nodes will be migrated automatically. Is that true? Should I just wait for the automatic migration, or do I need to take other actions?

It will be mandatory, automatic, and gradual.

I’ve been testing it gradually, starting with my tiniest node, because most of my storage is NFS-mounted (which Storj doesn’t support) and it breaks on the new hashstore unless I make adjustments.

I have put true into migrate_chore, I see disk activity, I see size of the hashstore folder grow, however in the node log I don’t see new uploads going to hashstore, they continue to be logged by piecestore.

What am I missing?

The logs still display piecestore. I also wondered why it is not displayed as hashstore. But when you check the blobs folder it is empty, so everything works as hashstore apart from what the logs display.

Just check when the old blobs have been completely converted into hashstore logs; after that everything works as hashstore.


Probably nothing; the logging function displays piecestore rather than hashstore regardless of the storage backend in use.

However, in order for new pieces to go into hashstore, you also need these files:

storage/hashstore/meta/121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6.migrate
storage/hashstore/meta/12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S.migrate
storage/hashstore/meta/12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs.migrate
storage/hashstore/meta/1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE.migrate

to contain this:

{"PassiveMigrate":false,"WriteToNew":true,"ReadNewFirst":true,"TTLToNew":true}

.. but I’m assuming you already did this?

I was wondering the same thing initially when I migrated my nodes, but found peace of mind when I ran fatrace -ct in the SN path and saw the live traffic go to hashstore. I’m sure there is an equivalent in FreeBSD to double-check.


What does false, true, true, true do? Was the recommendation not to set all to true?

PassiveMigrate moves a piece from piecestore to hashstore if it is served by a client request. It’s a little redundant if you are doing active migration, but probably wouldn’t hurt :slight_smile:

No, I haven’t. Should I have? I got the impression (mostly from their names) that those files control the migration mode: active vs passive. And since I’m doing active migration, the passive configuration should be irrelevant.

I’ll wait for it to finish, and then check where the actual writes go. Thanks!