Hashstore Migration Guide

I got the very same impression from the chosen wording. But both need to be set to true:

4 Likes

What do we do if there are a handful of files? Keep them? Nuke them?

storj-five# find /mnt/storagenode-five/blobs -type d | wc -l
    4101
storj-five# find /mnt/storagenode-five/blobs -type f | wc -l
       1
storj-five# find /mnt/storagenode-five/blobs -type f -print 
/mnt/storagenode-five/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/it/oawjsq4dfq6mvd54bhudnnlixptuyk5vkkrqbn6x2uq55mo7za.sj1

Why was this poor guy left alone?

Perhaps it was corrupted. If you have logs, you may search for this piece.
To reconstruct a PieceID you can use this method:

But in the reverse direction. Then you can search for this PieceID in your logs to see whether it caused an exception during migration.
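
For the lazy, a quick sketch of that reverse direction, using the example path from this thread and assuming the usual blobs/<satellite>/<two-letter prefix>/<rest>.sj1 layout: the PieceID appears to be just the two-letter prefix folder plus the file name without the extension, uppercased (you can check it against the id in the log excerpt further down).

# prefix dir + file name (minus .sj1), uppercased, gives the PieceID to grep the logs for
f=/mnt/storagenode-five/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/it/oawjsq4dfq6mvd54bhudnnlixptuyk5vkkrqbn6x2uq55mo7za.sj1
echo "$(basename "$(dirname "$f")")$(basename "$f" .sj1)" | tr '[:lower:]' '[:upper:]'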

Also, what’s the size of this piece?

Oh indeed:

storj-five# bzgrep -i awjsq4dfq6mvd54bhudnnlixptuyk5vkkrqbn /var/log/storagenode.log* | cut -d ' ' -f2-  | sort | uniq -c
 523 migrate	{"Process": "storagenode", "error": "opening the old reader: pieces error: invalid piece file for storage format version 1: too small for header (0 < 512)", "errorVerbose": "opening the old reader: pieces error: invalid piece file for storage format version 1: too small for header (0 < 512)\n\tstorj.io/storj/storagenode/piecemigrate.(*Chore).migrateOne:335\n\tstorj.io/storj/storagenode/piecemigrate.(*Chore).processQueue:277\n\tstorj.io/storj/storagenode/piecemigrate.(*Chore).Run.func2:184\n\tstorj.io/common/errs2.(*Group).Go.func1:23", "sat": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "id": "ITOAWJSQ4DFQ6MVD54BHUDNNLIXPTUYK5VKKRQBN6X2UQ55MO7ZA"}

Its size is zero.

storj-five# find /mnt/storagenode-five/blobs -type f -exec stat -f '%z %N' {} \;
0 /mnt/storagenode-five/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/it/oawjsq4dfq6mvd54bhudnnlixptuyk5vkkrqbn6x2uq55mo7za.sj1
  1. Under what circumstances are zero-sized files created?
  2. Why did the migrate chore need to try five hundred twenty-three times to migrate it!??? I admire the dedication, but it’s a bit excessive!

So I guess I can now safely rm -rf 'blobs/*' ?

1 Like

See here…
https://forum.storj.io/t/hashstore-rollout-commencing/30589/157?u=snorkel

1 Like

Specifically:

Since we are migrating to hashstore, this bug may never get fixed.

1 Like

Makes sense. The migration job then needs to delete such items. There is no reason to waste energy repeatedly attempting to migrate something that is clearly an invalid piece.

Valid piece → put to log → delete.
Invalid piece → delete.
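
In the meantime, a rough operator-side equivalent (with the node stopped, and adjusting the path to your node) would be to drop the zero-byte leftovers manually:

# list and delete zero-byte leftovers under blobs
find /mnt/storagenode/blobs -type f -size 0 -print -delete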

3 Likes

I do not think that would be smart; it’s additional one-time-use code.

Especially while we have your guide :slight_smile:

I would just add to this guide how anyone coming to it later should deal with it :slight_smile:

All new nodes will have hashstore as the default, so there is no need to explain how to migrate and what can be cleaned up.

All old nodes will have this topic to explain how to handle it. So - NoOp.

I migrated one of my nodes to hashstore. It is running on the Windows operating system.
The day after the migration finished, I started to get error messages like these in the log:

2025-09-19T21:22:10+02:00  ERROR  blobscache  piecesTotal < 0  {piecesTotal: -21760}
2025-09-19T21:22:10+02:00  ERROR  blobscache  piecesContentSize < 0  {piecesContentSize: -21248}

After the migration finished, I stopped the node and changed the config lines to these:

pieces.enable-lazy-filewalker: false
storage2.piece-scan-on-startup: true
# pieces.file-stat-cache: badger

All the .sj1 files in the sub-folders of the blobs folder were deleted during the migration. I did not have to delete anything, and I still have all the folders there. Previously I used the badger cache with this node, but after the migration I changed it in the config file.

It is kind of a trial node for me, so it is on an external SSD. The start-up filewalker runs quite quickly; I have started and finished lines in the log for all sats after the node start.

So now I have a LOT of such error lines in the log, and they repeat exactly every hour.

How can I solve this error?

I wrote false into every .migrate_chore file. Does anything else need to be done?

I think that nothing is needed, not even this one.

Before I did that, I kept seeing spikes of several million metadata requests per second from ARC every 10 minutes. I think it was the migration chore waking up and enumerating 4000+ empty folders.

I have since deleted the folders, and the spikes disappeared; but I guess there is no need for it to ever wake up again to an empty folder either, so I put false in the .migrate_chore files.
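
For anyone repeating this, a small sketch of what I mean (node stopped; the exact location of the .migrate_chore files depends on your setup, so list them first and sanity-check the paths):

# find the per-satellite .migrate_chore files
find /mnt/storagenode/storage -name '.migrate_chore'
# then write "false" into each of them
find /mnt/storagenode/storage -name '.migrate_chore' -exec sh -c 'echo false > "$1"' _ {} \;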

I thought there would be some flag like “forget that piece store and migrations existed, just use hashstore exclusively”.

1 Like

Perhaps there will be, when piecestore is disabled by default.

Sounds about right. Then it makes sense to delete these subfolders and disable the active migration.

1 Like

For newbies, could a kind-hearted operator/dev give us the right commands for Linux and Windows to check whether the blobs folders are empty and how to remove them?

I use rmdir on Linux. rmdir only removes a folder if it is empty.
So go to the blobs folder and run

rmdir -v */*

The -v is for verbose output.
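
The same thing works without cd'ing into blobs first, assuming the usual blobs/<satellite>/<two-letter> layout; the second line also removes the (now empty) satellite folders if you want those gone too:

rmdir -v /mnt/storagenode/blobs/*/*
rmdir -v /mnt/storagenode/blobs/*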

Sure,

See the command in my post above:

find /mnt/storagenode/blobs -type f -exec stat -f '%z %N' {} \;

will list the files and execute the stat command on them, formatting the output to include only the size and name.

Ideally you want to see no output, but if there is any, check that all the sizes are zero.

You can also tell find to only show files with size greater than zero to begin with – we don’t care if 0-sized files exist, we’ll nuke them anyway:

find /mnt/storagenode/blobs -type f -size +0

In the same vein, if you only want to count directories, you can do:

find /mnt/storagenode/blobs -type d | wc -l

For zsh users: you can accomplish all of this without relying on an awkward find search, just by using the powerful glob syntax.

For example

ls -l /mnt/storagenode/blobs/**/*(.L+0)

or

stat -f '%z %N' /mnt/storagenode/blobs/**/*(.L+0)

This will expand into a recursive (**/*) match of all plain files (.) that are larger than zero bytes (L+0), i.e. at least 1 byte long.

Basically, read man zshexpn and get blown away.
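
The same qualifiers also take care of the cleanup itself: if I read man zshexpn right, (/od^F) picks directories (/), deepest first (od), that are not full (^F), i.e. empty, so a single command should clear out all the empty subfolders:

rmdir -v /mnt/storagenode/blobs/**/*(/od^F)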

To remove everything under the blobs folder, but not the folder itself, use this:

rm -rf /mnt/storagenode/blobs/*

Your shell will expand * into the subfolder names inside the blobs folder, and the rm command will delete them forcefully and recursively.

2 Likes

You can use the same command that @lyoth suggested in cmd.exe, or this one from PowerShell:

Get-ChildItem -Path "X:\storagenode\storage\blobs" -Recurse -Directory | Where-Object { (Get-ChildItem -LiteralPath $_.FullName).Count -eq 0 } | Remove-Item -WhatIf

It should print only the empty directories.
Then you can remove the -WhatIf option and execute the same command again to actually delete them.

1 Like

You should write user guides. :sweat_smile:
This is very comprehensive and easy to follow. :face_blowing_a_kiss:

1 Like

Well, to answer my own question…
I followed @arrogantrabbit’s suggestion and wrote false into every .migrate_chore file.
The errors are gone.

I cleaned 5 nodes. There were no files, only empty directories, 3076 per node. This is why you need clean shutdowns, aka a UPS… no damaged pieces, no zero-byte files, etc.
On the Synology with 8GB RAM, scanning the blobs folder took between 30-60 min per node. Why is it so slow? Both nodes were stopped.
Is this the effect of using only 8GB RAM for 3000 empty folders?