[Tech Preview] Hashstore backend for storage nodes

Did you all stop the node before making any modifications?

Always

I had one case where the migrate_chore file was reset. I guess this was because I did the edit before stopping and restarting the node.

Could this be because your nodes updated to a version that defaults to storing new data in the hashstore, and the default is not (yet) to do an active migration?

I haven’t checked my converted nodes for status yet. But it would be a little annoying if we have an edge case where already-migrated nodes revert because of the general migration settings rolling out…?

@Alexey

Just checked - none of my nodes changed settings autonomously

No, I applied the changes in the ‘meta’ folder and then restarted.

That’s what I have been doing too and it seems that on some nodes this is not enough.
The node on which I followed suggestion from @Alexey has not yet reverted again. So I keep my fingers crossed.
Meanwhile, another node where I changed the file without a prior stop and remove seems to keep writing false into that file. It even looks like some kind of periodic write, regardless of whether the file already contains false: I see a modification from about an hour ago, but I know it was already set to false before that.

I will continue posting here, so as not to duplicate the conversation on the announcement thread.
A fellow SNO, who stumbled upon the poor hashstore performance as well, contacted me and suggested the following; maybe the devs could take it into account and make some recommendations:

The ext4 4K block size is a problem.
I went to a 64K block size with the XFS filesystem
and it was a night-and-day difference for a disk holding only Storj data.
I did an average over 24 nodes for file size distribution:
Average filesize distribution for 24 Storj nodes 2025-08-20
1k: 74
2k: 37
4k: 30
8k: 40
16k: 24
32k: 10
64k: 52
128k: 69
256k: 52
512k: 80
1M: 73
2M: 83
4M: 100
8M: 38
16M: 15
32M: 5
64M: 6
128M: 7
256M: 10
512M: 10
1G: 4372
2G: 1

I want to mention that my hashstore nodes have a 512 block size, the Seagate default. I have 4K nodes as well, but I didn’t test the hashstore on them.
With 1GB log files it makes sense to use a 64K block size, but who knew that 5 years ago?
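
If anyone wants to reproduce a similar distribution on their own node, here is a rough sketch of how it could be done (assuming GNU find and awk are available and that the node data lives under /mnt/storj/storage - adjust the path to yours; the numbers above are additionally averaged over 24 nodes, and this will take a while on a large piecestore node):

# count files per power-of-two size bucket, roughly matching the table above
find /mnt/storj/storage -type f -printf '%s\n' | awk '
BEGIN { n = split("1k 2k 4k 8k 16k 32k 64k 128k 256k 512k 1M 2M 4M 8M 16M 32M 64M 128M 256M 512M 1G 2G", label, " ") }
{
  b = 1024; i = 1
  while ($1 > b && i < n) { b *= 2; i++ }
  count[i]++
}
END { for (i = 1; i <= n; i++) if (i in count) printf "%5s: %d\n", label[i], count[i] }'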

I looked at the success rate on both nodes running hashstore, and I found lines about piecestore; I don’t know if they are the same for hashstore. I will send the full stats to Alexey to take a look.
To obtain the success rate, I searched for these lines:

upload_started_count
upload_success_count
download_started_count,action=GET,
download_success_count,action=GET,

I got:

node 1: upload success rate=94.89%, download success rate=99.08%
node 2: upload success rate=94.19%, download success rate=98.39%

If these are true, they are no different from the piecestore nodes.
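
For anyone who wants to do the same calculation, a minimal sketch (assuming the relevant metric lines have been saved into a file named metrics.txt and that each matching line ends with the counter value - both of which you should verify against your own output):

# sum the started/success counters and print the two ratios
awk '
/upload_started_count/               { us += $NF }
/upload_success_count/               { uc += $NF }
/download_started_count,action=GET,/ { ds += $NF }
/download_success_count,action=GET,/ { dc += $NF }
END {
  if (us > 0) printf "upload success rate=%.2f%%\n", 100 * uc / us
  if (ds > 0) printf "download success rate=%.2f%%\n", 100 * dc / ds
}' metrics.txt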

Another thing that I can think of would be these:

sysctl -w net.core.rmem_max=7500000
sysctl -w net.core.wmem_max=7500000

My system is a Synology DS216+ with 8GB RAM and 2 Ironwolf 8TB drives, ext4 with noatime, no RAID, 4TB occupied, 7TB allocated. Maybe these buffers are too high for this old system?
These 2 nodes stored 140GB and 160GB in July, on the same IP.
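
For what it’s worth, you can check the values currently in effect before changing anything, and sysctl -w does not survive a reboot. A sketch of the usual Linux way (I have not verified how DSM persists sysctl settings, so treat the /etc/sysctl.conf part as an assumption):

# show the values currently in effect
sysctl net.core.rmem_max net.core.wmem_max

# make the change persistent across reboots (standard Linux; DSM may handle this differently)
echo 'net.core.rmem_max=7500000' >> /etc/sysctl.conf
echo 'net.core.wmem_max=7500000' >> /etc/sysctl.conf
sysctl -p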

Formatting in 512 hasn’t made sense for more than 10 years. That sector size is only needed for very old systems; in all other cases it only brings performance degradation due to read-modify-write on 4K physical sectors.
512 couldn’t be a Seagate default in any way. Any up-to-date OS will format these disks in 4K.

Even the newest Exos drives come in 512e format. If you are not aware of this technicality, you could end up using the drive as is. Syno doesn’t format it in 4K.

It may have been an improperly aligned partition (i.e., one that doesn’t start on a 4K boundary). The 64K XFS comment is completely irrelevant.

Most disks (all, unless you expressly bought one that isn’t) have had 4K sectors on the platter for the past 10 years, and emulate 512B sectors to present to the operating system. That means that when the OS wants to change data in one 512B sector, the OS thinks it’s writing just one 512B sector, while in reality a whole 4K sector is being rewritten.

If a partition isn’t aligned to 4K (i.e., it starts on, say, the 3rd 512B sector), then whenever the OS changes two consecutive 512B sectors that straddle a physical sector boundary, two 4K sectors have to be rewritten.
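
If someone wants to check this on their own box, a quick sketch (assuming a Linux shell and that the drive is /dev/sda with the data on partition 1 - adjust to your setup):

# sector sizes reported by the drive; a 512e disk shows 512 logical / 4096 physical
cat /sys/block/sda/queue/logical_block_size
cat /sys/block/sda/queue/physical_block_size

# start sector of the partition; it is 4K-aligned if this is a multiple of 8 (in 512B sectors)
cat /sys/block/sda/sda1/start

# or let parted do the check
parted /dev/sda align-check optimal 1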

Syno has fallen even lower in my eyes :roll_eyes:

I can confirm this now. Every hour the file gets written/modified to false.
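
If you want to see exactly when it gets rewritten (and rule out your own tooling), a small sketch with inotify-tools - the path is a placeholder for wherever your migrate_chore file sits in the meta folder:

# print a timestamped line for every write to the file; runs until interrupted with Ctrl+C
inotifywait -m -e modify,close_write --timefmt '%F %T' --format '%T %w %e' \
  /path/to/meta/migrate_chore

inotify won’t tell you which process did the write; for that something like fatrace would be needed.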

As with most software when you try to change the config while it’s running. I migrated about 30 nodes months ago without such problems.

No, I think it’s different:

  1. Different nodes behave differently: I see nodes that do the reset and others that don’t, even though both had their settings changed the same way.
  2. I did change the setting and restarted (= container stopped & removed; roughly the procedure sketched below), so the setting in the file was true when it started up again. So there is actually no reason for it to behave differently.
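
For clarity, the procedure I mean is roughly this (a sketch for a docker node named storagenode; substitute your own container name and run parameters):

docker stop -t 300 storagenode
docker rm storagenode
# edit the file in the meta folder here, while nothing is running
# then start the node again with your usual docker run command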

I can guess, but it seems something has changed since it was opt-in.

Interesting, as this would mean your ext4 setup is just not handling extent allocations well. Could you try using the delalloc (ext4’s delayed allocation feature) flag?

With the power-of-two-choices algorithm on the satellite side, even tiny differences would result in a significant difference in the amount of traffic directed to your node. Can you compare the raw upload_started_count metrics too?

I don’t know how to use that, or what delalloc is.
Compare that with a piecestore node? No point. The stats reset with each restart, I believe, or with a container recreation. I would have to restart both nodes at the same time and let them run for maybe a week…
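
For reference, delalloc is an ext4 mount option (delayed allocation) and it is normally enabled by default unless the volume was mounted with nodelalloc. A sketch for checking and, if needed, re-enabling it (assuming the data volume is mounted at /volume1 - adjust the mount point):

# show the mount options currently in effect for the volume
grep ' /volume1 ' /proc/mounts

# if nodelalloc shows up in the options, switch delayed allocation back on
mount -o remount,delalloc /volume1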

No more hourly resets after changing the value once the container had been stopped and removed.
