Poor Success rate on hashstore

Maybe @Th3Van could switch some of his nodes to hashstore and report back?

And still… somehow both nodes had more traffic and a higher completed count in total (which could just be variation in network traffic, of course).

Wed Feb 19 21:29:23 CEST 2025 : All 132 nodes has successfully migrated to hashstore in 525.34 hours (649.316.427.445.212 Bytes in 3.530.760.132 pieces)

Yeah, it’s kind of a bug, which should be solved. All four nodes are running on SSDs that are basically idling, and yet there is still a 2% to 3% cancel rate.

Ah, great, thanks. I would have assumed he would wait for the official rollout, so I didn’t check.
So will we now know which store performs better on his nodes?

I just had a look at one of my recently converted nodes (~8TB stored):

PIECESTORE                         HASHSTORE
========== AUDIT ==============    ========== AUDIT ==============
Critically failed:     0           Critically failed:     0
Critical Fail Rate:    0.000%      Critical Fail Rate:    0.000%
Recoverable failed:    0           Recoverable failed:    0
Recoverable Fail Rate: 0.000%      Recoverable Fail Rate: 0.000%
Successful:            13397       Successful:            26182
Success Rate:          100.000%    Success Rate:          100.000%
========== DOWNLOAD ===========    ========== DOWNLOAD ===========
Failed:                576         Failed:                4
Fail Rate:             0.097%      Fail Rate:             0.001%
Canceled:              3039        Canceled:              2419
Cancel Rate:           0.510%      Cancel Rate:           0.391%
Successful:            591809      Successful:            617004
Success Rate:          99.393%     Success Rate:          99.609%
========== UPLOAD =============    ========== UPLOAD =============
Rejected:              0           Rejected:              0
Acceptance Rate:       100.000%    Acceptance Rate:       100.000%
---------- accepted -----------    ---------- accepted -----------
Failed:                82          Failed:                0
Fail Rate:             0.074%      Fail Rate:             0.000%
Canceled:              1461        Canceled:              1576
Cancel Rate:           1.323%      Cancel Rate:           1.072%
Successful:            108921      Successful:            145477
Success Rate:          98.603%     Success Rate:          98.928%
========== REPAIR DOWNLOAD ====    ========== REPAIR DOWNLOAD ====
Failed:                0           Failed:                2
Fail Rate:             0.000%      Fail Rate:             0.003%
Canceled:              0           Canceled:              3
Cancel Rate:           0.000%      Cancel Rate:           0.004%
Successful:            57359       Successful:            76860
Success Rate:          100.000%    Success Rate:          99.994%
========== REPAIR UPLOAD ======    ========== REPAIR UPLOAD ======
Failed:                0           Failed:                0
Fail Rate:             0.000%      Fail Rate:             0.000%
Canceled:              69          Canceled:              74
Cancel Rate:           1.254%      Cancel Rate:           0.975%
Successful:            5434        Successful:            7516
Success Rate:          98.746%     Success Rate:          99.025%
========== DELETE =============    ========== DELETE =============
Failed:                0           Failed:                0
Fail Rate:             0.000%      Fail Rate:             0.000%
Successful:            0           Successful:            0
Success Rate:          0.000%      Success Rate:          0.000%

This is a host with a setup similar to @arrogantrabbit’s, running ZFS (2x raidz1 vdevs, each with 4 spinning disks, plus a dedicated SSD special mirror), with plenty of RAM and performance headroom.
One difference, however: it’s connected to fibre with a public IP, so there’s no VPN in the loop.

My conclusion is that, for my setup, there’s an improvement in success rates:

+0.2% DOWNLOAD
+0.3% UPLOAD

Perhaps we need to factor in the geographical location of these nodes as well? This host is located in the Nordic part of the EU.

Can anyone provide me with some documentation on how to use/run the success rate script?
Cheers in advance.

You must have your log level set to info.
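
For reference, here is a minimal sketch of how the script is usually run, assuming a Docker-based node whose container is named storagenode and the community successrate.sh script sitting in the current directory (both names are assumptions, adjust to your setup):

# dump the node's log to a file, then point the script at it
docker logs storagenode > node.log 2>&1
chmod +x successrate.sh
./successrate.sh node.log

# if the node logs to a file instead, run the script directly against that file
./successrate.sh /path/to/storagenode.log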

I have two nodes here on the same IP. Similar data sizes; one is using memtbl (but the old data is not migrated) and the other is piecestore.

Download success rate is slightly better on memtbl: 99.5% vs. 98.2%.
Upload rates are similar, 99.6% for both.

Another much smaller node, same IP, but entirely migrated to memtbl, has a download rate of 99.8% and an upload rate of 99.9%.

Another node is on a different IP; it has good RAM and CPU, but slow NFS network-mounted storage. It’s on memtbl, but most data is still on piecestore. Download is 97.5% and upload is 99.1%.

Then I have two potato nodes: only 1 core, only 1 GB of RAM, and slow NFS-mounted storage. They are using hashtbl for new storage, but most data is still on piecestore.

Potato 1 has a download success rate of 90.8% and an upload success rate of 91.5%.
Potato 2 has a download success rate of 98.5% and an upload success rate of 97.8%.

Potato 1 has less data migrated to hashstore, but I’m not sure of the reason for its higher cancel rate.

How did you activate memtable? I asked around, but haven’t received any concrete answer so far.

A ā€œmonthā€ turned out to be an exaggerations. I do have 20 days worth of logs though.
Here: Node in WA

root@storagenode-seven:~ # zsh -c 'for f in /var/log/storagenode.*.bz2(n); do echo $f: $({bzcat $f > /tmp/1.txt && ./successrate.sh /tmp/1.txt} | grep -A 5 accepted | grep "Cancel Rate"); done'
/var/log/storagenode.log.0.bz2: Cancel Rate: 2.529%
/var/log/storagenode.log.1.bz2: Cancel Rate: 2.593%
/var/log/storagenode.log.2.bz2: Cancel Rate: 1.916%
/var/log/storagenode.log.3.bz2: Cancel Rate: 2.186%
/var/log/storagenode.log.4.bz2: Cancel Rate: 3.060%
/var/log/storagenode.log.5.bz2: Cancel Rate: 1.345% <--- switch to hashstore
/var/log/storagenode.log.6.bz2: Cancel Rate: 0.033%
/var/log/storagenode.log.7.bz2: Cancel Rate: 0.035%
/var/log/storagenode.log.8.bz2: Cancel Rate: 0.069%
/var/log/storagenode.log.9.bz2: Cancel Rate: 0.085%
/var/log/storagenode.log.10.bz2: Cancel Rate: 0.178%
/var/log/storagenode.log.11.bz2: Cancel Rate: 0.124%
/var/log/storagenode.log.12.bz2: Cancel Rate: 0.110%
/var/log/storagenode.log.13.bz2: Cancel Rate: 0.160%
/var/log/storagenode.log.14.bz2: Cancel Rate: 0.133%
/var/log/storagenode.log.15.bz2: Cancel Rate: 0.073%
/var/log/storagenode.log.16.bz2: Cancel Rate: 0.034%
/var/log/storagenode.log.17.bz2: Cancel Rate: 0.060%
/var/log/storagenode.log.18.bz2: Cancel Rate: 0.095%
/var/log/storagenode.log.19.bz2: Cancel Rate: 0.023%

Node in CA:

storj-eight# zsh -c 'for f in /var/log/storagenode.*.bz2(n); do echo $f: $({bzcat $f > /tmp/1.txt && ./successrate.sh /tmp/1.txt} | grep -A 5 accepted | grep "Cancel Rate"); done'
/var/log/storagenode.log.0.bz2: Cancel Rate: 1.421%
/var/log/storagenode.log.1.bz2: Cancel Rate: 1.968%
/var/log/storagenode.log.2.bz2: Cancel Rate: 1.284%
/var/log/storagenode.log.3.bz2: Cancel Rate: 1.617%
/var/log/storagenode.log.4.bz2: Cancel Rate: 1.686%
/var/log/storagenode.log.5.bz2: Cancel Rate: 0.124% <-- switch to hashstore
/var/log/storagenode.log.6.bz2: Cancel Rate: 0.057%
/var/log/storagenode.log.7.bz2: Cancel Rate: 0.037%
/var/log/storagenode.log.8.bz2: Cancel Rate: 0.049%
/var/log/storagenode.log.9.bz2: Cancel Rate: 0.058%
/var/log/storagenode.log.10.bz2: Cancel Rate: 0.100%
/var/log/storagenode.log.11.bz2: Cancel Rate: 0.075%
/var/log/storagenode.log.12.bz2: Cancel Rate: 0.077%
/var/log/storagenode.log.13.bz2: Cancel Rate: 0.108%
/var/log/storagenode.log.14.bz2: Cancel Rate: 0.072%
/var/log/storagenode.log.15.bz2: Cancel Rate: 0.048%
/var/log/storagenode.log.16.bz2: Cancel Rate: 0.033%
/var/log/storagenode.log.17.bz2: Cancel Rate: 0.052%
/var/log/storagenode.log.18.bz2: Cancel Rate: 0.052%
/var/log/storagenode.log.19.bz2: Cancel Rate: 0.022%

So… I don’t think it’s a fluke. It’s happening on two different (albeit configured by the same dude) servers.

To add a broader perspective, I’ve done similar on my node (from previous post).

2025-09-10  Cancel Rate: 0.868%
2025-09-09  Cancel Rate: 1.163%
2025-09-08  Cancel Rate: 1.072%
2025-09-07  Cancel Rate: 4.122% <-- IGNORE    | FAILED HDD
2025-09-06  Cancel Rate: 1.905%               | RESILVER ON POOL
2025-09-05  Cancel Rate: 0.986%
2025-09-04  Cancel Rate: 1.396%
2025-09-03  Cancel Rate: 1.106%
2025-09-02  Cancel Rate: 1.033%
2025-09-01  Cancel Rate: 0.877%
2025-08-31  Cancel Rate: 1.242%
2025-08-30  Cancel Rate: 0.898% <-- HASHSTORE MIG COMPLETE
2025-08-29  Cancel Rate: 0.925%
2025-08-28  Cancel Rate: 1.235%
2025-08-27  Cancel Rate: 1.024%
2025-08-26  Cancel Rate: 1.135%
2025-08-25  Cancel Rate: 1.040%
2025-08-24  Cancel Rate: 1.052%
2025-08-23  Cancel Rate: 1.235%
2025-08-22  Cancel Rate: 1.391%
2025-08-21  Cancel Rate: 1.280%
2025-08-20  Cancel Rate: 1.306%
2025-08-19  Cancel Rate: 1.203%
2025-08-18  Cancel Rate: 1.067%
2025-08-17  Cancel Rate: 1.146%
2025-08-16  Cancel Rate: 1.323%
2025-08-15  Cancel Rate: 1.391%

With this more detailed historical outlook, it’s not so clear to me anymore.

Avg of rates before and after migration:

Piecestore  1.184%
Hashstore   1.141%

The difference is a mere 0.043 percentage points :face_with_raised_eyebrow:
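
For reference, the per-day numbers above can be produced with a rough sketch like this, assuming a single combined log file named node.log whose lines start with an ISO date, and the same successrate.sh script in the current directory (both assumptions, adjust to your setup):

# split the combined log by day and report the upload (accepted) cancel rate per day
for d in $(awk '{print substr($1,1,10)}' node.log | sort -u); do
  grep "^$d" node.log > /tmp/day.log
  echo "$d  $(./successrate.sh /tmp/day.log | grep -A 5 accepted | grep 'Cancel Rate')"
done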

Has anyone tried defragmentation after the migration? On NTFS it’s 99% fragmented.

My ZFS pool fragmentation is steady at ~27% before and after the migration.

Hmm… interesting. So your cancel rate was high to begin with and did not change, whereas mine was very small and the change is noticeable.

I just realized that since the storagenode does small appends to huge files, I shall reduce the ZFS recordsize from 128K to, say, 32K or 16K, to better align with the usage pattern. Maybe this will help? I shall try that.
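
A minimal sketch of that change, assuming the node’s data lives on a dataset named tank/storagenode (a hypothetical name); note that recordsize only affects newly created files, so existing files keep the record size they were written with:

# hypothetical dataset name; adjust to your pool layout
zfs set recordsize=32K tank/storagenode
zfs get recordsize tank/storagenode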

Truth be told, the storage node isn’t in a position to know if an upload is successful or not. The peer that knows if an upload to a storagenode was successful is actually the Satellite (and the Uplink temporarily by proxy, but it has no consistent memory). A storagenode may think an upload to it was successful, but only the Satellite keeps track of which nodes were actually part of the fastest set. Even if a node never gets a cancelation and by all appearances the upload looks successful, unless the Satellite agrees, that data is considered garbage. The uplink attempts to alert the storage node if it was unsuccessful in a variety of ways, and whenever the Uplink can tell the storage node it lost the race, that is good, because the storage node then can preemptively clean up that data instead of leaving it around and waiting for garbage collection.

So! We have a possible hypothesis (though we still need to collect data to figure out if it’s right). Hashstore has a much smaller critical section that waits for disk activity than piecestore does. Perhaps hashstore is better at receiving cancelation requests than piecestore is? Perhaps piecestore has an unreasonably high success rate because a higher percentage of uploads are false successes? In this scenario, a lower success rate may actually simply mean a more accurate success rate.

So here is what we need to gather and check (and perhaps forum readers can help):

  • What is the percent of unsuccessful uploads we actually expect in practice due to long tail cancelation? In theory, this is as high as 30% (!), because uploads to the network upload 70 pieces per segment and only wait for the fastest 49. So, 21/70 pieces are theoretically canceled, but do we sometimes do better than 49? How often?
  • How often is very-recently-uploaded data immediately eligible for garbage collection? For hashstore? For piecestore?
  • What percent of pieces are considered successful across all nodes? Hashstore nodes, piecestore nodes? It should match the Satellite, right? If it’s higher than what the Satellite tracks as successful, then we have “false” successes.

Basically, a network-wide average success rate of 100% is actually bad. We always upload more than we need to be successful, and, theoretically, across all nodes (some nodes will be “successful” less often than others, i.e. win fewer races) we should be seeing about 70% piece success across all pieces (49/70). If the network as a whole is reporting average scores higher than 70%, then that means the network is gaining garbage at a faster rate than we expect. 30% of uploads should be considered failed by the nodes, so the nodes can clean that data up instead of letting it sit around unpaid until garbage collection.
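
Just to restate that arithmetic in one place (nothing new here, only the numbers from above):

\[
\text{expected piece success rate} \approx \frac{49}{70} = 70\%, \qquad
\text{expected long-tail cancel rate} \approx \frac{21}{70} = 30\%
\]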

Suffice it to say one thing we’ve been noticing on the hashstore-based Select network is less garbage. Perhaps this is why, and perhaps we shouldn’t be afraid of lower-than-piecestore success rates? But this is just a hypothesis, we’ll try to disconfirm it.

Edited to add: an individual storage node can certainly be more successful and try to target a 100% success rate. Any success rate above the network average means that it is more likely to win races than the average node. Any success rate below the network average means that it is less likely to win races. But the network average should very definitely not be 100%, or something has gone wrong and all the nodes are storing way more data than they are getting paid for (garbage). So I would expect the average node even here on the forum is losing races in double digit percents. This should be showing in both hashstore and piecestore. If hashstore says this and piecestore doesn’t, then I’m actually more suspicious of piecestore.

This makes a lot of sense – unless the satellite is very good with node selection and my nodes are significantly faster than everyone else’s (they aren’t; if anything, my cable internet has quite high latency compared to the fiber that literally everyone around me is able to get).

This hypothesis (that the piecestore cancellation rate as seen by the node is inaccurate) shall be quite easy to validate: can you please check on the satellite side what the actual success rate on this node was historically? Was it actually 99.9%, and/or are the currently reported values (97-98%) closer to actual, or is everything BS?
12nRLLozTqKdD5KRjLu23C8Xz6ZKxkzngpRhoKUZtfruDcCMpar

If historical data is not available, here is my other node that still runs piecestore, which also has an unrealistically low cancel rate:
1zWMZAUsyxpi1v9J5Me9ukcdKiaCDEpeftiTSdcrXJh7RvHEyf

========== AUDIT ============== 
Critically failed:     0 
Critical Fail Rate:    0.000%
Recoverable failed:    0 
Recoverable Fail Rate: 0.000%
Successful:            36427 
Success Rate:          100.000%
========== DOWNLOAD =========== 
Failed:                1736 
Fail Rate:             0.345%
Canceled:              2990 
Cancel Rate:           0.593%
Successful:            499111 
Success Rate:          99.062%
========== UPLOAD ============= 
Rejected:              0 
Acceptance Rate:       100.000%
---------- accepted ----------- 
Failed:                1 
Fail Rate:             0.001%
Canceled:              38 
Cancel Rate:           0.024%
Successful:            155155 
Success Rate:          99.975%
========== REPAIR DOWNLOAD ==== 
Failed:                0 
Fail Rate:             0.000%
Canceled:              2 
Cancel Rate:           0.003%
Successful:            71758 
Success Rate:          99.997%
========== REPAIR UPLOAD ====== 
Failed:                0 
Fail Rate:             0.000%
Canceled:              43 
Cancel Rate:           0.695%
Successful:            6147 
Success Rate:          99.305%
========== DELETE ============= 
Failed:                0 
Fail Rate:             0.000%
Successful:            0 
Success Rate:          0.000%

This does make a lot of sense. But why the stark difference between normal uploads and “repair uploads”? Do repairs have an even smaller critical section? Or are they better at receiving cancellation notifications?

Small appends, yes, but they’re not synced (unless you set STORJ_HASHSTORE_STORE_SYNC_WRITES, which is not the default), so the OS is free to wait until there’s an actually sizable chunk of data to write. With some luck, the OS might accumulate dozens of megabytes before doing actual I/O.
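
For completeness, a hedged sketch of enabling that setting on a Docker-based node (the container and image names and the boolean value format are assumptions; enabling it trades write throughput for durability of the most recent writes):

# add the env var to your existing docker run flags (mounts, ports, identity, etc. unchanged)
docker run -d --name storagenode \
  -e STORJ_HASHSTORE_STORE_SYNC_WRITES=true \
  storjlabs/storagenode:latest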
