Upload failed in docker logs and a lot less traffic

This is mine from today. This clearly has something to do with how many nodes are competing to get each file.
Unless there is an issue, but I don’t believe there is one, given how low the traffic is compared to how many nodes are running.

========== AUDIT =============
Successful:           2610
Recoverable failed:   0
Unrecoverable failed: 0
Success Rate Min:     100.000%
Success Rate Max:     100.000%
========== DOWNLOAD ==========
Successful:           11142
Failed:               2488
Success Rate:         81.746%
========== UPLOAD ============
Successful:           3488
Rejected:             0
Failed:               15704
Acceptance Rate:      100.000%
Success Rate:         18.174%
========== REPAIR DOWNLOAD ===
Successful:           7756
Failed:               3
Success Rate:         99.961%
========== REPAIR UPLOAD =====
Successful:           219
Failed:               2898
Success Rate:         7.026%

How can I get AUDIT statistics like this? What scripts do you use?

You can check out https://github.com/ReneSmeekes/storj_success_rate by BrightSilence. He wrote the script that produces these stats.
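
For anyone wondering how the percentages relate to the counts: they appear to be plain ratios. The sketch below is not taken from the script itself. The two formulas are inferred from the numbers posted in this thread, and the function names are made up for illustration.

```python
def success_rate(successful: int, failed: int) -> float:
    """Share of transfers that finished, in percent (0.0 when there were none)."""
    total = successful + failed
    return 100.0 * successful / total if total else 0.0


def acceptance_rate(successful: int, rejected: int) -> float:
    """Inferred formula: successful uploads vs. successful plus rejected ones."""
    total = successful + rejected
    return 100.0 * successful / total if total else 0.0


# Upload counts from the first report in this thread.
print(f"Success Rate:    {success_rate(3488, 15704):.3f}%")
print(f"Acceptance Rate: {acceptance_rate(3488, 0):.3f}%")
```

Returning 0.0 for an empty category matches the reports here that show 0 successful, 0 failed and a 0.000% rate.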


========== AUDIT =============
Successful: 1128
Recoverable failed: 0
Unrecoverable failed: 0
Success Rate Min: 100.000%
Success Rate Max: 100.000%
========== DOWNLOAD ==========
Successful: 10471
Failed: 298
Success Rate: 97.233%
========== UPLOAD ============
Successful: 1996
Rejected: 0
Failed: 17675
Acceptance Rate: 100.000%
Success Rate: 10.147%
========== REPAIR DOWNLOAD ===
Successful: 0
Failed: 0
Success Rate: 0.000%
========== REPAIR UPLOAD =====
Successful: 307
Failed: 3121
Success Rate: 8.956%

It’s bad…

I’m seeing much lower traffic too, but my stats are not that bad yet.

========== AUDIT =============
Successful:           623
Recoverable failed:   0
Unrecoverable failed: 0
Success Rate Min:     100.000%
Success Rate Max:     100.000%
========== DOWNLOAD ==========
Successful:           1129
Failed:               516
Success Rate:         68.632%
========== UPLOAD ============
Successful:           2174
Rejected:             0
Failed:               641
Acceptance Rate:      100.000%
Success Rate:         77.229%
========== REPAIR DOWNLOAD ===
Successful:           613
Failed:               0
Success Rate:         100.000%
========== REPAIR UPLOAD =====
Successful:           236
Failed:               28
Success Rate:         89.394%

Where is your node located?

I’m in the same boat as most who have posted, with ~55% download and ~34% upload… I’m in the US. It looks like most of my traffic is coming from the Europe satellite, and I suppose my node is just not quick enough.

The part that gets me, though, is that a lot of these failed uploads are failing within 1-2 seconds, which seems almost too quick for other nodes to have completed the upload before mine does.

Located in US as well

Could this be a different type of stress test where the uploader is intentionally cancelling the transfers early?

Mine has increased from a low of 4%. Still not the 90%+ I was seeing, but an improvement. I’m on a 1000/500 connection in Oceania.

========== AUDIT =============
Successful: 298
Recoverable failed: 0
Unrecoverable failed: 0
Success Rate Min: 100.000%
Success Rate Max: 100.000%
========== DOWNLOAD ==========
Successful: 322
Failed: 18
Success Rate: 94.706%
========== UPLOAD ============
Successful: 3238
Rejected: 0
Failed: 15709
Acceptance Rate: 100.000%
Success Rate: 17.090%
========== REPAIR DOWNLOAD ===
Successful: 0
Failed: 0
Success Rate: 0.000%
========== REPAIR UPLOAD =====
Successful: 775
Failed: 62

A little info that might be relevant. It used to be that out of 95 upload transfers only 80 were finished and 15 were cancelled, so the average success rate was roughly 84%. If you used to see success rates higher than that, you did better than average on the test traffic, which came mostly from one location. Please note that this means other nodes did worse than average back then.

Since then, 3 things changed:

  • Test traffic has stopped and we now see traffic from all around the world. Nodes that used to do better than average likely saw a drop because their advantage with the test source doesn’t necessarily apply with other customers.
  • Instead of 95, now 110 transfers are started, but still only 80 finish, so 30 are cancelled. This means the average success rate has dropped to 73%.
  • The lower amount of total traffic may have made previously bandwidth-constrained nodes more competitive. They would have lost more races when multiple transfers were running on their node at once, but with only a single transfer they are better able to keep up. This may cause high-bandwidth connections or faster hardware to see less of an advantage than they used to.
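
The two averages quoted above follow directly from the started and finished counts. A quick check, using only the 95, 110 and 80 figures from this post (the function name is just for illustration):

```python
def average_success_rate(started: int, needed: int) -> float:
    """Average success rate across all nodes when only `needed` of the
    `started` transfers are allowed to finish, in percent."""
    return 100.0 * needed / started


old = average_success_rate(started=95, needed=80)
new = average_success_rate(started=110, needed=80)
print(f"before: {old:.1f}%  now: {new:.1f}%")
```

So a node that lands exactly on the average would now report roughly 73% without anything being wrong with it.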

Unfortunately for me, I saw a big drop as well. I did exceptionally well with the test traffic, with success rates of about 97-99%. I have now dropped to 47%. I’m located in the Netherlands, so with test traffic from Germany I used to do pretty well. My guess is that we now see quite a few US-based customers testing, as the two users reporting higher results were both from the US and both scored above the new average.

Please also note that from the reports here it may seem like almost everyone saw a drop. This isn’t necessarily the case. People go to a forum to post about an issue, so drops in success rate are by definition overrepresented.
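
To illustrate how distance to the uploader could shift a node’s success rate, here is a toy Monte Carlo sketch of the race. Almost everything in it is an assumption for illustration: the normally distributed completion times, the 500 ms mean, the 100 ms spread and the `extra_delay_ms` parameter are made up, and the real uplink logic is certainly more involved. Only the 110 started and 80 kept figures come from the post above.

```python
import random


def simulate_success_rate(extra_delay_ms: float, started: int = 110,
                          needed: int = 80, trials: int = 2000,
                          seed: int = 1) -> float:
    """Fraction of races in which our node is among the first `needed` finishers."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        # Hypothetical completion times (ms) of the competing nodes, fastest first.
        others = sorted(rng.gauss(500, 100) for _ in range(started - 1))
        # Our node draws from the same distribution, shifted by its extra delay.
        mine = rng.gauss(500, 100) + extra_delay_ms
        # We are kept if we finish before the `needed`-th fastest competitor.
        if mine < others[needed - 1]:
            wins += 1
    return wins / trials


for delay in (-100, 0, 100):
    print(f"extra delay {delay:+4d} ms -> success rate {simulate_success_rate(delay):.0%}")
```

In this model a node with no advantage or disadvantage lands near the 73% average, and a modest shift in latency moves it a long way in either direction.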

Edit: To make this post more complete, I wanted to add that there was an additional change: the uplink now closes connections more aggressively once an upload is complete. This can cause uploads to show up as failed in the logs while the transfer actually completed. This can be seen in some cases because pieces from cancelled uploads are later deleted according to the logs, so the piece was definitely there. As a result, the impact may be a lot smaller than it seems. More info here.
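
If you want to look for this in your own logs, the idea is to find pieces that were logged as a failed upload and later logged as deleted. The sketch below only shows the cross-referencing step. The `(event, piece_id)` tuples, the event names and the piece IDs are hypothetical placeholders, not the real storagenode log format, so you would need to parse your own log lines into this shape first.

```python
def stored_despite_failure(events):
    """Return piece IDs that were logged as a failed upload and later deleted.

    `events` is an iterable of (event, piece_id) tuples in log order.
    """
    failed = set()
    stored_anyway = set()
    for event, piece_id in events:
        if event == "upload failed":
            failed.add(piece_id)
        elif event == "deleted" and piece_id in failed:
            # A delete after a "failed" upload suggests the piece was stored.
            stored_anyway.add(piece_id)
    return stored_anyway


# Hypothetical sample data for illustration only.
sample = [
    ("upload failed", "piece-a"),
    ("uploaded", "piece-b"),
    ("upload failed", "piece-c"),
    ("deleted", "piece-a"),
    ("deleted", "piece-b"),
]
print(stored_despite_failure(sample))
```

Only the failed uploads that also show a later delete are counted, so this gives a lower bound rather than the full number of uploads that actually completed.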

In my experience update 0.33.4 improved the success rates a little again, but not back to the old level. So keep this in mind. Your node is probably working just fine. It looks worse than it is.


Now this is a proper explanation regarding the HUGE drops in upload success rates. Thank you.


My satellite info says that most of the ingress traffic is on order from Europe-West… and most of the egress traffic is on order from Stefan…

After logging some of the connecting IPs for a while, most of the customer connections seem to be originating in Europe.

Ingress Europe-West: 40.65 GB
Ingress Stefan: 9.20 GB
Ingress US Central: 3.5 GB
Ingress Asia-East: 2.51 GB


Egress Stefan: 19.87 GB
Egress Europe-West: 2.34 GB
Egress US Central: 410 MB
Egress Asia-East: 350 MB
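
To put those numbers in proportion, here is a small sketch that turns the figures above into per-satellite shares. The MB values are converted assuming 1 GB = 1000 MB, which is an assumption about how the figures were reported.

```python
# Figures from the post above, all expressed in GB.
ingress_gb = {"Europe-West": 40.65, "Stefan": 9.20, "US Central": 3.5, "Asia-East": 2.51}
egress_gb = {"Stefan": 19.87, "Europe-West": 2.34, "US Central": 0.41, "Asia-East": 0.35}


def shares(traffic: dict) -> dict:
    """Each satellite's share of the total traffic, in percent."""
    total = sum(traffic.values())
    return {name: 100.0 * gb / total for name, gb in traffic.items()}


for label, traffic in (("Ingress", ingress_gb), ("Egress", egress_gb)):
    for name, pct in shares(traffic).items():
        print(f"{label} {name}: {pct:.1f}%")
```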


My current success rate stats are nearly identical percentage-wise to yesterday. And my current rate of earnings looks like about $0.01 per hour. It’s a bit of a letdown from last month… that was fun! At least I now know what the approximate max traffic looks like.


I’d say we’ve seen both extremes. Customers are only just getting started.

Thanks for the stats though, I hadn’t looked into that myself. I can’t really reconcile that with the success rates we’re seeing on different nodes around the world. But I think my post still holds: different sources mean different nodes will see better results.


In addition, I think a lot of nodes will retire via graceful exit, as their operators probably won’t be willing to “wait” for the network to grow, which in turn will increase traffic on the nodes that stay.

To counterbalance it, here are my stats from my biggest node (2.8TB out of 3.5TB filled, UK)
December:

========== AUDIT =============
Successful: 12720
Recoverable failed: 1
Unrecoverable failed: 0
Success Rate Min: 99.992%
Success Rate Max: 100.000%
========== DOWNLOAD ==========
Successful: 217749
Failed: 21904
Success Rate: 90.860%
========== UPLOAD ============
Successful: 858734
Rejected: 24761
Failed: 29605
Acceptance Rate: 97.197%
Success Rate: 96.667%
========== REPAIR DOWNLOAD ===
Successful: 14108
Failed: 2878
Success Rate: 83.057%
========== REPAIR UPLOAD =====
Successful: 3961
Failed: 38
Success Rate: 99.050%

January:

========== AUDIT =============
Successful: 12943
Recoverable failed: 0
Unrecoverable failed: 0
Success Rate Min: 100.000%
Success Rate Max: 100.000%
========== DOWNLOAD ==========
Successful: 191028
Failed: 168407
Success Rate: 53.147%
========== UPLOAD ============
Successful: 385385
Rejected: 0
Failed: 7406
Acceptance Rate: 100.000%
Success Rate: 98.114%
========== REPAIR DOWNLOAD ===
Successful: 35674
Failed: 6
Success Rate: 99.983%
========== REPAIR UPLOAD =====
Successful: 4790
Failed: 189
Success Rate: 96.204%

February:

========== AUDIT =============
Successful: 2173
Recoverable failed: 0
Unrecoverable failed: 0
Success Rate Min: 100.000%
Success Rate Max: 100.000%
========== DOWNLOAD ==========
Successful: 16191
Failed: 295
Success Rate: 98.211%
========== UPLOAD ============
Successful: 10485
Rejected: 0
Failed: 2558
Acceptance Rate: 100.000%
Success Rate: 80.388%
========== REPAIR DOWNLOAD ===
Successful: 7541
Failed: 1
Success Rate: 99.987%
========== REPAIR UPLOAD =====
Successful: 1274
Failed: 216
Success Rate: 85.503%

As you can see, my download success rate dropped quite a bit in January during the high-traffic testing. But I can’t complain: I still got ~3.5TB egress across all my nodes on my 80/20 MBit connection. Now with less traffic my success rate is back above average.

Like most others, my success rate for uploads did drop quite a bit in the last few days. However, I also noticed that my node’s reported ingress traffic has stayed about the same while the number of successful uploads decreased.

Ingress traffic is only tracked for successful uploads from what I can tell, so this means that I’m receiving fewer successful uploads now, but the average size of each uploaded piece has increased substantially. I’m actually receiving just about the same amount of data per second in successful uploads, even though the number of successful uploads has decreased.
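
The reasoning above boils down to dividing ingress by the number of successful uploads. A minimal sketch, with made-up numbers purely for illustration (no actual ingress figures were posted):

```python
def average_piece_size(ingress_bytes: float, successful_uploads: int) -> float:
    """Average bytes per successfully uploaded piece."""
    return ingress_bytes / successful_uploads


# Hypothetical example: same ingress, a quarter of the successful uploads.
before = average_piece_size(ingress_bytes=4_000_000_000, successful_uploads=8000)
after = average_piece_size(ingress_bytes=4_000_000_000, successful_uploads=2000)
print(f"before: {before / 1e6:.1f} MB per piece, after: {after / 1e6:.1f} MB per piece")
```

With ingress held constant, the average piece size scales inversely with the number of successful uploads.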

root@server030:/logs/storagenode# grep "^2020-02-" server030-v0.31.12-*.log > /tmp/2020-02.log
root@server030:/logs/storagenode# ./successrate.sh /tmp/2020-02.log
========== AUDIT ============= 
Successful:           1599 
Recoverable failed:   0 
Unrecoverable failed: 0 
Success Rate Min:     100.000%
Success Rate Max:     100.000%
========== DOWNLOAD ========== 
Successful:           19683 
Failed:               41 
Success Rate:         99.792%
========== UPLOAD ============ 
Successful:           4987 
Rejected:             0 
Failed:               128 
Acceptance Rate:      100.000%
Success Rate:         97.498%
========== REPAIR DOWNLOAD === 
Successful:           440 
Failed:               7650 
Success Rate:         5.439%
========== REPAIR UPLOAD ===== 
Successful:           1897 
Failed:               24 
Success Rate:         98.751%

Well, apparently there is more going on.

So we may be seeing higher failure rates than what’s happening in reality.
