Unsure about audits and whether I'm doing something wrong

Hi guys,

Firstly, I’m sorry. I’m still getting my head around this, so apologies if I’m asking dumb questions…

So my setup is a server with 8x2TB drives in RAID5 giving around 12TB of usable space, running Win10 pro on an i7 with 16GB RAM. My OS is on another drive so all that space will be used for Storj.

I have been running my node for 2.5 months, and since my PC is also acting as storage for my personal stuff (on a separate drive again), it’s on 24/7 and that’s fine. I’ve had some downtime with my internet and a hardware configuration change, so I’ve probably missed the 99.4% uptime goal, but such is life.

I ran the script posted to check audits, and got the below results:

========== AUDIT =============
Successful: 2588
Recoverable failed: 1448
Unrecoverable failed: 1
Success Min: 64.1070101560565%
Success Max: 99.9613750482812%
========== DOWNLOAD ==========
Successful: 11355
Failed: 9087
Success Rate: 55.5474024068095
========== UPLOAD ============
Successful: 39902
Rejected: 29476
Failed: 151588
Acceptance Rate: 57.5139093084263
Success Rate: 20.8376416523056
========== REPAIR DOWNLOAD ===
Successful: 0
Failed: 0
Success Rate: 0
========== REPAIR UPLOAD =====
Successful: 0
Failed: 0
Success Rate: 0

So what’s a recoverable failed audit? One I failed for whatever reason, but then was able to complete when I was back online?

I’m not sure if these are good or bad results. I have looked at posts from others who did have issues, and my numbers are better than theirs were. But can I get your opinion on these, and on whether I should be looking to improve them?

I think my physical location might be the contributing factor. I live in Australia, and not near a major city. I have 100/40 internet, but I suspect that since I’m not close to the source of the data, I’m probably missing out purely because I can’t reply fast enough (see? Told you I was getting my head around it).

So what are your thoughts? Doing OK, considering my downtime this month and last month was over 10 hours each? Considering my location? Or is something in the way I’ve set myself up possibly hurting my results?

I greatly appreciate all the help I’ve received so far, and to the devs: keep up the good work and thanks for your patience!

When the node gets an audit request, it has limited time to respond. If the node (or the satellite) is overloaded, the request times out. The satellite puts your node in containment and will retry the same request later. This is a recoverable failure, since you may succeed some other time.
Unrecoverable failure is when the node cannot find the data on the drive.
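
The Success Min and Success Max lines in the report above seem to follow from this distinction: the minimum counts every recoverable failure as a real failure, while the maximum counts only the unrecoverable one. Here is a minimal sketch of that arithmetic. The formulas are inferred from the posted numbers, not taken from the script itself, and the function name is made up for illustration.

```python
def audit_success_bounds(successful, recoverable_failed, unrecoverable_failed):
    """Return (worst_case_pct, best_case_pct) for audit success.

    Assumption (inferred from the posted report, not from the script source):
    - worst case treats every recoverable failure as a final failure
    - best case treats every recoverable failure as eventually succeeding,
      so only unrecoverable failures count against the node
    """
    all_audits = successful + recoverable_failed + unrecoverable_failed
    settled_audits = successful + unrecoverable_failed
    worst = 100.0 * successful / all_audits if all_audits else 0.0
    best = 100.0 * successful / settled_audits if settled_audits else 0.0
    return worst, best


# Counts copied from the AUDIT section of the report.
low, high = audit_success_bounds(2588, 1448, 1)
print(f"Success Min: {low:.2f}%")
print(f"Success Max: {high:.2f}%")
```

With the posted counts this gives roughly 64.11% and 99.96%, which matches the report.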

It’s pretty hard to judge these numbers, since they differ a lot per setup and indeed per location. What worries me a little is your rate of recoverable audit failures. You’re timing out on roughly 36% of audits. Audits are retried 3 times. The chance of failing all 4 attempts with your numbers is about 1.6%.
EDIT: @littleskunk pointed out below that this has been corrected and only 3 total attempts are made. That gives you a 4.6% chance of failing all 3.
You really want to prevent that. Additionally your success rates on uploads and downloads are fairly low as well. Though that is a little less worrying.
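
For anyone who wants to redo that estimate with their own numbers, here is a rough sketch. It assumes each attempt times out independently with the same probability, which is a simplification (an overloaded node will likely fail in bursts), and the function name is just illustrative.

```python
def chance_all_attempts_fail(timeout_rate, attempts):
    """Probability that every attempt times out.

    Assumes attempts are independent and share the same timeout rate,
    which is a simplification of how a loaded node really behaves.
    """
    return timeout_rate ** attempts


# Counts copied from the AUDIT section of the report.
successful, recoverable, unrecoverable = 2588, 1448, 1
timeout_rate = recoverable / (successful + recoverable + unrecoverable)

p3 = chance_all_attempts_fail(timeout_rate, 3)  # current behaviour: 3 attempts
p4 = chance_all_attempts_fail(timeout_rate, 4)  # older behaviour: 4 attempts

print(f"Timeout rate: {timeout_rate:.1%}")
print(f"Fail all 3 attempts: {p3:.2%}")
print(f"Fail all 4 attempts: {p4:.2%}")
```

With the posted counts the timeout rate is just under 36%, which works out to roughly 4.6% for three attempts and a little over 1.6% for four.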

It looks to me like in addition to location, you may have an IO bottleneck. You might benefit from lowering the concurrency setting. Have you changed this setting already? If so, what did you set it to?

If you’re using the default settings, I suggest lowering it to 4 and seeing what happens. If you raised it before, try cutting it in half.

More info on how to tune this setting can be found here.

In total you will get 3 attempts. The 4th attempt was corrected in a previous version.

A stacktrace could tell us more about the reasons these downloads are taking so long. Do we have an FAQ for that as well?
curl "http://localhost:7777/mon/trace/svg?regex=Download" > download.svg

Thanks for that correction.

Not sure about an FAQ. But for that command to work, the debug port needs to be enabled and forwarded. I posted some instructions for that here.
