Script: Calculate Success Rates for Audit, Download, Upload, Repair

Windows/PowerShell version of the script:

Linux/Bash version of the script:

Grafana dashboard monitor of the script:
[TIG stack]

Daily Email Version:

Historic Post:
I found it very difficult to spot trends with the current monitoring scripts, and I felt like learning some better bash, so I created this script to measure the ratios of successful transfers over time. I plan to have it run periodically and send the results to an InfluxDB instance for Grafana monitoring.

Please share your results and hardware!

It currently runs against the docker logs since last RUN.

Of course, this presents new challenges when measuring historically (all-time, since the last storagenode run, or the last 24 hours). I still have to think about how that might work, but at least for now, this can be used to see whether the rates are trending up or down within short periods of time.
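One possible way to limit the window, offered as a sketch rather than something the script does: `docker logs` accepts a `--since` option that takes a timestamp or a relative duration. The function name `audits_since` is an illustrative assumption, and this only helps while the logs still live in Docker (not when they are redirected to a file).

```shell
#!/usr/bin/env bash
# Count successful audits logged within a recent window.
# $1: relative duration understood by `docker logs --since` (for example 24h)
# $2: container name
audits_since() {
  docker logs --since "$1" "$2" 2>&1 | grep GET_AUDIT | grep -c downloaded
}
```

Usage would look like `audits_since 24h storagenode`, assuming the container is named storagenode.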

Example:

  1. Save to AuditRates.sh
  2. sudo chmod +x AuditRates.sh
  3. sudo ./AuditRates.sh

How do I install bc on Synology?

I don’t have access to a Synology, but bc could be substituted in the script. It only serves to do the math with decimal places, because bash supports integer arithmetic only (no decimal places). If another package you have can do that, I can modify the script. I’m usually in PowerShell, so I’m probably not following best practice.
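To illustrate the integer limitation, here is a minimal sketch that computes the same ratio with bash arithmetic and with awk, which is one tool commonly available where bc is not. The variable names are illustrative, and the counts are the audit numbers from an output posted later in this thread.

```shell
#!/usr/bin/env bash
ok=528
failed=4

# Bash arithmetic is integer-only: the fractional part is truncated.
int_rate=$(( 100 * ok / (ok + failed) ))

# awk does floating-point math and can format the decimals.
float_rate=$(awk -v ok="$ok" -v failed="$failed" \
  'BEGIN { printf "%.3f", 100 * ok / (ok + failed) }')

echo "integer: ${int_rate}%"
echo "float:   ${float_rate}%"
```

The integer version loses everything after the decimal point, which matters when the interesting changes are in the tenths of a percent.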

We could also look into going cross platform through a docker container if demand is high enough.

Could you add “upload rejected”? That was added with the last storage node release.


Made some tweaks.

  1. Shows more of the in-between numbers in the output.
  2. Moved the log command to a variable so it can be edited by SNOs who output their log to a file or use a different container name.
  3. Split the audit success rate into a min and a max based on recoverable vs. unrecoverable errors. The max really should always be 100%.
  4. Replaced bc with awk so it will work on more systems (including Synology).
  5. Added an acceptance rate for uploads based on rejected log lines.
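As a rough sketch of how these rates can be computed with awk: the formulas below are inferred from the sample outputs posted in this thread, not copied from the script, and the helper name `pct` and the variable names are illustrative assumptions. The counts are taken from one of those sample outputs.

```shell
#!/usr/bin/env bash
# pct NUMERATOR DENOMINATOR: print the ratio as a percentage with 3 decimals.
# No zero check here; the denominator is assumed to be non-zero.
pct() {
  awk -v n="$1" -v d="$2" 'BEGIN { printf "%.3f%%", 100 * n / d }'
}

# Audit: min counts every failure, max counts only unrecoverable failures.
audit_ok=528; audit_recoverable=4; audit_unrecoverable=0
audit_min=$(pct "$audit_ok" $(( audit_ok + audit_recoverable + audit_unrecoverable )))
audit_max=$(pct "$audit_ok" $(( audit_ok + audit_unrecoverable )))

# Upload: acceptance compares against rejected, success against failed.
up_ok=37163; up_rejected=2207; up_failed=465
acceptance=$(pct "$up_ok" $(( up_ok + up_rejected )))
success=$(pct "$up_ok" $(( up_ok + up_failed )))

echo "Audit min/max:     $audit_min / $audit_max"
echo "Upload acceptance: $acceptance"
echo "Upload success:    $success"
```

With these counts the sketch gives 99.248% and 100.000% for audits, and 94.394% and 98.764% for uploads, matching the figures in that posted output.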

Output looks like (Edit: made it more colorful):
[image: screenshot of the script output]

Rejected:             0
awk: cmd. line:1: (FILENAME=- FNR=1) fatal: division by zero attempted

The first version still works fine. I might be missing some necessary little program; I had to install bc to get the first version to work.

You’re not missing anything. It’s somehow trying to divide by 0. Can you show a bit more of your output so I can see the numbers it’s trying to work with?

Especially if one of the Successful lines is 0.

========== AUDIT =============            
"docker logs" requires exactly 1 argument.
See 'docker logs --help'.
                                       
Usage:  docker logs [OPTIONS] CONTAINER
                             
Fetch the logs of a container
Successful:           0                                               
"docker logs" requires exactly 1 argument.
See 'docker logs --help'.                                             
                            
Usage:  docker logs [OPTIONS] CONTAINER
                                          
Fetch the logs of a container
Recoverable failed:   0
"docker logs" requires exactly 1 argument.
See 'docker logs --help'.
                             
Usage:  docker logs [OPTIONS] CONTAINER
                                          
Fetch the logs of a container
Unrecoverable failed: 0
awk: cmd. line:1: (FILENAME=- FNR=1) fatal: division by zero attempted
Success Rate Min:     0.000%
awk: cmd. line:1: (FILENAME=- FNR=1) fatal: division by zero attempted
Success Rate Max:     0.000%
========== DOWNLOAD ==========                                        
"docker logs" requires exactly 1 argument.
See 'docker logs --help'.     
                                          
Usage:  docker logs [OPTIONS] CONTAINER

Fetch the logs of a container          
Successful:           0
"docker logs" requires exactly 1 argument.
See 'docker logs --help'.
                                          
Usage:  docker logs [OPTIONS] CONTAINER

Fetch the logs of a container          
Failed:               0
awk: cmd. line:1: (FILENAME=- FNR=1) fatal: division by zero attempted
Success Rate:         0.000%

It makes no sense that it didn’t work; looking at the script, everything looks OK.

I tried adding an echo of the line about to be executed on the first command.

docker logs storagenode 2>&1 | grep GET_AUDIT | grep downloaded -c
"docker logs" requires exactly 1 argument.
See 'docker logs --help'.

Well, at least I know what to look for. I was testing with cat since I have my log output to a file. Give me a bit. Btw, I’m also adding if statements to catch division by 0 and adding a splash of color. :)
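A guard along those lines might look like the sketch below. The function name `safe_pct` and the choice to print 0.000% when there is nothing to divide by are assumptions for illustration, not necessarily what the script does.

```shell
#!/usr/bin/env bash
# safe_pct NUMERATOR DENOMINATOR
# Prints 0.000% instead of failing when the denominator is zero.
safe_pct() {
  if [ "$2" -ge 1 ]; then
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%.3f%%\n", 100 * n / d }'
  else
    printf '0.000%%\n'
  fi
}

safe_pct 0 0      # no log lines matched, so awk is never called
safe_pct 528 532  # normal case
```

Checking the denominator in bash before calling awk avoids the "division by zero attempted" error entirely.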

Your values are empty (they default to 0), and Bright’s script does not include sudo for the docker command. Did you use sudo in the .sh? If so, you need to either add your account to the docker group or add sudo to Bright’s docker commands.

I’ve updated the script. Apparently bash doesn’t evaluate things like multiple commands or the > redirection within a variable. So I pulled it out of the LOG variable. While it is not necessary when using cat, it also doesn’t hurt. New version is up at the same link and should work now.
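A small sketch of that behavior: when a command stored in a string variable is expanded, bash only splits it into words, so `2>&1` and `|` arrive as literal arguments instead of being interpreted. Wrapping the command in a function is one alternative. Here `echo` is a stand-in for the real log command so the example runs anywhere, and the names `LOG` and `log_cmd` are illustrative.

```shell
#!/usr/bin/env bash
# Stand-in for the real log command, stored as a plain string.
LOG='echo GET_AUDIT downloaded 2>&1'

# Unquoted expansion only word-splits: "2>&1" is passed to echo as text.
from_variable=$($LOG)

# A function body is parsed as shell code, so the redirection takes effect.
log_cmd() {
  echo "GET_AUDIT downloaded" 2>&1
}
from_function=$(log_cmd)

echo "variable: $from_variable"
echo "function: $from_function"
```

The first line of output still contains the literal `2>&1`, while the second does not. With `docker logs`, that stray extra argument is what produces the "requires exactly 1 argument" error.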

@subwolf My docker logs statement gives an empty result since I redirected the logs, so please test and report back.

Use github so I can pull direct.

That works great. Would you be up for pulling all data into a log file and using that for comparison? Say go back a few days to make it more accurate.

========== AUDIT ============= 
Successful:           528 
Recoverable failed:   4 
Unrecoverable failed: 0 
Success Rate Min:     99.248%
Success Rate Max:     100.000%
========== DOWNLOAD ========== 
Successful:           14636 
Failed:               77 
Success Rate:         99.477%
========== UPLOAD ============ 
Successful:           37163 
Rejected:             2207 
Failed:               465 
Acceptance Rate:      94.394%
Success Rate:         98.764%
========== REPAIR DOWNLOAD === 
Successful:           312 
Failed:               343 
Success Rate:         47.634%
========== REPAIR UPLOAD ===== 
Successful:           768 
Failed:               490 
Success Rate:         61.049%

Dumb question, but how much does docker keep in its logs by default? Is there a rotation?

Docker retains logs for the current container without limits or rotation by default. So when you rm the container (which happens during update as well) you also remove the logs and start over.

This is why I output logs to a file. You can do that by adding the following lines to your config.yaml file and then restarting your node.

# output location for log
log.output: "/app/config/node.log"

What is the amount saved by default? Would prefer some kind of hours figure.

Github version here: https://github.com/ReneSmeekes/storj_success_rate
I also replaced the link in my previous post to this one. I’ll update here if I make changes from now on.

Bear in mind the instance had only been up 10 minutes after a reboot:

[sudo] password for subwolf: 
========== AUDIT ============= 
Successful:           52 
Recoverable failed:   0 
Unrecoverable failed: 0 
Success Rate Min:     100.000%
Success Rate Max:     100.000%
========== DOWNLOAD ========== 
Successful:           3938 
Failed:               12 
Success Rate:         99.696%
========== UPLOAD ============ 
Successful:           6761 
Rejected:             1553 
Failed:               126 
Acceptance Rate:      81.321%
Success Rate:         98.170%
========== REPAIR DOWNLOAD === 
Successful:           33 
Failed:               35 
Success Rate:         48.529%
========== REPAIR UPLOAD ===== 
Successful:           85 
Failed:               40 
Success Rate:         68.000%