Script: Calculate Success Rates for Audit, Download, Upload, Repair

After 11h30m uptime:

Sun Jul 14 03:04:34 EDT 2019
========== AUDIT ============= 
Successful:           417 
Recoverable failed:   0 
Unrecoverable failed: 0 
Success Rate Min:     100.000%
Success Rate Max:     100.000%
========== DOWNLOAD ========== 
Successful:           36063 
Failed:               188 
Success Rate:         99.481%
========== UPLOAD ============ 
Successful:           60919 
Rejected:             2792 
Failed:               1340 
Acceptance Rate:      95.618%
Success Rate:         97.848%
========== REPAIR DOWNLOAD === 
Successful:           148 
Failed:               273 
Success Rate:         35.154%
========== REPAIR UPLOAD ===== 
Successful:           671 
Failed:               248 
Success Rate:         73.014%
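
For anyone wondering how the percentages relate to the counts: the figures in the report above are consistent with successful / (successful + failed), and the upload acceptance rate with successful / (successful + rejected). The sketch below only reproduces that arithmetic. The formulas are inferred from the posted numbers, not taken from the script itself, and `rate` is a made-up helper name.

```shell
#!/bin/sh
# Sketch only: formulas inferred from the posted report, not copied from the script.

# rate OK OTHER -> prints OK / (OK + OTHER) as a percentage with three decimals.
rate() {
    awk -v ok="$1" -v other="$2" 'BEGIN {
        total = ok + other
        if (total == 0) { print "n/a"; exit }   # avoid dividing by zero
        printf "%.3f%%\n", 100 * ok / total
    }'
}

# Counts taken from the 11h30m report above.
echo "Download success:  $(rate 36063 188)"
echo "Upload acceptance: $(rate 60919 2792)"
echo "Upload success:    $(rate 60919 1340)"
echo "Repair download:   $(rate 148 273)"
echo "Repair upload:     $(rate 671 248)"
```

Running it on those counts should give the same percentages as the report, which is a quick way to sanity-check a pasted result.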

After 15h23m

========== AUDIT =============
Successful:           1175
Recoverable failed:   1
Unrecoverable failed: 0
Success Rate Min:     99.915%
Success Rate Max:     100.000%
========== DOWNLOAD ==========
Successful:           41331
Failed:               36
Success Rate:         99.913%
========== UPLOAD ============
Successful:           68434
Rejected:             0
Failed:               1199
Acceptance Rate:      100.000%
Success Rate:         98.278%
========== REPAIR DOWNLOAD ===
Successful:           265
Failed:               762
Success Rate:         25.803%
========== REPAIR UPLOAD =====
Successful:           710
Failed:               344
Success Rate:         67.362%

Nice work on the revision @BrightSilence!

I think I’ll fork yours and do a few things going forward.

20 hours
========== AUDIT =============
Successful: 1234
Recoverable failed: 1
Unrecoverable failed: 0
Success Rate Min: 99.919%
Success Rate Max: 100.000%
========== DOWNLOAD ==========
Successful: 46480
Failed: 551
Success Rate: 98.828%
========== UPLOAD ============
Successful: 78323
Rejected: 124
Failed: 1853
Acceptance Rate: 99.842%
Success Rate: 97.689%
========== REPAIR DOWNLOAD ===
Successful: 437
Failed: 1068
Success Rate: 29.036%
========== REPAIR UPLOAD =====
Successful: 747
Failed: 301
Success Rate: 71.279%

Can I use this script on Windows somehow?

Unfortunately the script is Linux-only for now.

OK, thanks! I’ll wait for the Windows edition :slight_smile:

I’ve sent a pull request with a minor modification: the log file can now be passed as the first argument of the script, so that the script can be run through a docker container.

The drawback of running it through a docker container is that you have to log to a file.
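
A rough idea of what "log file as the first argument" can look like in a shell script is sketched below. This is not the code from the pull request: `log_command` is a hypothetical helper, and the fallback assumes the container is named storagenode, as elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper, not the merged code: prints the command that would supply the log.
log_command() {
    if [ -n "${1:-}" ]; then
        # A log file path was given as the first argument.
        printf 'cat %s\n' "$1"
    else
        # No argument: fall back to the container's log
        # (container name "storagenode" is an assumption).
        printf 'docker logs storagenode\n'
    fi
}

LOG=$(log_command "${1:-}")
echo "Log source: $LOG"
```

The rest of a script could then run `$LOG 2>&1` and pipe the result into its filters, so the same parsing works for both sources.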

That’s awesome, thanks for the contribution! I’ve merged the PR.

I suggest that, like the Zabbix Storj script, the time since the last run should default to 24 hours but can be overridden, like:

LASTRUN="24h"
# The LOG line can be changed to use cat for SNOs who write their log to a file.
LOG="docker logs storagenode --since $LASTRUN"

Taken from these few lines in the Zabbix script:

if [ -f /tmp/zabbix-storagenode-stats ]
then
	LastRun=$(stat -c %y /tmp/zabbix-storagenode-stats | awk '{print $1"T"$2}')
else
	LastRun="1m"
fi

OLDIFS=$IFS

IFS=$'\n' StorjLogs=$(docker logs --since $LastRun storagenode 2>&1)

The PowerShell version of that script: https://github.com/AlexeyALeonov/success_rate
The container name is hardcoded, and there is no option to parse the log from a file.

Nice work! I updated the first post with our different versions

Is it only for Linux, or can it somehow run on Windows as well?

I tried this in PowerShell but got an error: running scripts is disabled on this system.

OK, this will bypass the restriction:
powershell -ExecutionPolicy ByPass -File successrate.ps1 -Path x:\storagenode\node.log

But it is working very hard: the log is only 400 MB, yet the process takes 30% CPU and 4 GB of RAM and runs slowly.

What period of time is contained in that log file? If it’s many months it can take a while. It’s churning through that log to parse out all uploads and downloads.

It contains about 25 days, starting from 29.12.19, and the log file is 400 MB.

@Alexey, about your script:

$auditsFailed = ($log | Select-String GET_AUDIT | Select-String failed | Select-String open -NotMatch).Count
"Recoverable failed:`t" + $auditsFailed

$auditsFailedCritical = ($log | Select-String GET_AUDIT | Select-String failed | Select-String open).Count
"Unrecoverable failed:`t" + $auditsFailedCritical

Isn’t a failed audit that matches open -NotMatch a critical audit error? That is, isn’t that the case where the hash doesn’t match?

The script counts audit download errors that do not contain the word “open”. These are usually context canceled and similar. This is not a critical error; however, if the satellite fails to get this information 3 more times, the audit will be considered failed.
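
For Linux users, the same split can be sketched with grep. This mirrors the filter chain of the PowerShell lines above and is not taken from the Linux script. `count_audit_failures` is a made-up name, and the sample lines are simplified stand-ins, not real storagenode output. `grep -i` is used because Select-String matches case-insensitively by default.

```shell
#!/bin/sh
# Shell sketch of the same filter logic as the PowerShell lines above.
# count_audit_failures FILE -> prints "<recoverable> <unrecoverable>"
count_audit_failures() {
    # Failed audit downloads WITHOUT "open": counted as recoverable
    # (e.g. context canceled).
    recoverable=$(grep -i GET_AUDIT "$1" | grep -i failed | grep -i -v -c open || true)
    # Failed audit downloads containing "open": counted as unrecoverable.
    unrecoverable=$(grep -i GET_AUDIT "$1" | grep -i failed | grep -i -c open || true)
    echo "$recoverable $unrecoverable"
}

# Made-up, simplified sample lines (NOT real log output), just to exercise the filters.
sample=$(mktemp)
cat > "$sample" <<'EOF'
download failed GET_AUDIT context canceled
download failed GET_AUDIT open piece: no such file
downloaded GET_AUDIT
download failed GET context canceled
EOF
count_audit_failures "$sample"
rm -f "$sample"
```

Only lines that mention both GET_AUDIT and failed are counted at all; the word “open” then decides which of the two buckets a line falls into.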

Powershell is not going so great atm :frowning:

Is there a way to split the log files? My storagenode.log is around 1.3 GB and it’s only two months old. It takes forever to process that data.

Thinking about pushing the log to ELK

The PowerShell script greps the whole log file. You can tail the log file so that it has less to process.

Get the last 1000 lines of the logfile:
$logfile = Get-Content storagenode.log -last 1000
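
On Linux the same idea can be sketched with tail: write the newest lines to a smaller file and parse that instead. `trim_log` is a made-up helper, and the demo runs on a generated file rather than a real storagenode.log.

```shell
#!/bin/sh
# Sketch: keep only the newest N lines of a log so there is less to parse.
# trim_log SRC DEST [N]; N defaults to 1000, matching the PowerShell example above.
trim_log() {
    tail -n "${3:-1000}" "$1" > "$2"
}

# Demo on a generated 5000-line file (a stand-in for a real log file).
demo=$(mktemp)
recent=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 5000; i++) print i }' > "$demo"
trim_log "$demo" "$recent"
wc -l < "$recent"     # number of lines kept
head -n 1 "$recent"   # oldest line that survived the trim
rm -f "$demo" "$recent"
```

The trimmed file can then be handed to whichever script version accepts a log file path. Keep in mind that the last N lines cover a variable amount of time, so rates from a trimmed log describe recent activity only.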
