Success rate script - Now updated for new delete terminology since v1.29.3

Hello guys,

My successrate.sh reports very small percentages for Upload Success and also for Repair Upload Success. What causes this?

Well, partially it’s because of an issue with completed transfers being logged as canceled. The rest is other nodes finishing the transfer faster than yours does. That’s usually related to latency, either in disk writes or in your connection. Use internal HDDs; don’t use network-connected storage. Internet latency is usually tied to location and there isn’t much you can do about that.

I can’t get the script to work at all. I set the PowerShell execution policy to Unrestricted and unblocked the file with Unblock-File, but when I start the script it shows the result you can see in the image below. I tried it on three different Windows 10 nodes, each with different Windows updates installed, and got the same result. Can you tell me what I am doing wrong?

Please clone the repo or download the raw file and save it as a script.

Thanks a lot, I resolved the problem. In the output of other scripts I see a "DELETE" section, but it is not visible in mine. Do I have an old script?

The Linux shell script is maintained by me, but the PowerShell script is maintained by Alexey, so they may not be completely in sync. Delete activity is not all that interesting, but I added it later because someone requested it. You’re not missing out on anything important.

@Mark pointed out a mistake with the failed delete detection. Thanks for reporting!

This has now been fixed. The new version can be downloaded from the link in the top post.

Hi guys!
And thanks for all the fish!* :wink:

Just wanted to suggest a minor tweak to the first post in this thread.
Could you edit it and add this line or something for newbies?

UPLOAD means the ingress, and DOWNLOAD means the egress traffic

Cheers! And again, thanks for the hard work you put into those scripts. :smiley:

*The Hitchhiker’s Guide to the Galaxy (about the fish joke)

Haha, I got the reference. So much so that I thought you were saying goodbye.

Anyway, I added a line about terminology at the bottom.

it’s kinda scary how fast one gets used to turning it on its head tho… i don’t even think about it any more, it’s just how it is lol. but yeah, the logs are kinda confusing to look at in the beginning.

i should run successrate.sh, it’s been too long lol

========== AUDIT ==============
Critically failed:     0
Critical Fail Rate:    0.000%
Recoverable failed:    0
Recoverable Fail Rate: 0.000%
Successful:            2117
Success Rate:          100.000%
========== DOWNLOAD ===========
Failed:                0
Fail Rate:             0.000%
Canceled:              556
Cancel Rate:           2.076%
Successful:            26220
Success Rate:          97.924%
========== UPLOAD =============
Rejected:              0
Acceptance Rate:       100.000%
---------- accepted -----------
Failed:                2
Fail Rate:             0.013%
Canceled:              6154
Cancel Rate:           40.661%
Successful:            8979
Success Rate:          59.326%
========== REPAIR DOWNLOAD ====
Failed:                0
Fail Rate:             0.000%
Canceled:              0
Cancel Rate:           0.000%
Successful:            10995
Success Rate:          100.000%
========== REPAIR UPLOAD ======
Failed:                0
Fail Rate:             0.000%
Canceled:              5113
Cancel Rate:           40.425%
Successful:            7535
Success Rate:          59.575%
========== DELETE =============
Failed:                0
Fail Rate:             0.000%
Successful:            4904
Success Rate:          100.000%

but when stuff looks like this, it’s difficult to be bothered… lol
this was for the 28th
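
For anyone wondering how the percentages relate to the counts: in the output above, each rate is consistent with that line's count divided by the sum of the counts in its section. A minimal sketch of that arithmetic follows. The `rate` helper is a made-up name for illustration and is not taken from successrate.sh.

```shell
# Hypothetical helper, not part of successrate.sh:
# prints count/total as a percentage with three decimals.
rate() {
    awk -v n="$1" -v t="$2" 'BEGIN {
        if (t > 0) printf "%.3f%%\n", 100 * n / t
        else       printf "0.000%%\n"   # avoid dividing by zero on an empty section
    }'
}

# Accepted uploads from the output above: 2 failed, 6154 canceled, 8979 successful.
total=$((2 + 6154 + 8979))
rate 6154 "$total"   # cancel rate
rate 8979 "$total"   # success rate
```

With those counts it prints 40.661% and 59.326%, matching the Cancel Rate and Success Rate lines in the UPLOAD section.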

At this point you might as well wait for 1.9.5, which will include the fix for uploads being logged as canceled.

And yeah, this terminology becomes second nature quickly, but I can see why it’s confusing for new SNOs.

i got all the logs by day, so it’s easy to just do a quick check. a feature that might be cool next time you are upgrading the script would be an option to process several logs in a row… maybe

so if they are sorted by date, then one could have it check over a week or a month or a year… but i suppose it’s not really a terribly useful thing…
just thought of it…

yeah i really look forward to 1.9.5, it will be very interesting to see my actual successrate… ofc this means that now is the last chance to optimize with the old successrate… it kinda makes me wonder why exactly i can go from 65 to 85% in consistent successrates… even if those weren’t the actual successrates, something inside my server must have worked faster to give better results… i wouldn’t mind figuring out exactly which of the things i tinkered with did that…

Since the data is just passed to cat, you can change the assignment to

LOG_SOURCE="$*"

And pass multiple filenames to the script. Or use wildcards.
Haven’t tested it, but it should work.
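
To illustrate why that change would work, here is a small sketch. `count_uploaded` is a hypothetical stand-in for the part of the script that reads the log, not the real code, and it assumes the variable is expanded unquoted when handed to cat.

```shell
# Hypothetical stand-in for the log-reading part of the script.
count_uploaded() {
    LOG_SOURCE="$*"   # all arguments joined into one space-separated string
    # $LOG_SOURCE is deliberately left unquoted so the shell splits it back
    # into separate file names before cat concatenates them.
    # Caveat: file names that contain spaces will break with this approach.
    cat $LOG_SOURCE | grep -c uploaded
}
```

A call such as `count_uploaded day1.log day2.log` or `count_uploaded logs/*.log` (made-up file names) would then count across all of the given files as if they were one log.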

@BrightSilence is there an update planned for this since it appears the “deletes” have new terminology?

Ah yes, forgot that I should have changed that. It’s fixed now. It’s a temporary change from what I gathered, so it now catches both old and new terminology.

The original successrate.ps1 script from @Alexey was having problems on my machine with high memory usage whilst running on larger log files. So I rewrote it slightly to read the log one line at a time, calculate all the variables as it goes and then print out at the end. It’s between 10% and 30% faster to run on my machine using an identical log file.

param(
    $Path = "C:\Program Files\Storj\Storage Node\storagenode - Copy.log"
)

$auditsSuccess=0
$auditsFailed=0
$auditsFailedCritical=0

$dl_success=0
$dl_canceled=0
$dl_failed=0

$put_success=0
$put_rejected=0
$put_canceled=0
$put_failed=0

$get_repair_success=0
$get_repair_canceled=0
$get_repair_failed=0

$put_repair_success=0
$put_repair_canceled=0
$put_repair_failed=0


# Stream the log one line at a time so the whole file is never held in memory.
[System.IO.File]::ReadLines($Path) | ForEach-Object {
    # Audits: a failure whose message contains "open" is counted as critical.
    if ($_ -match "GET_AUDIT" -And $_ -match "downloaded"){$auditsSuccess+=1}
    if ($_ -match "GET_AUDIT" -And $_ -match "failed" -And $_ -notmatch "open"){$auditsFailed+=1}
    if ($_ -match "GET_AUDIT" -And $_ -match "failed" -And $_ -match "open"){$auditsFailedCritical+=1}

    # Downloads: the quotes keep "GET" from also matching GET_AUDIT and GET_REPAIR.
    if ($_ -match '"GET"' -And $_ -match "downloaded"){$dl_success+=1}
    if ($_ -match '"GET"' -And $_ -match "canceled"){$dl_canceled+=1}
    if ($_ -match '"GET"' -And $_ -match "failed"){$dl_failed+=1}

    # Uploads: the quotes keep "PUT" from also matching PUT_REPAIR.
    if ($_ -match '"PUT"' -And $_ -match "uploaded"){$put_success+=1}
    if ($_ -match '"PUT"' -And $_ -match "rejected"){$put_rejected+=1}
    if ($_ -match '"PUT"' -And $_ -match "canceled"){$put_canceled+=1}
    if ($_ -match '"PUT"' -And $_ -match "failed"){$put_failed+=1}

    # Repair downloads
    if ($_ -match "GET_REPAIR" -And $_ -match "downloaded"){$get_repair_success+=1}
    if ($_ -match "GET_REPAIR" -And $_ -match "canceled"){$get_repair_canceled+=1}
    if ($_ -match "GET_REPAIR" -And $_ -match "failed"){$get_repair_failed+=1}

    # Repair uploads
    if ($_ -match "PUT_REPAIR" -And $_ -match "uploaded"){$put_repair_success+=1}
    if ($_ -match "PUT_REPAIR" -And $_ -match "canceled"){$put_repair_canceled+=1}
    if ($_ -match "PUT_REPAIR" -And $_ -match "failed"){$put_repair_failed+=1}
}

Write-Host "========== AUDIT ============="  -ForegroundColor Cyan

if (($auditsSuccess + $auditsFailed + $auditsFailedCritical) -ge 1) {
    $audits_failed_ratio = $auditsFailed / ($auditsSuccess + $auditsFailed + $auditsFailedCritical) * 100
    $audits_critical_ratio = $auditsFailedCritical / ($auditsSuccess + $auditsFailed + $auditsFailedCritical) * 100
    $audits_success_ratio = $auditsSuccess / ($auditsSuccess + $auditsFailed + $auditsFailedCritical) * 100
} else {
    $audits_failed_ratio = 0.00
    $audits_critical_ratio = 0.00
    $audits_success_ratio = 0.00
}

Write-Host ("Critically failed:`t" + $auditsFailedCritical) -ForegroundColor Red
Write-Host ("Critical Fail Rate:`t{0:N}%" -f $audits_critical_ratio)
Write-Host ("Recoverable failed:`t" + $auditsFailed) -ForegroundColor Yellow
Write-Host ("Recoverable Fail Rate:`t{0:N}%" -f $audits_failed_ratio)
Write-Host ("Successful:`t`t" + $auditsSuccess) -ForegroundColor Green
Write-Host ("Success Rate:`t`t{0:N}%" -f $audits_success_ratio)

Write-Host "========== DOWNLOAD =========="  -ForegroundColor Cyan


if (($dl_success + $dl_failed + $dl_canceled) -ge 1) {
    $dl_failed_ratio = $dl_failed / ($dl_success + $dl_failed + $dl_canceled) * 100
    $dl_canceled_ratio = $dl_canceled / ($dl_success + $dl_failed + $dl_canceled) * 100
    $dl_ratio = $dl_success / ($dl_success + $dl_failed + $dl_canceled) * 100
} else {
    $dl_failed_ratio = 0.00
    $dl_canceled_ratio = 0.00
    $dl_ratio = 0.00
}

Write-Host ("Failed:`t`t`t" + $dl_failed) -ForegroundColor Red
Write-Host ("Fail Rate:`t`t{0:N}%" -f $dl_failed_ratio)
Write-Host ("Canceled:`t`t" + $dl_canceled) -ForegroundColor Yellow
Write-Host ("Cancel Rate:`t`t{0:N}%" -f $dl_canceled_ratio)
Write-Host ("Successful:`t`t" + $dl_success) -ForegroundColor Green
Write-Host ("Success Rate:`t`t{0:N}%" -f $dl_ratio)

Write-Host "========== UPLOAD ============"  -ForegroundColor Cyan


if (($put_success + $put_rejected + $put_failed + $put_canceled) -ge 1) {
    $put_failed_ratio = $put_failed / ($put_success + $put_rejected + $put_failed + $put_canceled) * 100
    $put_canceled_ratio = $put_canceled / ($put_success + $put_rejected + $put_failed + $put_canceled) * 100
    $put_accept_ratio = ($put_success + $put_canceled + $put_failed) / ($put_success + $put_rejected + $put_failed + $put_canceled) * 100
    $put_ratio = $put_success / ($put_success + $put_rejected + $put_failed + $put_canceled) * 100
} else {
    $put_failed_ratio = 0.00
    $put_canceled_ratio = 0.00
    $put_accept_ratio = 0.00
    $put_ratio = 0.00
}

Write-Host ("Rejected:`t`t" + $put_rejected)
Write-Host ("Acceptance Rate:`t{0:N}%" -f $put_accept_ratio)
"---------- accepted ----------"
Write-Host ("Failed:`t`t`t" + $put_failed) -ForegroundColor Red
Write-Host ("Fail Rate:`t`t{0:N}%" -f $put_failed_ratio)
Write-Host ("Canceled:`t`t" + $put_canceled) -ForegroundColor Yellow
Write-Host ("Cancel Rate:`t`t{0:N}%" -f $put_canceled_ratio)
Write-Host ("Successful:`t`t" + $put_success) -ForegroundColor Green
Write-Host ("Success Rate:`t`t{0:N}%" -f $put_ratio)

Write-Host "========== REPAIR DOWNLOAD ==="  -ForegroundColor Cyan

if (($get_repair_success + $get_repair_failed + $get_repair_canceled) -ge 1) {
    $get_repair_failed_ratio = $get_repair_failed / ($get_repair_success + $get_repair_failed + $get_repair_canceled) * 100
    $get_repair_canceled_ratio = $get_repair_canceled / ($get_repair_success + $get_repair_failed + $get_repair_canceled) * 100
    $get_repair_ratio = $get_repair_success / ($get_repair_success + $get_repair_failed + $get_repair_canceled) * 100
} else {
    $get_repair_failed_ratio = 0.00
    $get_repair_canceled_ratio = 0.00
    $get_repair_ratio = 0.00
}

Write-Host ("Failed:`t`t`t" + $get_repair_failed) -ForegroundColor Red
Write-Host ("Fail Rate:`t`t{0:N}%" -f $get_repair_failed_ratio)
Write-Host ("Canceled:`t`t" + $get_repair_canceled) -ForegroundColor Yellow
Write-Host ("Cancel Rate:`t`t{0:N}%" -f $get_repair_canceled_ratio)
Write-Host ("Successful:`t`t" + $get_repair_success) -ForegroundColor Green
Write-Host ("Success Rate:`t`t{0:N}%" -f $get_repair_ratio)

Write-Host "========== REPAIR UPLOAD ====="  -ForegroundColor Cyan


if (($put_repair_success + $put_repair_failed + $put_repair_canceled) -ge 1) {
    $put_repair_failed_ratio = $put_repair_failed / ($put_repair_success + $put_repair_failed + $put_repair_canceled) * 100
    $put_repair_canceled_ratio = $put_repair_canceled / ($put_repair_success + $put_repair_failed + $put_repair_canceled) * 100
    $put_repair_ratio = $put_repair_success / ($put_repair_success + $put_repair_failed + $put_repair_canceled) * 100
} else {
    $put_repair_failed_ratio = 0.00
    $put_repair_canceled_ratio = 0.00
    $put_repair_ratio = 0.00
}

Write-Host ("Failed:`t`t`t" + $put_repair_failed) -ForegroundColor Red
Write-Host ("Fail Rate:`t`t{0:N}%" -f $put_repair_failed_ratio)
Write-Host ("Canceled:`t`t" + $put_repair_canceled) -ForegroundColor Yellow
Write-Host ("Cancel Rate:`t`t{0:N}%" -f $put_repair_canceled_ratio)
Write-Host ("Successful:`t`t" + $put_repair_success) -ForegroundColor Green
Write-Host ("Success Rate:`t`t{0:N}%" -f $put_repair_ratio)

Hi @BrightSilence , hi all,

I want to handle this issue properly in the script, or rather in a copy of your script.

Currently audits are selected like this:

audit_failed_warn=$(echo "$LOG1D" 2>&1 | grep -E 'GET_AUDIT|GET_REPAIR' | grep failed | grep -v -e exist -c)
audit_failed_crit=$(echo "$LOG1D" 2>&1 | grep -E 'GET_AUDIT|GET_REPAIR' | grep failed | grep exist -c)

The issue is that the new “audit failure” contains both “exist” AND “failed”, but it is more of a “warning” than a “critical” error. Example:

2022-04-03T10:14:55.159Z        ERROR   piecestore      download failed {"Piece ID": "32B7EYRW5CFKH3MHXQSPIX7XSE2IJK7H7TF3OJCVTY6DXBNNCJ7A", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET_REPAIR", "error": "used serial already exists in store", "errorVerbose": "used serial already exists in store

Do you have an idea how to properly mix the grep commands to properly categorise it? Something like:

WARN = grep "failed" | grep [not "exist"] OR ["exist" AND "used serial"]
CRIT = grep "failed" | grep ["exist"] AND NOT ["used serial"]

First, well, there is no easy way to do OR unless you use tools that might not exist in some environments (like a bash that is modern enough to understand the >() syntax), but it’s easy to do two runs over the file:

grep failed | grep -v exist
grep failed | grep exist | grep 'used serial'

Alternatively you can employ awk or perl, so… ¯\_(ツ)_/¯
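
As a rough idea of what the awk route could look like, the sketch below classifies each failed audit or repair line in a single pass, using the same substrings as the grep versions. `classify_audits` is a name made up for illustration; it reads the files given as arguments, or standard input if there are none.

```shell
# Hypothetical single-pass classifier, same substring rules as the grep chains.
classify_audits() {
    awk '
        /GET_AUDIT|GET_REPAIR/ && /failed/ {
            # "exist" without "used serial" is treated as critical;
            # everything else that failed is treated as a warning.
            if (/exist/ && !/used serial/) crit++
            else warn++
        }
        END { printf "warn=%d crit=%d\n", warn, crit }
    ' "$@"
}
```

It prints one line of the form `warn=N crit=M`, which a calling script would still have to parse.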

Second:

grep failed | grep exist | grep -v 'used serial'
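
Put together, the two WARN runs and the CRIT run could be wired into the variables from the question roughly like this. It is only a sketch: it assumes `LOG1D` holds the log text as in the quoted snippet, and `audit_failures` and `classify_audit_failures` are names invented for illustration.

```shell
# Assumption: LOG1D already contains the log text to analyse.
# Emits only the failed GET_AUDIT / GET_REPAIR lines.
audit_failures() {
    printf '%s\n' "$LOG1D" | grep -E 'GET_AUDIT|GET_REPAIR' | grep failed
}

classify_audit_failures() {
    # WARN, first run: failed lines that do not mention "exist"
    warn_not_exist=$(audit_failures | grep -v -c exist)
    # WARN, second run: failed lines with both "exist" and "used serial"
    warn_used_serial=$(audit_failures | grep exist | grep -c 'used serial')
    audit_failed_warn=$((warn_not_exist + warn_used_serial))
    # CRIT: failed lines with "exist" but without "used serial"
    audit_failed_crit=$(audit_failures | grep exist | grep -v -c 'used serial')
}
```

After calling `classify_audit_failures`, the two totals are available in `audit_failed_warn` and `audit_failed_crit`.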