auditScore on tardigrade decreased without errors

The audit score on tardigrade dropped to 95% a few minutes ago.

At the same time, there are NO errors, no failed audits, and no failed repair uploads or downloads within the last 24h.

How can that be?

auditScore 95% @ ...
.. audits (r: 0.00%, c: 0.00%, s: 100%)
.. downloads (c: 0.50%, f: 0.26%, s: 99%)
.. uploads (c: 0.23%, f: 0.01%, s: 100%)
.. rep down (c: 0.00%, f: 0.00%, s: 100%)
.. rep up (c: 0.00%, f: 0.00%, s: 100%)

The figures above come from the monitoring tool I use.

From the logs directly:

pi@pi:~ $ LOGMINDATE=$(TZ=UTC date --date="1440 minutes ago" +'%Y-%m-%dT%H:%M:%S.000Z')
pi@pi:~ $ cat /mnt/WD1003/logs/sn1.log | awk -v date="$LOGMINDATE" '$1 > date' | grep -E 'GET_AUDIT' | grep 'failed' -c
pi@pi:~ $ cat /mnt/WD1003/logs/sn1.log | awk -v date="$LOGMINDATE" '$1 > date' | grep -E 'GET_REPAIR' | grep 'failed' -c
pi@pi:~ $ cat /mnt/WD1003/logs/sn1.log | awk -v date="$LOGMINDATE" '$1 > date' | grep -E 'PUT_REPAIR' | grep 'failed' -c
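For what it's worth, the three checks can be collapsed into one loop. This is a hypothetical helper (the function name and argument order are my own, not part of the monitoring tool); it keeps the same awk/grep logic as the commands above and takes the log path and the cutoff timestamp as arguments:

```shell
# Hypothetical wrapper around the three checks above: count "failed" entries
# newer than a cutoff timestamp, per action type.
count_failed() {
  log="$1"; cutoff="$2"
  for action in GET_AUDIT GET_REPAIR PUT_REPAIR; do
    printf '%s failed: %s\n' "$action" \
      "$(awk -v date="$cutoff" '$1 > date' "$log" | grep "$action" | grep -c failed)"
  done
}
```

Usage (GNU date, as in the commands above): `count_failed /mnt/WD1003/logs/sn1.log "$(TZ=UTC date --date='1440 minutes ago' +'%Y-%m-%dT%H:%M:%S.000Z')"`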

You need to measure the time between the audit request and the moment your node finished uploading that piece.
See examples:

The second case: if your node served corrupted pieces, the download will finish successfully in the storagenode's log without any error, but the audit score will still be affected.
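The time measurement suggested above can be sketched in shell. This is a minimal sketch, assuming the log format from the earlier commands (ISO timestamp in column 1) plus typical storagenode messages ("download started" / "downloaded") with a "Piece ID" field; the exact message strings and the function name are assumptions. It uses POSIX awk only and ignores day boundaries, which is fine for lags of seconds or minutes:

```shell
# audit_lags: print the per-piece lag between "download started" and
# "downloaded" for GET_AUDIT entries in the given log file.
audit_lags() {
  awk '
    # seconds since midnight from an ISO timestamp like 2023-01-01T00:00:02.500Z
    function secs(t,  a, p) {
      split(t, a, "T"); split(a[2], p, ":")
      return p[1] * 3600 + p[2] * 60 + p[3] + 0
    }
    /GET_AUDIT/ && match($0, /"Piece ID": "[^"]+"/) {
      id = substr($0, RSTART + 13, RLENGTH - 14)
      if ($0 ~ /download started/)
        start[id] = secs($1)
      else if ($0 ~ /downloaded/ && (id in start))
        printf "%s lag=%.1fs\n", id, secs($1) - start[id]
    }
  ' "$1"
}
```

Usage: `audit_lags /mnt/WD1003/logs/sn1.log`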


Thank you, will analyze that.

Isn’t that some kind of a loophole? Meaning: without ANY errors, a node can be disqualified if the SNO doesn’t constantly monitor the scores (besides the logs) and react quickly when they get worse. (Luckily, I’ve realized my monitoring script works pretty well.)

That’s really a mess if it happens to a huge node with a lot of audit traffic, because suspension is skipped and the node gets disqualified very quickly.

This could also explain my issues at the beginning of the year with performance on macOS, before I switched back to the RPi. Fingers crossed that those were just minor hiccups over time, like this one, IF that was the reason.

It would be nice if the satellite could provide some feedback (audit results), but that could also help malicious users improve their strategy…
However, you can create an issue to add such feedback to the storagenode’s logs.


done :white_check_mark:


Checked - that’s not the case: the maximum time lag was 0.7 seconds.

So it must be this one.

So I can calm down, because:

Anyway, I’ve added the time-lag analysis to the issue list of the monitoring script as well. At least I can create a routine that flags the time lag of audit requests, without needing it reported in the logs. :v:t2:

@Alexey I’ve added an automatic alert in version 1.9.0 for the case where the time lag between an audit download starting and finishing exceeds 3 minutes. Users will be notified automatically by mail and/or a Discord push message so they can react quickly and avoid disqualification of their nodes. I’ll add it to the wiki page, too. :v:t2:
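A hypothetical shape of such an alert check, assuming per-piece lag lines of the form `<piece id> lag=<seconds>s` (an illustrative format, not the monitoring script's actual one); 180 s matches the 3-minute limit mentioned. The mail/Discord delivery is left out — this only produces the alert text:

```shell
# alert_on_lag: scan lag lines and flag any lag above a threshold (seconds).
alert_on_lag() {
  max="${2:-180}"   # threshold in seconds, default 3 minutes
  awk -v max="$max" '
    match($0, /lag=[0-9.]+s/) {
      lag = substr($0, RSTART + 4, RLENGTH - 5) + 0
      if (lag > max)
        printf "ALERT: %s audit lag %.1fs exceeds %ds\n", $1, lag, max
    }
  ' "$1"
}
```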