Failed audits in Dashboard, but not in logs

Pac · December 11, 2020, 8:26am

Hi,

One of my nodes failed an audit at some point (I only just noticed that its score is 99.xx%), and I’m wondering about something:

"12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB"
{
  "totalCount": 872,
  "successCount": 871,
  "alpha": 19.975104,
  "beta": 0.024895,
  "unknownAlpha": 19.99999,
  "unknownBeta": 0,
  "score": 0.9987552499377624,
  "unknownScore": 1
}
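The numbers above are consistent with the audit score being alpha / (alpha + beta). A minimal sketch to check that against the reported values (the function name `reputation_score` is made up for illustration and is not part of any Storj tool):

```python
def reputation_score(alpha: float, beta: float) -> float:
    """Score as the share of alpha in alpha + beta (assumed formula)."""
    return alpha / (alpha + beta)


# Values copied from the JSON output above.
score = reputation_score(19.975104, 0.024895)
print(f"{score:.6f}")
```

The result agrees with the reported "score" field to within rounding of the displayed alpha and beta, so the drop to 99.88% matches a small non-zero beta left over from a single failed audit.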

I scraped my logs from the past ~70 days (October, November and December logs) and did not find a single audit failure with the following command:
cat node.log | grep -i audit | grep -i fail

But this node was previously on a very slow SMR drive that was crawling miserably for hours (more than 10…) whenever watchtower would update it, making it slow to respond, so my best guess right now is that it might have failed an audit because of the 5 minute timeout in the past.

When an audit fails because of the timeout, is this audit failure supposed to be visible in the Node’s logs?

Alexey · December 11, 2020, 8:42am

There are two ways to fail an audit:

1. a corrupted/missing/unreadable file;
2. three timeouts of 5 minutes each for the same piece.

You can see the audit failure in the logs only in the first case. In the second case you would see only that an audit started but never finished. If the node was unresponsive, too slow, or almost hung (but was still able to answer the audit request), it might not even register the attempt, especially if that is the same disk that the node is unable to read from or write to in a reasonable time.

Pac · December 11, 2020, 9:05am

Right, so I would need to parse the logs to (maybe) find started audits that do not have corresponding success/failure lines.
That would require some scripting.
Thanks @Alexey.
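
A rough sketch of such a script, in Python. The log line shape is an assumption, not something confirmed in this thread: it expects each relevant line to contain "download started", "downloaded" or "download failed", followed by a JSON object with "Piece ID" and "Action" fields, and audits to be marked with "Action": "GET_AUDIT". The sample lines and piece IDs are invented for the demo; adjust the pattern to whatever your node.log actually contains.

```python
import json
import re

# Assumed line shape (hypothetical, check against your own node.log):
# <timestamp> INFO piecestore download started {"Piece ID": "...", "Action": "GET_AUDIT"}
LINE_RE = re.compile(r"(download started|downloaded|download failed)\s+(\{.*\})\s*$")


def unfinished_audits(lines):
    """Return {piece_id: last 'started' line} for audits with no later result line."""
    pending = {}
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        event, payload = match.groups()
        try:
            fields = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip lines whose trailing part is not valid JSON
        if fields.get("Action") != "GET_AUDIT":
            continue
        piece = fields.get("Piece ID")
        if event == "download started":
            pending[piece] = line.rstrip("\n")
        else:
            # "downloaded" or "download failed": the audit got a logged result
            pending.pop(piece, None)
    return pending


# Invented sample lines, only to show the behaviour.
sample = [
    '2020-12-01T10:00:00.000Z\tINFO\tpiecestore\tdownload started\t{"Piece ID": "AAA", "Action": "GET_AUDIT"}',
    '2020-12-01T10:00:01.000Z\tINFO\tpiecestore\tdownloaded\t{"Piece ID": "AAA", "Action": "GET_AUDIT"}',
    '2020-12-01T11:00:00.000Z\tINFO\tpiecestore\tdownload started\t{"Piece ID": "BBB", "Action": "GET_AUDIT"}',
    '2020-12-01T12:00:00.000Z\tINFO\tpiecestore\tdownload started\t{"Piece ID": "CCC", "Action": "GET"}',
]
result = unfinished_audits(sample)
for piece_id, started_line in result.items():
    print(piece_id, "->", started_line)
```

To run it on a real log, pass an open file instead of `sample`, e.g. `unfinished_audits(open("node.log"))`. Because entries are keyed by piece ID, a piece that timed out and later succeeded on a retry will not be reported; only pieces whose last audit attempt has no logged result remain.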

andrew2.hart · December 11, 2020, 4:38pm

Truncated files only show an error at the satellite end, nothing in the storagenode logs.
At least in my testing several versions ago.

or did i get it backwards lol