Tuning audit scoring

Yeah, there is always that. But the question is whether nodes should be allowed to recover from issues we haven’t thought of yet. Any default behavior for unknown scenarios is likely to be wrong as often as it is right. I think if we keep reporting issues and Storj Labs keeps tackling them, we can get rid of the most important ones.
I do like your suggestion of just monitoring consecutive audit failures, though I would pick a number higher than 2. Since nodes are allowed to lose 2% of their data, after every failed audit there is a 2% chance of failing the next one too. That would trigger too many false positives.
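To put a rough number on the false-positive concern, here is a small sketch. It assumes audits are independent and that a node sitting right at the 2% loss limit fails each audit with probability 0.02. The function name and the independence assumption are illustrative only and are not taken from the satellite code.

```python
def chance_of_failure_streak(loss_rate: float, streak: int) -> float:
    """Probability that `streak` audits in a row all fail on a node that
    has lost `loss_rate` of its data (audits assumed independent)."""
    return loss_rate ** streak


# Compare a few candidate trigger lengths for a node at the 2% limit.
for streak in (2, 3, 5):
    p = chance_of_failure_streak(0.02, streak)
    print(f"{streak} failures in a row: about 1 in {round(1 / p):,}")
```

Under these assumptions a streak of 2 shows up in roughly 1 out of every 2,500 pairs of audits, which adds up quickly on a node that is audited often. Each extra failure required makes a false alarm 50 times less likely.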

One failed audit - problem. Two in a row - big problem (how many failed audits in a row = instant DQ? 40?). I guess this could be a configurable parameter.
However, this could also be used for regular GET requests, but with slightly relaxed requirements.
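A minimal sketch of such a configurable counter, assuming a success resets the streak. The class name, the `max_in_a_row` parameter and the two example limits are made up for illustration; they are not existing storagenode settings.

```python
class ConsecutiveFailureMonitor:
    """Counts failures in a row; any success resets the count."""

    def __init__(self, max_in_a_row: int):
        self.max_in_a_row = max_in_a_row  # hypothetical configurable limit
        self.in_a_row = 0

    def record(self, success: bool) -> bool:
        """Record one result; return True once the limit has been reached."""
        self.in_a_row = 0 if success else self.in_a_row + 1
        return self.in_a_row >= self.max_in_a_row


# Stricter limit for audits, relaxed limit for regular GET requests
# (both values are placeholders, not recommendations).
audit_monitor = ConsecutiveFailureMonitor(max_in_a_row=5)
get_monitor = ConsecutiveFailureMonitor(max_in_a_row=20)
```

The caller decides what a `True` result means: a warning, a node shutdown, or something harsher.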

The timeout one is more serious, and the node should probably shut down after two in a row. This would probably be more difficult to implement, though. Since the thread trying to access the disk would just freeze, some other thread would have to monitor for reads that are taking too long (keeping everything in memory, since disk access is frozen) and then shut the node down, or at least disable request handling, when it detects two or more frozen reads.
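The watchdog idea could look roughly like this. It is only a sketch: `ReadWatchdog`, `watch`, the timeout value and the `on_trip` callback are hypothetical names, and a real node would need to wire `begin()`/`end()` around its actual disk reads. All bookkeeping stays in memory, so the monitor keeps working while the disk is frozen.

```python
import threading
import time


class ReadWatchdog:
    """Tracks in-flight reads in memory and reports when too many are stuck."""

    def __init__(self, timeout_s: float, max_stuck: int = 2, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.max_stuck = max_stuck
        self.clock = clock          # injectable so the logic can be tested
        self._started = {}          # read id -> start time
        self._lock = threading.Lock()
        self._next_id = 0

    def begin(self) -> int:
        """Call just before a disk read; returns a handle for end()."""
        with self._lock:
            self._next_id += 1
            self._started[self._next_id] = self.clock()
            return self._next_id

    def end(self, read_id: int) -> None:
        """Call when the read returns, whether it succeeded or not."""
        with self._lock:
            self._started.pop(read_id, None)

    def stuck_reads(self) -> int:
        """Number of reads that have been running longer than the timeout."""
        now = self.clock()
        with self._lock:
            return sum(1 for t in self._started.values() if now - t > self.timeout_s)

    def should_shut_down(self) -> bool:
        return self.stuck_reads() >= self.max_stuck


def watch(watchdog: ReadWatchdog, on_trip, stop: threading.Event, interval_s: float = 1.0):
    """Poll from a separate thread; call on_trip() once when the limit is hit."""
    while not stop.wait(interval_s):
        if watchdog.should_shut_down():
            on_trip()
            return
```

The `watch` loop would run in its own thread, with `on_trip` either stopping the node or just refusing new requests.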

I agree with that for the most part. But if 2% data loss is allowed, any safeguards would need to take that into account and not kill the node too fast, at least at the default setting.

Currently it takes only 10 consecutive audit failures to DQ. After the suggested changes it would be about 40.
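For anyone wanting to check numbers like these, here is a simplified sketch. It assumes a node starts from a perfect score and that each failed audit multiplies the score by a forgetting factor, with DQ once the score drops below a threshold. The factor 0.95 is an assumption chosen because, together with the 60% threshold mentioned in this thread, it reproduces the figure of 10. The parameters behind the "about 40" figure are not given here, so they are left out.

```python
def failures_until_dq(forgetting_factor: float, dq_threshold: float) -> int:
    """Count consecutive failed audits needed to push a perfect score
    below `dq_threshold` in a simplified multiplicative-decay model."""
    score = 1.0
    failures = 0
    while score >= dq_threshold:
        score *= forgetting_factor  # each failure shrinks the score
        failures += 1
    return failures


print(failures_until_dq(0.95, 0.60))  # 10
```

A forgetting factor closer to 1 or a higher threshold changes the count, so the same function can be used to compare proposals.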

It just stops the node; the operator can restart it if he finds no problem. It would be bad to shut down the node with only 3 failed audits left until DQ. The shutdown should happen sooner, so that the operator can try to fix the problem a few times.

Yeah, I think it should be based on the audit score, in 10% increments.
Then it would stop at 90, 80, and 70, and then DQ at 60%.
So that would give 3 tries to fix it… which would be kinda nice.
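A rough sketch of that stepped behavior, assuming the node stops each time the score falls to or through one of the levels and is disqualified at 60% or below. The function name, the returned strings and the exact "at or below" comparison are illustrative choices, not an existing implementation.

```python
STOP_LEVELS = (90, 80, 70)  # node stops itself when the score falls to each level
DQ_LEVEL = 60               # disqualification


def action_on_score_change(previous_pct: float, current_pct: float) -> str:
    """Decide what to do after the audit score moves from previous to current."""
    if current_pct <= DQ_LEVEL:
        return "disqualify"
    for level in STOP_LEVELS:
        # Only react when a level is crossed downward, so a node restarted
        # at 85% keeps running until it reaches 80%.
        if previous_pct > level >= current_pct:
            return "stop"
    return "continue"
```

Because only downward crossings count, the operator gets one stop per level, which gives the 3 tries before DQ.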