Today I got “node suspended” emails for 3 satellites, and the dashboard shows that I’m suspended on 5 satellites. But if I search the log for “GET_AUDIT” and “failed”, the most recent errors are from 2020-06-24. I did get some “database locked” errors today, but no failed audits. (Update below)
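For anyone who wants to repeat that search, here is a minimal sketch in Python. It assumes the log is a plain text file with one entry per line; the function name `last_failed_audit` and the `node.log` path in the usage comment are only illustrative.

```python
from typing import Iterable, Optional


def last_failed_audit(lines: Iterable[str]) -> Optional[str]:
    """Return the last line containing both "GET_AUDIT" and "failed", or None."""
    last = None
    for line in lines:
        # Plain substring match, the same as grepping for both terms.
        if "GET_AUDIT" in line and "failed" in line:
            last = line.rstrip("\n")
    return last


# Usage (the log path is an assumption, adjust it to your setup):
# with open("node.log", encoding="utf-8", errors="replace") as f:
#     print(last_failed_audit(f))
```

If this returns an old entry or nothing at all, the suspension is probably not coming from audits that the log recorded as failed.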

When I tried to stop the node to check the databases, the hard disk with the node’s data seemed to have locked up completely; I couldn’t even write an empty file to it. Some db-wal files were huge. As far as I remember, usedserial.db was about 90MB, but usedserial.db-wal was about 1GB. top and iotop didn’t show anything unusual, but it looks like the system couldn’t cope with the databases being this big.

So I had to reboot the server and after that I checked and vacuumed the database. There were no errors.
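In case it helps someone else, the check and vacuum can be sketched with Python’s built-in `sqlite3` module. This is only a sketch under assumptions: the node must be stopped first, the helper name `check_and_vacuum` is made up, and you pass in whichever `.db` file you want to check.

```python
import sqlite3
from typing import List


def check_and_vacuum(db_path: str) -> List[str]:
    """Run an integrity check; vacuum only if SQLite reports "ok".

    Returns the rows of PRAGMA integrity_check (["ok"] means no errors).
    """
    # isolation_level=None puts the connection in autocommit mode,
    # because VACUUM cannot run inside an open transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        rows = conn.execute("PRAGMA integrity_check;").fetchall()
        problems = [row[0] for row in rows]
        if problems == ["ok"]:
            # Fold any pending -wal content back into the main file,
            # then rebuild the database to reclaim free pages.
            conn.execute("PRAGMA wal_checkpoint(TRUNCATE);").fetchall()
            conn.execute("VACUUM;")
        return problems
    finally:
        conn.close()


# Usage (the file name is just an example, run this with the node stopped):
# print(check_and_vacuum("usedserial.db"))
```

Anything other than `["ok"]` means the file needs repair before vacuuming, so the sketch deliberately leaves it untouched in that case.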

Update: Just went through the logs and noticed that there are almost 6h of logs missing. I run the dashboard and “tail -f node.log” in a tmux session and there was definitely output from the log around that time. I guess it couldn’t really be written to disk because of the lockup.
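A gap like that can also be found automatically. Below is a rough sketch in Python that assumes every log line starts with an ISO-style timestamp such as `2020-07-01T12:34:56.789Z` (check your own log format first); the name `find_log_gaps` and the one-hour default threshold are arbitrary choices.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def find_log_gaps(
    lines: Iterable[str],
    min_gap: timedelta = timedelta(hours=1),
) -> List[Tuple[datetime, datetime]]:
    """Return (before, after) timestamp pairs at least min_gap apart."""
    gaps = []
    prev = None
    for line in lines:
        parts = line.split(maxsplit=1)
        if not parts:
            continue  # blank line
        try:
            # Assumed format: the first 19 characters are YYYY-MM-DDTHH:MM:SS.
            ts = datetime.strptime(parts[0][:19], "%Y-%m-%dT%H:%M:%S")
        except ValueError:
            continue  # line does not start with a timestamp, e.g. a stack trace
        if prev is not None and ts - prev >= min_gap:
            gaps.append((prev, ts))
        prev = ts
    return gaps


# Usage (the log path is an assumption):
# with open("node.log", encoding="utf-8", errors="replace") as f:
#     for before, after in find_log_gaps(f):
#         print(before, "->", after, after - before)
```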

Since then there have been successful audits on all satellites and data is being uploaded again, but the web dashboard still says suspended on 5 satellites. How long does the suspension keep showing after the problem is solved? I would hate to lose this node 2 days before it completes the 15 months…

Hi. For me, the message disappeared after about 2-3 hours.

You need to complete enough audits to get the hidden unknown audit score back above 0.6. I’ll be adding this score to the earnings calculator once my node gets updated to v1.6.4. I’ve wanted to add it ever since the score was announced for v1.6.3, but issues held up that rollout, which pushed back the addition as well.

Thanks to both of you, the node is out of suspension on all satellites now.

