I know, I know. This again.
One of my nodes seems to have stopped reliably populating the used bandwidth graph.
I know the graph is wrong because the used disk space keeps going up, a lot of bandwidth was used during the testing, and my success rate is above 98%.
I have done the usual database checks and they all came back OK.
I have restarted the node and let the file walker do its thing.
No unexpected errors in the logs as far as I could tell.
I know this is not a showstopper and everything seems to be working as usual but… it’s messing with my OCD!
Any ideas what the next step could be to troubleshoot this?
I assume you use docker, because you do that a lot.
Have you tried docker rm [node] before spinning it up again? That works for me for fluke errors from time to time. Yes yes, it’s just a rehash of the old “Did you try turning it off and on again”, but that tends to stick because it works.
Loads of free space and the success rate is over 98%, so the uploads are succeeding…
The used space graph is going up, which suggests the node is successfully getting data.
It seems to coincide with the go-live of the bandwidth stat cache. It’s worth checking your databases for errors too, though I would expect related errors in your logs if that were the issue.
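
In case it helps anyone repeating the database check, here is a minimal sketch in Python. It assumes the node keeps its databases as SQLite files ending in .db in a single directory; the function name check_databases is made up for illustration and is not part of any node tooling. It only runs SQLite's built-in PRAGMA integrity_check against each file, opened read-only.

```python
import sqlite3
from pathlib import Path


def check_databases(db_dir):
    """Run SQLite's integrity check on every *.db file in db_dir.

    Returns a dict mapping file name to a list of result strings.
    A healthy database yields ["ok"].
    """
    results = {}
    for db_path in sorted(Path(db_dir).glob("*.db")):
        # Open read-only so the check itself cannot modify the file.
        uri = db_path.resolve().as_uri() + "?mode=ro"
        conn = sqlite3.connect(uri, uri=True)
        try:
            rows = conn.execute("PRAGMA integrity_check;").fetchall()
            results[db_path.name] = [row[0] for row in rows]
        except sqlite3.DatabaseError as exc:
            # Raised when the file is unreadable as a SQLite database.
            results[db_path.name] = [f"error: {exc}"]
        finally:
            conn.close()
    return results
```

Calling check_databases("/path/to/storage") (a placeholder path) returns one entry per database file; anything other than ["ok"] is worth a closer look. It is probably safest to stop the node before running it, so nothing is writing to the files during the check.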