I’ve been monitoring the problematic node and now it is constantly resetting / restarting the used space value in the dashboard…
Checking the recent log entries, it shows a lot of these errors…
I have restarted the node, but since then it keeps dropping the used space to zero and then tries to build it back up… currently showing 29.07MB after 1h23m uptime…
Any help appreciated (note - only this node - others are fine)
These errors are a consequence of the databases not being updated.
If the storagenode service/container is crashing, you need to figure out why and fix the issue.
Search for FATAL and Unrecoverable errors in your logs, they may explain the reason for the crash.
You responded “not necessarily” to my question about posting the logs.
This is what I found when I searched the log for FATAL. I do not know how to filter the log so that it outputs only the lines containing FATAL, so I just used the find feature and copied the log entry line.
2024-07-16T20:05:47-04:00 FATAL Unrecoverable error {"error": "piecestore monitor: timed out after 1m0s while verifying writability of storage directory", "errorVerbose": "piecestore monitor: timed out after 1m0s while verifying writability of storage directory\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func2.1:175\n\tstorj.io/common/sync2.(*Cycle).Run:160\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func2:164\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
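For anyone else who is unsure how to pull only the FATAL lines out of a log, here is a minimal Python sketch. It assumes each log line looks like the one quoted above (a timestamp, a level, a message, then a JSON payload). The function name `find_fatal_entries` and the second sample line are made up for illustration; they are not part of the storagenode software.

```python
import json


def find_fatal_entries(lines):
    """Return (timestamp, error message) for every line containing FATAL."""
    entries = []
    for line in lines:
        if "FATAL" not in line:
            continue
        # The timestamp is the first whitespace-separated field.
        timestamp = line.split(None, 1)[0]
        # The JSON payload, if present, starts at the first "{".
        brace = line.find("{")
        error = None
        if brace != -1:
            try:
                error = json.loads(line[brace:]).get("error")
            except json.JSONDecodeError:
                error = None
        # Fall back to the whole line when no "error" field can be read.
        entries.append((timestamp, error or line.strip()))
    return entries


# Sample input: the first line is shortened from the log entry quoted above,
# the second is a hypothetical non-fatal line.
sample = [
    '2024-07-16T20:05:47-04:00 FATAL Unrecoverable error {"error": "piecestore monitor: timed out after 1m0s while verifying writability of storage directory"}',
    '2024-07-16T20:05:48-04:00 INFO hypothetical message {"note": "not fatal"}',
]

for timestamp, error in find_fatal_entries(sample):
    print(timestamp, error)
```

To run it against a real log, pass an open file instead of `sample`, for example `find_fatal_entries(open(path))`, where `path` is wherever your node writes its log.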
This error means the filesystem needs to be optimized. If you have already done that and still get this error, your disk subsystem is probably just slow. You may increase the timeout for this check so that the node does not crash, with some drawbacks, of course:
You could increase it in steps of 30s until the node stops crashing.
However, if you are forced to increase it beyond 5m0s (by the way, please check exactly which check failed, the writable one or the readable one), then you likely have a bigger problem with this disk; it may be dying. In that case you would also need to increase the writable check interval (it’s 5m0s by default).
Since you are increasing it by 30s, to 1m30s, you do not need to increase the writable check interval (the new 1m30s timeout is still less than the 5m0s default interval).
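The arithmetic behind this rule can be sketched in a few lines of Python. This is only an illustration of the reasoning above, assuming durations are written in the `1m30s` style used in this thread. The helpers `parse_duration`, `format_duration` and `next_timeout` are hypothetical and do not exist in the storagenode software.

```python
import re


def parse_duration(text):
    """Parse a duration such as '1m30s' into seconds (h, m and s units only)."""
    parts = re.findall(r"(\d+)([hms])", text)
    # Reject anything that is not fully made of number+unit pairs.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    unit_seconds = {"h": 3600, "m": 60, "s": 1}
    return sum(int(n) * unit_seconds[u] for n, u in parts)


def format_duration(seconds):
    """Format seconds as minutes and seconds, e.g. 90 -> '1m30s'."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}m{secs}s"


def next_timeout(current, interval="5m0s", step="30s"):
    """Return the next timeout to try and whether the check interval must grow too."""
    new_timeout = parse_duration(current) + parse_duration(step)
    # The interval only needs raising once the timeout exceeds it.
    needs_longer_interval = new_timeout > parse_duration(interval)
    return format_duration(new_timeout), needs_longer_interval


print(next_timeout("1m0s"))  # ('1m30s', False)
print(next_timeout("5m0s"))  # ('5m30s', True)
```

The first call matches the case in this thread: going from 1m0s to 1m30s stays well below the 5m0s interval. The second shows the point at which the interval would have to be raised as well, which is also the point where the disk itself becomes the main suspect.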