Got “ERROR piecestore failed to add bandwidth usage…bandwidth.db malformed bla bla bla” after a normal update (apparently to v1.69.2).
Unraid Docker… I don’t need help, just letting people know.
I moved the DBs out, deleted the one that was corrupted, started and then stopped the node, then restored the rest. The node is working again without errors (but I’m missing the past months’ info in the dashboard).
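For anyone who wants to script that shuffle, here is a rough Python sketch of the two file-moving steps. It assumes the node is stopped while files are moved, that all the databases sit in one folder, and that any sidecar files are named like `bandwidth.db-wal`. The function names and paths are made up for illustration; this is not an official Storj tool.

```python
import shutil
from pathlib import Path


def set_aside_databases(storage_dir, backup_dir):
    """Move every *.db file (and sidecars such as *.db-wal) into backup_dir."""
    backup = Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)
    moved = []
    for f in sorted(Path(storage_dir).glob("*.db*")):
        shutil.move(str(f), str(backup / f.name))
        moved.append(f.name)
    return moved


def restore_except(backup_dir, storage_dir, bad_db):
    """Copy the saved databases back, skipping the corrupted one and its sidecars.

    Run this only after the node has been started once (so it recreates
    fresh databases) and then stopped again.
    """
    restored = []
    for f in sorted(Path(backup_dir).glob("*.db*")):
        if f.name == bad_db or f.name.startswith(bad_db + "-"):
            continue  # keep the freshly created replacement instead
        # copy2 overwrites the fresh copy and leaves the backup in place
        shutil.copy2(str(f), str(Path(storage_dir) / f.name))
        restored.append(f.name)
    return restored
```

The order would be: stop the node, call `set_aside_databases`, start and stop the node, then call `restore_except(..., bad_db="bandwidth.db")`. The backup folder is left untouched by the restore, so nothing is lost if a step goes wrong.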
This was 100% unprovoked, other than that I manually clicked the available “UPDATE” on the Docker container. Don’t worry, I won’t be updating again until I have hours of time to fix whatever random BS happens… just in case. I have turned auto-update off, because if that had happened without my knowledge, this whole node could have been lost.
This happened with an update of the Storj Docker container, nothing else! Not a reboot, an incorrect shutdown, etc… I ONLY clicked “update” (at that time, the server had been running for about a month since the last reboot).
It was ONLY bandwidth.db, and I have already (hopefully) corrected the problem, as it seems to be running OK now (following the second link in your reply). There are no longer any errors in the logs, although I noticed this morning that one satellite’s audit score has dropped to 99.85% instead of 100%.
(And FYI, I am running Unraid v6.11.5.) Let me know if you need any other information to troubleshoot this issue!
It seems that way. Yes, it’s the latest (stable) version. It was running fine (after the last time it failed after I updated the Storj Docker container). I normally run a balance and scrub about once a month as normal maintenance.
I make a habit of checking logs before/after any scheduled maintenance, and noticed that not only was the Storj node showing bandwidth.db errors in the log, it was also offline in the dashboard…
I did the “move, then delete the bad *.db” shuffle (I can’t be bothered to spend hours repairing it if it has zero real-world impact other than stats), and it’s working again. I wish I knew how to figure out what was causing this, but unfortunately I am not as smart as I wish I was.
Interesting, but my node is on a cache pool. I have just changed it to “only” instead of “prefer” (“prefer” allows new files to be moved to the main array if the cache is full)… we’ll see if that makes a difference.
(I had the bandwidth.db error happen again last week, but I was also doing some unrelated testing on the server and might have caused it with a hard shutdown. But I am getting sick of seeing/fixing this, that’s for sure!)
I hope not. I am using old recycled drives for my Storj pools, and RAID5 on BTRFS (not recommended), so I accept responsibility for that as part of the potential problems! Apparently my NAS OS is adding ZFS cache soon… so we’ll see if that helps (or makes it worse) over the next few months!
I wonder if there is a way to build a .db checker and repair tool into the Storj platform? It seems that, besides blatant user-induced errors, DB corruption is the most common issue!
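In the meantime, a check (not a repair) can be done from outside the node. The sketch below assumes the node’s .db files are ordinary SQLite databases, which the “malformed” wording suggests, and that the node is stopped while it runs. It uses SQLite’s built-in `PRAGMA integrity_check` through Python’s standard `sqlite3` module. The function names and the folder passed in are made up for illustration.

```python
import sqlite3
import sys
from pathlib import Path


def check_db(path):
    """Return the integrity_check messages for one file; ["ok"] means healthy."""
    # Open read-only so the check can never create or modify a database.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError as exc:
        # Badly damaged files fail before the check itself can run.
        return [f"error: {exc}"]
    return [row[0] for row in rows]


def check_all(storage_dir):
    """Check every *.db file in storage_dir; returns {file name: messages}."""
    return {p.name: check_db(p) for p in sorted(Path(storage_dir).glob("*.db"))}


if __name__ == "__main__":
    # Usage: python check_dbs.py /path/to/folder/with/db/files
    for name, messages in check_all(sys.argv[1]).items():
        status = "OK" if messages == ["ok"] else "PROBLEM: " + "; ".join(messages)
        print(f"{name}: {status}")
```

Anything that does not report OK is a candidate for the move/delete/recreate routine described earlier in the thread. Running this before and after an update would at least show whether the update itself is when the file goes bad.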
I am also going to try to figure out how to move the .db files off the data array (to a stable, new RAID10 SSD cache). I think that might help a lot with stability and stop the random “xyz.db locked” errors I seem to get from time to time on my pools. They are not the fastest HDDs; I’m giving old random drives from retired PCs one last chance at life…
This has been a common and very frequent problem on the Unraid platform for a long time, I have to agree. However, I’m against adding system tools to the storagenode software; it’s not the storage node’s job to fix database or filesystem issues, that belongs to the OS. In the edge case you can start without databases at all and they will be recreated.
The storagenode software should be as light as possible so it can run on weak devices like a Raspberry Pi or even a router.