ERROR piecestore failed to add bandwidth usage {"error": "bandwidthdb: database disk image is malformed", "errorVerbose": "bandwidthdb: database disk image is malformed\n\tstorj.io/storj/storagenode/storagenodedb.(*bandwidthDB).Add:60\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).beginSaveOrder.func1:722\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:434\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:220\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:58\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:104\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:60\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:97\n\tstorj.io/drpc/drpcctx.(*Tracker).track:52"}
sqlite3 storage/bandwidth.db "PRAGMA integrity_check;"
ok
But the check returns ok, so does anyone have an idea where the problem could be?
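For reference, the same check can be run against every database in one go. This is just a sketch, assuming the databases live under `storage/` and the `sqlite3` CLI is installed:

```shell
#!/bin/sh
# Run PRAGMA integrity_check against every SQLite database in the
# node's storage directory and flag any that do not report "ok".
for db in storage/*.db; do
    result=$(sqlite3 "$db" "PRAGMA integrity_check;")
    if [ "$result" = "ok" ]; then
        echo "$db: ok"
    else
        echo "$db: CORRUPTED -> $result"
    fi
done
```

Note that the node should be stopped while you run this, so no process is writing to the files mid-check.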
If you just migrated the node, shut it down and run rsync again without the --delete parameter. When rsync is done, start the node up again and the bandwidth db will work just fine…
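A sketch of that sequence; the container name `storagenode` and the source/destination paths are placeholders, not from the original post:

```shell
# 1. Stop the node so the databases are no longer being written to.
docker stop -t 300 storagenode

# 2. Run a final rsync pass WITHOUT --delete: files changed since the
#    last pass are copied over, but nothing at the destination is removed.
rsync -aP /old/storagenode/ /new/storagenode/

# 3. Start the node again, pointed at the new location.
docker start storagenode
```

The point of omitting --delete on the final pass is that rsync only adds or updates files at the destination, so a database file that was mid-copy earlier cannot end up deleted or truncated.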
This isn’t really the reason the files end up smaller, as all that step removes is the lines that combine the inserts into a single transaction. The resulting data is exactly the same without those lines.
It’s smaller because SQLite doesn’t shrink the database file when data is removed; that only happens if you vacuum the db file. For Storj that really isn’t necessary, as most databases stay small anyway, and the node software doesn’t vacuum the dbs. But since this repair method starts over with a clean db file and then inserts the surviving data back in, it drops all that no-longer-needed space as well.
TL;DR: the smaller file is perfectly fine.
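If you do want to reclaim the free space anyway, SQLite can compact a database manually. A sketch, using `bandwidth.db` as the example; run it only while the node is stopped:

```shell
# VACUUM rewrites the database file from scratch, dropping the free
# pages left behind by deleted rows, so the file shrinks on disk.
sqlite3 storage/bandwidth.db "VACUUM;"
```

This is optional and harmless; it changes the file size, not the data.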
One additional note: this db stores expiration dates for the piece data on your node, and the data in it is technically nonessential. If data is missing, your node won’t remove pieces immediately when they expire. But the garbage collection processes that run frequently on your node will still figure out that the files are no longer needed, so cleanup will still happen relatively soon after expiration anyway. So even if you don’t trust my explanation above, your node is going to be fine even if data is missing from that db.
My storage node is working again without any errors!
First I had to “repair” the piece_expiration.db.
After that, I tried to restart the node and got the errors again, but this time bandwidth.db had the problems.
I repeated the same workflow for this db file, and now everything is working fine.
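For readers landing here later, the repair workflow referred to above is the usual dump-and-reload procedure. A sketch, using `bandwidth.db` as the example (substitute whichever database is malformed), run while the node is stopped:

```shell
# Work on a copy of the damaged database, never the original directly.
cp storage/bandwidth.db /tmp/bandwidth.db

# Dump everything sqlite3 can still read out of the damaged file.
sqlite3 /tmp/bandwidth.db ".dump" > /tmp/dump_all.sql

# Strip the transaction wrapper lines, so the rows that survived still
# load even if the dump was cut short partway through.
grep -v -e TRANSACTION -e ROLLBACK -e COMMIT /tmp/dump_all.sql > /tmp/dump_clean.sql

# Build a fresh database from the cleaned dump and put it in place.
rm /tmp/bandwidth.db
sqlite3 /tmp/bandwidth.db ".read /tmp/dump_clean.sql"
cp /tmp/bandwidth.db storage/bandwidth.db
```

The grep step is what the post above is describing: it only removes the lines that wrap the inserts in one transaction, so the reloaded data itself is unchanged.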
It worries me that you’re having problems again this quickly. You had previously checked that file and it was fine.
So a question: are you running your storage location over a network share like SMB or NFS? If so, stop. SQLite doesn’t support this and you will keep getting corrupted files. The only network protocol that doesn’t suffer from this is iSCSI.
If that’s not it, avoid interrupting the node abruptly. When stopping the docker container, always include -t 300 so it has enough time to stop gracefully. Stop the node before restarting the system, and never hard-reset without shutting the system down first. If that last part has happened, please also check the integrity of your file system (fsck) to make sure it isn’t causing the corruption.
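As a sketch of that shutdown-and-check sequence; the container name `storagenode`, the mount point, and the device path are placeholders for your own setup:

```shell
# Give the container up to 300 seconds to flush and close its databases
# instead of docker's default 10-second kill timeout.
docker stop -t 300 storagenode

# Only after the node is stopped: reboot, or check the file system.
# fsck must run against the UNMOUNTED device holding the storage location.
umount /mnt/storagenode
fsck -f /dev/sdb1
mount /mnt/storagenode
```

Running fsck on a mounted file system can itself cause damage, which is why the unmount comes first.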