Dashboard doesn't load or loads slowly (both web and cli), bandwidthdb: database is locked all the time

2022-06-23T07:32:17.945Z INFO piecestore uploaded {"Process": "storagenode", "Piece ID": "56RXJGQSZGHRIQG5WZTTL3C43QA6OJVKZY3UHJCBNAZZ3TVSKZ7A", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "PUT", "Size": 1280}

2022-06-23T07:32:18.419Z INFO piecestore uploaded {"Process": "storagenode", "Piece ID": "P4EDXAJYAJ3VUNV3WXWOLLEPZOJINAGVH5ARAF6ND2536WG5BWJQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "PUT", "Size": 24320}

2022-06-23T07:32:18.972Z INFO piecestore downloaded {"Process": "storagenode", "Piece ID": "V46I636EWPGRM3JXZTFQO42TAKE2LGHTUA74MJTOA4J6NKQUATIQ", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET"}

2022-06-23T07:32:19.344Z ERROR piecestore failed to add bandwidth usage {"Process": "storagenode", "error": "bandwidthdb: database is locked", "errorVerbose": "bandwidthdb: database is locked\n\tstorj.io/storj/storagenode/storagenodedb.(*bandwidthDB).Add:60\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).beginSaveOrder.func1:723\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:349\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:220\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:58\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:122\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:112\n\tstorj.io/drpc/drpcctx.(*Tracker).track:52"}
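
To see how often the lock error really occurs, and for which database, a saved copy of the log can be tallied. The following is only a rough sketch: it assumes the log lines look like the excerpt above, and the names count_locked and count_locked_in_file are made up for illustration, not part of storagenode.

```python
import re
from collections import Counter

# Matches e.g. "bandwidthdb: database is locked" and captures the database name.
LOCKED = re.compile(r"(\w+): database is locked")


def count_locked(lines):
    """Count ERROR lines reporting a locked database, grouped by database name."""
    counts = Counter()
    for line in lines:
        if "ERROR" not in line:
            continue
        # search() returns only the first match, so the repeated message
        # inside "errorVerbose" does not count the same line twice.
        match = LOCKED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def count_locked_in_file(path):
    """Same tally, reading from a log file saved to disk (path is hypothetical)."""
    with open(path, encoding="utf-8", errors="replace") as log_file:
        return count_locked(log_file)
```

Calling count_locked_in_file on a saved log returns a Counter, so .most_common() lists the databases that are locked most often first.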

I use an HDTB420EK3AA drive. Could it just be too slow for the job, or is there another reason?
The PC itself is fast, and the network speed is excellent. Before passing audits with two satellites I had no issues.

P.S. I ran sqlite “VACUUM;” on all db files but it didn’t help. The drive reads slowly when it’s this busy and is hot to the touch. I am planning to sort out a fan to cool it, so maybe that will help.

It’s an SMR drive, and therefore slow. See “PSA: Beware of HDD manufacturers submarining SMR technology in HDD's without any public mention”.
An external drive is not a good solution anyway: it can detach under high load or be disconnected when the USB controller overheats. Without an additional power supply it can disconnect too.
There is no good solution for SMR disks; they are known to be bad at random writes (99% of all writes in a storagenode). The workaround is to distribute the load by running more than one node on different drives in the same network.
Regarding the databases, you can move them to an SSD:
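
As a rough sketch of the copy step only: with the node stopped, the *.db files can be copied to a folder on the SSD and checked before the node is pointed at the new location. The paths and the helper names copy_databases and integrity_ok are hypothetical. The config option for the new location is, as far as I know, storage2.database-dir, but treat that as an assumption and check the official instructions before changing anything.

```python
import shutil
import sqlite3
from pathlib import Path


def copy_databases(src_dir, dst_dir):
    """Copy every *.db file from src_dir to dst_dir; return the copied file names.

    Assumes the node is stopped, so no database is being written during the copy.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for db_file in sorted(src.glob("*.db")):
        shutil.copy2(db_file, dst / db_file.name)  # copy2 keeps timestamps
        copied.append(db_file.name)
    return copied


def integrity_ok(db_path):
    """Return True if SQLite's built-in integrity check reports 'ok'."""
    conn = sqlite3.connect(str(db_path))
    try:
        return conn.execute("PRAGMA integrity_check;").fetchone()[0] == "ok"
    finally:
        conn.close()
```

Copying rather than moving leaves the originals on the HDD untouched until the node has been confirmed to start cleanly with the databases on the SSD.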

Hey man, this is a great reply! I will definitely move the dbs. One question: since I can no longer return the HDD, how could I adjust the storagenode settings to make it possible to still participate in the network?

Something like storage2.max-concurrent-requests, maybe?

This really is a last resort option as it can cause problems. I would try moving the databases first. The writes that happen there are the worst case for SMR drives as they constitute constant rewriting of the same files. Most other writes for a node are just writing new files, which if there is plenty of free space on the HDD isn’t nearly as much of a problem as rewriting the db files.

After moving the databases you may want to temporarily lower the assigned HDD space for the node to below what is already filled, to give the HDD time to settle any pending operations and empty the CMR cache to the SMR parts of the drive. After that you can raise it again and chances are the drive will be fine from then on. If that doesn’t work, you can consider this option.

Edit: Before you use that option you could also try raising filestore.write-buffer-size to 4096 KiB; this tells the node to buffer the entire piece in RAM. It should also cut down on the number of smaller writes. There is a trade-off with memory usage of course, but if the HDD can keep up that shouldn’t be a problem.
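
To get a feel for the memory trade-off, here is a back-of-the-envelope sketch. It assumes one write buffer per upload in flight, which is my simplification rather than something confirmed from the storagenode code, and the function name is made up.

```python
def write_buffer_memory_mib(concurrent_uploads, buffer_kib=4096):
    """Rough upper bound on RAM used by write buffers, in MiB.

    Assumes each in-flight upload holds one buffer of buffer_kib KiB.
    """
    return concurrent_uploads * buffer_kib / 1024


# 50 simultaneous uploads with a 4096 KiB buffer each:
print(write_buffer_memory_mib(50))  # 200.0
```

So even with a few dozen uploads in flight the buffers stay in the low hundreds of MiB, as long as the HDD keeps up and uploads don't pile up.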

In some cases VACUUM may even hurt, as it compacts the dbs, which can make future updates slower on SMR HDDs. It’s mostly meant to reclaim wasted space and can make searches in the dbs a little faster. It doesn’t really help with writes, which is the problem you are running into with SMR.

I got the fan and the drive even stays cool now (it already had a heatsink before), but yes, the SSD holding the dbs worked wonders. Thanks a lot guys!!!
