Failed to add bandwidth usage

Hi, I am getting a lot of “failed to add bandwidth usage” errors.
The database file is almost 8 GB, which is a lot for a database.
I tried renaming it, but my node does not start with a new database (it keeps restarting).

2023-10-01T15:53:39+02:00       ERROR   piecestore      failed to add bandwidth usage   {"process": "storagenode", "error": "bandwidthdb: database is locked", "errorVerbose": "bandwidthdb: database is locked\n\tstorj.io/storj/storagenode/storagenodedb.(*bandwidthDB).Add:60\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).beginSaveOrder.func1:882\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download.func7.1:751\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download.func7:789\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:806\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:251\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/common/experiment.(*Handler).HandleRPC:42\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:124\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:114\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35"}

Stop the node, check the filesystem, then follow this guide to check, repair, or recreate the affected databases: https://support.storj.io/hc/en-us/articles/4403032417044-How-to-fix-database-file-is-not-a-database-error. Then make changes to your setup to prevent corruption in the future.
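The first step from that guide can be sketched like this (assuming the `sqlite3` CLI is installed and the node is stopped; `DBDIR` is an example path, point it at the directory holding your node's `*.db` files):

```shell
# Hedged sketch: with the node stopped, check every storagenode database file.
DBDIR="${DBDIR:-.}"                         # example; set to your storage directory
for db in "$DBDIR"/*.db; do
  [ -e "$db" ] || continue                  # skip if no .db files match
  printf '%s: ' "$db"
  sqlite3 "$db" "PRAGMA integrity_check;"   # a healthy database prints "ok"
done
```

Any database that prints something other than `ok` is a candidate for the repair/recreate steps in the article.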

And if you run it in Docker, remove the container:
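The usual commands look like this (an assumption: the container is named `storagenode`, as in the Storj docs; removing the container does not delete the identity or data on the mounted volumes):

```shell
# Gracefully stop, then remove, the storagenode container (name is an example).
DOCKER_BIN="$(command -v docker || true)"
if [ -n "$DOCKER_BIN" ]; then
  "$DOCKER_BIN" stop -t 300 storagenode || true  # long timeout lets transfers finish
  "$DOCKER_BIN" rm storagenode || true           # removes the container only;
                                                 # identity and data volumes remain
else
  echo "docker not found"
fi
```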


As far as I can see, the database is not corrupted, it's locked, which means your disk subsystem is too slow.
If you follow the suggestions above, you will lose the bandwidth statistics, including past periods.
If you are OK with losing the statistics, go ahead. But if you want to preserve them, you can try to dump the data from this database and load it into a new database:
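A dump-and-reload can be sketched like this (hedged: `sqlite3` CLI assumed installed, node stopped; the file names are examples, the real file is `bandwidth.db` in the node's storage directory):

```shell
# Hedged sketch: export the old database to SQL text and rebuild a fresh file.
DB="${DB:-bandwidth.db}"                  # the problematic database (example name)
NEW="${NEW:-bandwidth_new.db}"            # the replacement (example name)
sqlite3 "$DB" ".dump" > dump.sql          # dump schema + data as SQL statements
sqlite3 "$NEW" < dump.sql                 # replay the dump into a new database
sqlite3 "$NEW" "PRAGMA integrity_check;"  # should print "ok"
# If the new file checks out, move the old one aside and rename NEW to DB.
```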

The other way is to move databases to SSD:
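One way to do that (assuming a reasonably current storagenode) is the `storage2.database-dir` option in config.yaml; the path below is an example. Copy the existing `*.db` files to that directory with the node stopped, then restart:

```yaml
# config.yaml (example path; create the directory and copy the *.db files first)
storage2.database-dir: "/mnt/ssd/storagenode-dbs"
```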


Is this really due to the hard drive, or could it also be because the RAM usage is significantly higher? I capped the container's RAM at 3 GB and yesterday it was almost full. Now with a 6 GB limit I don't see the error, and RAM usage is about 2.5 GB.

Please tell us about your hardware: disk type/number, filesystem, etc.

Could be.


OS: Unraid 6.12.3
Ryzen 5 5600G, 32 GB RAM
WDC_WD120EMFZ 12 TB WD white-label, connected via SATA 6G (shucked from a WD My Book or Elements)
Filesystem: XFS, 6-7 TB used

Node RAM usage between 800MB and 3GB (Docker limit 6GB)

Maybe it can't keep up because of fragmentation. 6-7 TB is a critical point, in NTFS too.
That all makes sense to me. Check fragmentation, try defragmentation.
I don't know how to do it on Unraid with XFS, but there seem to be ways.

The node buffers data when the disk can't keep up with the I/O, so limiting the node's RAM is pointless; the node will just get killed. You need to improve your filesystem performance. There is a lot of discussion on this forum about how to do that. Start by increasing the amount of RAM available for the filesystem cache; you want the metadata to be entirely in RAM, which removes a lot of I/O from the disk.
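On a generic Linux box, a starting point could look like this (hedged: `vm.vfs_cache_pressure` tuning is a common general suggestion, not a Storj-specific recommendation, and whether it helps depends on your workload):

```shell
# Show current memory/cache state and the metadata-cache reclaim setting.
free -m 2>/dev/null || true                   # "buff/cache" = filesystem cache in use
CACHE_PRESSURE=$(cat /proc/sys/vm/vfs_cache_pressure 2>/dev/null || echo unknown)
echo "vm.vfs_cache_pressure=$CACHE_PRESSURE"  # kernel default is 100
# Lower values keep dentry/inode (metadata) caches in RAM longer, e.g.:
# sysctl -w vm.vfs_cache_pressure=50
```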


High RAM usage is the symptom, not the root cause here. Slow disk I/O will result in both a locked database and high RAM usage. Make your disk I/O faster and both symptoms will go away.


I am not sure what to do about disk I/O speeds. The disk only runs Storj and is connected via SATA 6G.

Sorry, I am unfamiliar with Unraid. You need to find the bottleneck in your setup. Given that this is supposedly a Linux-based system, you should probably start your diagnosis by looking at the output of iostat as described here.
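For reference, a minimal iostat invocation (it comes from the sysstat package; installation is an assumption here) looks like:

```shell
# Extended device stats, 5-second samples, 3 reports. Watch %util (near 100%
# means the disk is saturated) and await (per-request latency) for the Storj disk.
IOSTAT_BIN="$(command -v iostat || true)"
if [ -n "$IOSTAT_BIN" ]; then
  "$IOSTAT_BIN" -x 5 3
else
  echo "iostat not found; install the sysstat package"
fi
```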


Did you defragment it? That will improve disk speed.

Also, you could move the databases to an SSD. Is the system on the same drive as the node?

It's very likely the WD drive; white-label drives are usually shucked from an external enclosure.


Do not use XFS; it is very slow on deletes, and the drive will struggle on each garbage-collection (GC) run. It will only get worse as the node grows.
I made the same mistake and I'm currently migrating nodes from XFS to ext4, as GC takes much more time on XFS than on ext4. Not very happy, as it is very time-consuming, but with XFS, when GC was running, I had very low success rates, as low as 30% on server-grade drives.


That's interesting, I didn't know that XFS is slower than ext4. I thought they were similar regarding I/O, but that XFS has some advantages (not used by storagenode, though).
Thanks

@Toyoo
I don't get the database error anymore, but my RAM usage is really high, sometimes around 6 GB.

@daki82
The Unraid OS runs in RAM and boots from a USB stick.
The disk only runs the node. Everything else (Docker) is on a fast M.2 drive or in RAM.

I did not know I could defragment XFS. It will probably take a long time, and will I have to stop the node for it?

Moving to ext4 will probably take a long time, right? In the past I moved a 1 TB node to another disk and it took over a day. How does it compare to NTFS? NTFS would be a lot easier to handle.
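A common migration pattern (hedged sketch; the paths are examples) is two rsync passes: a long first pass while the node keeps running, then a short final pass with the node stopped so only the delta is copied:

```shell
# Example paths; adjust to your mounts. Trailing slashes matter to rsync.
SRC="/mnt/xfs-disk/storagenode/"
DST="/mnt/ext4-disk/storagenode/"
if [ -d "$SRC" ]; then
  rsync -a --info=progress2 "$SRC" "$DST"   # first pass, node can stay online
  # Stop the node, then run a final delta pass:
  # rsync -a --delete "$SRC" "$DST"
else
  echo "adjust SRC/DST before running"
fi
```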

Please NEVER use NTFS under Linux (Unraid is a Linux-based OS); things become even worse than with XFS. It will get corrupted, and to fix the issues you will be forced to connect the drive to a Windows machine.


Fragmentation does not look too bad, or am I wrong?
[screenshot: fragmentation report]
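For reference, XFS fragmentation can be checked read-only with xfs_db, and defragmented online with xfs_fsr (both from xfsprogs; the device and mount point below are hypothetical examples):

```shell
DEV="/dev/sdX1"      # hypothetical device holding the XFS filesystem
MNT="/mnt/storj"     # hypothetical mount point
if command -v xfs_db >/dev/null 2>&1; then
  xfs_db -r -c frag "$DEV" || echo "adjust DEV to your actual device"
  # Online defrag (can run while the node is up, but adds I/O load):
  # xfs_fsr -v "$MNT"
else
  echo "xfsprogs not installed"
fi
```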

@Alexey I can't confirm that. I am running one NTFS node with 7 TB with zero problems and fewer “race lost” errors compared to XFS. The node has been running since the start of Storj v3.

Sorry, I don’t know Unraid, I cannot help you more than just suggesting you need to find what is bottlenecking your I/O.

In short, nobody can say whether it's bad. I have a similar drive which also stuttered at the same fill level of ~7 TB, but with NTFS.

If the drive is all right, nothing bad can happen if you run defrag for a week while the node is running.
But I set the node to be full via config.yaml, to reduce stress.
I also moved the DBs to a flash drive.

On Windows it was no problem.

Maybe we both just have slow, sluggish drives.