Error: bandwidthdb error: disk I/O error: bad file descriptor

abc · September 2, 2020, 4:05pm

Hello there,
My node was offline for a long time (more than a month). When I restarted it on 30.08, the first thing I did was update to 1.10.1, and now the node runs. The suspensions are gone (except saltlake), but now more and more errors are appearing.
First error:
Error: bandwidthdb error: disk I/O error: bad file descriptor
–> This error is shown very often when I monitor the node through the CLI dashboard.

2020-09-02T16:01:45.400Z ERROR piecestore failed to add order {“error”: “ordersdb error: database is locked”, “errorVerbose”: “ordersdb error: database is locked\n\tstorj.io/storj/storagenode/storagenodedb.(*ordersDB).Enqueue:53\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).saveOrder:665\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:409\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:996\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:107\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:56\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51”}
2020-09-02T16:01:45.400Z INFO piecestore uploaded {“Piece ID”: “XQ7KAEFWXNBI43ZDM3MXKS6QMRNOXFF7O55WTWFQ4P2RGZMWYXYA”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “PUT”}
–> Errors like this come all the time, and after such an error no more data is stored.
–> After this error, bandwidth egress stopped at 11.1MB and ingress at 174.2MB.
–> The used disk space also didn't change any more.
–> After a restart the node works just fine until such an error comes again.

As I just saw, the node works fine until I open the CLI dashboard or the web interface
(or it was just a coincidence that the error came at the same moment).

So, to minimize the node's downtime, my temporary solution is to restart the node automatically every 12 hours.

Sorry for my bad English.

I hope you can help me.

nerdatwork · September 2, 2020, 4:41pm

Check your disk for errors. Also, how is your HDD connected?

abc · September 2, 2020, 4:43pm

To be as clear as possible:
My HDD is connected to a “SATA-to-USB converter”, and this converter is connected directly to the server via USB.

In case it matters, my server runs Debian 10 with all updates installed.

nerdatwork · September 2, 2020, 5:02pm

It's possible the connection is not stable, hence the disk I/O error. Please check the disk for errors too.

abc · September 2, 2020, 5:57pm

I did, but the disk works just fine as far as I can tell. It is a 4TB drive, with 3TB for Storj and 1TB for my own stuff, and I have never had problems using it.

Toyoo · September 2, 2020, 6:53pm

I had a “disk working just fine”, except that after a few months I noticed that data was being silently corrupted somewhere. I only noticed it because I manually checksummed some important files. When I started testing each component separately, this “somewhere” turned out to be a bad USB cable. And it was just a bit or two per gigabyte of data: not enough to spot casually, but definitely enough to break most large files. Make sure your infrastructure is OK.

abc · September 3, 2020, 4:18am

I will check that and report back.
Maybe that's the solution.

abc · September 5, 2020, 8:08am

So I checked that: I checksummed a 10GB and a 100GB file, copied the files multiple times (the 10GB file over 50 times, the 100GB file over 20 times), and checksummed each copy. Every time the checksums were identical.
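
For anyone who wants to repeat this kind of test, a minimal sketch of a copy-and-checksum loop in Python might look like the following. The function names, the file naming scheme, and the choice of SHA-256 are assumptions for illustration only; the post does not say which tool or hash was actually used.

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, target_dir, rounds):
    """Copy `source` into `target_dir` `rounds` times.

    Each copy is checksummed, compared against the checksum of the
    source, and then deleted. Returns the number of mismatching copies.
    """
    expected = sha256_of(source)
    mismatches = 0
    for i in range(rounds):
        # Hypothetical naming scheme for the temporary copies.
        copy_path = os.path.join(target_dir, f"copy_{i}.bin")
        shutil.copyfile(source, copy_path)
        if sha256_of(copy_path) != expected:
            mismatches += 1
        os.remove(copy_path)
    return mismatches
```

Pointing `target_dir` at the USB-attached drive and calling something like `copy_and_verify("/path/to/testfile", "/path/to/usb/drive", 50)` (both paths are placeholders) would return 0 if every copy matched. One caveat: a copy that is read back right after being written may be served from the operating system's cache rather than from the disk, so a test like this is more meaningful with files larger than the machine's RAM.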