Ordersdb error: database disk image is malformed

Cross91 · November 20, 2019, 5:49am

Hello everyone,
I just came across this warning in my log files:

This is how my CLI dashboard looks:

Yesterday evening, everything worked just fine.
Which info can I provide to solve this problem?

nerdatwork · November 20, 2019, 6:21am

Try this:

Cross91 · November 20, 2019, 8:58am

I tried that with my first node (because of that problem I had to start a new node).
Some background:
My first node also ran on my Unraid 6.7.2 server. Unfortunately, I updated the container from the UI, and since then the database(s) have been malformed (only afterwards did I read about the manual shutdown with t-300). So I tried exactly this procedure, but it failed at point 5.2 because Unraid doesn’t know that command (if I remember correctly).
I have to say that I am not a specialist in Linux.
I am more of a user and am happy when something just works.
What I also want to say:
The reason I would like to support Storj is that I have a fiber-optic connection, a running Unraid server, nearly no downtime (UPS), a static IP and unused storage. So sharing storage wouldn’t affect me.
But what does affect me is that things don’t run properly and I don’t have the ability to solve the problem.

So in order not to lose my node again (and maybe Storj forever), maybe a specialist could have a look at my machine over TeamViewer. I think my setup is too good for a storage node to just throw away.
But you decide, and I would like to help as much as I can.

Alexey · November 20, 2019, 9:07am

Unfortunately, the problem is in the Unraid platform: your databases will be corrupted again and again until you downgrade Unraid to v6.6.7 or move to another system.

About the timeout option: it should be -t 300, not t-300. For example, to gracefully stop the container:

docker stop -t 300 storagenode
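
For anyone scripting the shutdown, here is a minimal sketch that wraps the same command in Python. The -t value is the number of seconds Docker waits for the container to exit before killing it. The function names are made up for illustration, and the container name storagenode is taken from the command above.

```python
import subprocess


def docker_stop_command(container="storagenode", timeout_seconds=300):
    """Build the argument list for a graceful `docker stop` with a timeout."""
    # Equivalent to: docker stop -t 300 storagenode
    return ["docker", "stop", "-t", str(timeout_seconds), container]


def graceful_stop(container="storagenode", timeout_seconds=300):
    """Run the stop command; raises CalledProcessError if docker reports failure."""
    subprocess.run(docker_stop_command(container, timeout_seconds), check=True)
```

Calling graceful_stop() before updating or removing the container gives the node time to close its databases cleanly.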

On Unraid you should use its package manager; apt is only for Debian-based systems. So the simpler way is the Docker way, which is described in the same article.

Cross91 · November 20, 2019, 9:15am

Yeah, sorry about that. I meant -t 300.

That’s what I did directly after installing the new node:

OK, I’m really sorry to hear that. Maybe in the future it will run with v6.7.2.
I think I will stop being an Unraid node operator until then.

Is there a description of how to exit without destroying everything?

heunland · November 20, 2019, 4:09pm

Graceful exit is not yet fully implemented, so either wait until it is (hopefully pretty soon) or migrate your node to a different disk/computer https://documentation.storj.io/resources/frequently-asked-questions#how-do-i-migrate-my-node-to-a-new-drive-or-computer

Gank · February 26, 2020, 6:47pm

Got a new error on the same node that was giving the test_table problem.

2020-02-26T17:50:17.024Z INFO bandwidth Performing bandwidth usage rollups
2020-02-26T17:50:17.440Z INFO version running on version v0.33.4
2020-02-26T17:50:39.526Z INFO orders.118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW sending {"count": 2}
2020-02-26T17:50:39.526Z INFO orders.12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs sending {"count": 74}
2020-02-26T17:50:39.526Z INFO orders.12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S sending {"count": 3}
2020-02-26T17:50:39.526Z INFO orders.121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6 sending {"count": 1}
2020-02-26T17:50:39.526Z INFO orders.1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE sending {"count": 4}
2020-02-26T17:50:39.701Z INFO orders.12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs finished
2020-02-26T17:50:39.848Z INFO orders.12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S finished
2020-02-26T17:50:39.945Z INFO orders.1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE finished
2020-02-26T17:50:40.186Z INFO orders.118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW finished
2020-02-26T17:50:40.301Z INFO orders.121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6 finished
2020-02-26T17:51:06.767Z INFO piecestore deleted {"Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "WYG5I7WK4A26E5GTVFRX3VG45QUHYMS6XCVTPGARZP$WJYY75YHA"}
2020-02-26T17:51:17.899Z ERROR orders cleaning archive {"error": "ordersdb error: database disk image is malformed", "errorVerbose": "ordersdb error: database disk image is malformed\n\tstorj.io/storj/storagenode/storagenodedb.(*ordersDB).CleanArchive:321\n\tstorj.io/storj/storagenode/orders.(*Service).cleanArchive:161\n\tstorj.io/storj/storagenode/orders.(*Service).Run.func2:146\n\tstorj.io/common/sync2.(*Cycle).Run:147\n\tstorj.io/common/sync2.(*Cycle).Start.func1:68\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}

I’ve run the database checks and there seem to be some errors. Here’s a sample:

…
On tree page 30010 cell 382: Rowid 482672 out of order
On tree page 30010 cell 381: Rowid 482664 out of order
On tree page 30010 cell 380: Rowid 482656 out of order
On tree page 30010 cell 379: Rowid 482648 out of order
On tree page 30010 cell 378: Rowid 482640 out of order
On tree page 30010 cell 377: Rowid 482632 out of order
On tree page 30010 cell 375: Rowid 482616 out of order
On tree page 30010 cell 374: Rowid 482608 out of order
On tree page 30010 cell 373: Rowid 482600 out of order
On tree page 30010 cell 372: Rowid 482592 out of order
On tree page 30010 cell 371: Rowid 482584 out of order
On tree page 30010 cell 369: Rowid 482568 out of order

I was able to recover it, but I’m worried there are some problems with this node. Is there anything I can do to debug the cause?
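
One way to get an overview of a long integrity_check report is to count the messages per B-tree page, which shows whether the reported problems are concentrated on a few pages. The helper below is only a sketch and not part of any Storj tooling: it recognizes just the "On tree page N cell M: ..." line shape from the sample above, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as "On tree page 30010 cell 382: Rowid 482672 out of order".
LINE_RE = re.compile(r"^On tree page (\d+) cell (\d+): (.+)$")


def summarize_integrity_report(text):
    """Count reported problems per B-tree page number."""
    pages = Counter()
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            pages[int(match.group(1))] += 1
    return pages


sample = """\
On tree page 30010 cell 382: Rowid 482672 out of order
On tree page 30010 cell 381: Rowid 482664 out of order
On tree page 30010 cell 380: Rowid 482656 out of order
"""
print(summarize_integrity_report(sample))  # Counter({30010: 3})
```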

Alexey · February 26, 2020, 9:27pm

Usually the database can become corrupted if:

you use Unraid without the rollback or a fix;
your disk has the write cache enabled;
you had a power loss and do not have a UPS;
you use an external USB disk and the USB controller overheats or does not supply enough power for the HDD;
you use any kind of network-attached drive;
you used a forced stop instead of docker stop -t 300 storagenode.

Gank · February 26, 2020, 9:54pm

These disks do have the write cache enabled!
I’ve disabled it and restarted the node; I will monitor for further errors.

Thanks!

Gank · February 27, 2020, 10:25am

After checking all the databases for this node I still get

2020-02-27T10:21:45.072Z        INFO    version running on version v0.33.4
2020-02-27T10:21:45.085Z        INFO    db.migration    Database Version        {"version": 31}
Error: Error during preflight check for storagenode databases: storage node preflight database error: database disk image is malformed
        storj.io/storj/storagenode/storagenodedb.(*DB).Preflight:314
        main.cmdRun:198
        storj.io/storj/pkg/process.cleanup.func1.2:307
        storj.io/storj/pkg/process.cleanup.func1:325
        github.com/spf13/cobra.(*Command).execute:826
        github.com/spf13/cobra.(*Command).ExecuteC:914
        github.com/spf13/cobra.(*Command).Execute:864
        storj.io/storj/pkg/process.ExecWithCustomConfig:84
        storj.io/storj/pkg/process.ExecCustomDebug:66
        main.main:328
        runtime.main:203

I’m running this command for all the databases and the output is just “Ok”:

sqlite3 *.db "PRAGMA integrity_check;"

Alexey · February 27, 2020, 9:09pm

You must explicitly specify the name of each database. The application can work with only one file at a time.
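
A minimal sketch of checking each database file separately, using Python's built-in sqlite3 module rather than the sqlite3 command-line tool. The function names and the directory argument are assumptions for illustration, and the sketch assumes the node is stopped while the check runs.

```python
import sqlite3
from pathlib import Path


def check_database(path):
    """Run PRAGMA integrity_check on one SQLite file and return the result rows."""
    # Open read-only so that checking never creates or modifies a file.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
        try:
            rows = conn.execute("PRAGMA integrity_check;").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError as exc:
        # Badly damaged files can fail before the check returns any rows.
        return [f"error: {exc}"]
    return [row[0] for row in rows]


def check_all(directory):
    """Check every *.db file in a directory, one file at a time."""
    return {p.name: check_database(p) for p in sorted(Path(directory).glob("*.db"))}
```

Calling check_all on the storage directory returns a dictionary that maps each file name to its own result, so a single damaged database cannot hide behind an "ok" from another file.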