Database disk image malformed after a crash

Hi,
I had a node crash and after reboot my node doesn’t start enymore, lokking my log i find this error:

Error: Error during preflight check for storagenode databases: storage node preflight database error: database disk image is malformed
storj.io/storj/storagenode/storagenodedb.(*DB).Preflight:327
main.cmdRun:198
storj.io/storj/pkg/process.cleanup.func1.2:312
storj.io/storj/pkg/process.cleanup.func1:330
github.com/spf13/cobra.(*Command).execute:826
github.com/spf13/cobra.(*Command).ExecuteC:914
github.com/spf13/cobra.(*Command).Execute:864
storj.io/storj/pkg/process.ExecWithCustomConfig:84
storj.io/storj/pkg/process.ExecCustomDebug:66
main.main:328
runtime.main:203

How I can solve it ?

thx

Please, use this article:


and check all databases to figure out which one is malformed.
Then fix it, as described.

I have use the procedure but on orders.db I get this error:

Error: near line 3796082: no such table: order_archive_

Error: near line 3796083: no such table: order_archive_

Error: near line 3796084: unable to open database file

Error: near line 3796085: no such table: versions

Error: near line 3796086: unable to open database file

Error: near line 3796087: no such table: main.order_archive_

what I can do ?

I have a backup of all .db files but 2 days old, can I restore it ?

Make sure that you used version of sqlite3 not older than v3.25.2.
If it’s, then your orders.db is heavily corrupted. In such case we will recreate it.

  1. Stop the storagenode
  2. Remove the orders.db
  3. Execute either with a local sqlite3 or with a docker:
sqlite3 orders.db
CREATE TABLE unsent_order (
                                                satellite_id  BLOB NOT NULL,
                                                serial_number BLOB NOT NULL,

                                                order_limit_serialized BLOB      NOT NULL, -- serialized pb.OrderLimit
                                                order_serialized       BLOB      NOT NULL, -- serialized pb.Order
                                                order_limit_expiration TIMESTAMP NOT NULL, -- when is the deadline for sending it

                                                uplink_cert_id INTEGER NOT NULL,

                                                FOREIGN KEY(uplink_cert_id) REFERENCES certificate(cert_id)
                                        );
CREATE TABLE order_archive_ (
                                                satellite_id  BLOB NOT NULL,
                                                serial_number BLOB NOT NULL,

                                                order_limit_serialized BLOB NOT NULL,
                                                order_serialized       BLOB NOT NULL,

                                                uplink_cert_id INTEGER NOT NULL,

                                                status      INTEGER   NOT NULL,
                                                archived_at TIMESTAMP NOT NULL,

                                                FOREIGN KEY(uplink_cert_id) REFERENCES certificate(cert_id)
                                        );
CREATE UNIQUE INDEX idx_orders ON unsent_order(satellite_id, serial_number);
CREATE TABLE versions (version int, commited_at text);
CREATE INDEX idx_order_archived_at ON order_archive_(archived_at);
.exit
  1. Try to start storagenode
  2. Look into logs

same problem with used_serial.db too

Error: near line 168045: UNIQUE constraint failed: used_serial_.satellite_id, used_serial_.serial_number

You can ignore that error.
You need to check an integrity for all databases, and you should try to fix only malformed databases, not everyone.

ok, running the first command on order.db

CREATE TABLE unsent_order (
satellite_id BLOB NOT NULL,
serial_number BLOB NOT NULL,
order_limit_serialized BLOB NOT NULL, – serialized pb.OrderLimit
order_serialized BLOB NOT NULL, – serialized pb.Order
order_limit_expiration TIMESTAMP NOT NULL, – when is the deadline for s
ending it
uplink_cert_id INTEGER NOT NULL,
FOREIGN KEY(uplink_cert_id) REFERENCES certificate(cert_id)
);

I have receive the error:

Error: unknown column “uplink_cert_id” in foreign key definition

sqlite3 version

docker run --rm -it --mount type=bind,source="/mnt/storage",destination=/storage sstc/sqlite3 sh
/data # sqlite3 --version

3.30.1 2019-10-10 20:19:45 18db032d058f1436ce3dea84081f4ee5a0f2259ad97301d43c426bc7f3dfalt2

Sorry, the excess line break. I edited the SQL in my post.

sqlite3 orders.db
CREATE TABLE unsent_order (
                                                satellite_id  BLOB NOT NULL,
                                                serial_number BLOB NOT NULL,

                                                order_limit_serialized BLOB      NOT NULL, -- serialized pb.OrderLimit
                                                order_serialized       BLOB      NOT NULL, -- serialized pb.Order
                                                order_limit_expiration TIMESTAMP NOT NULL, -- when is the deadline for sending it

                                                uplink_cert_id INTEGER NOT NULL,

                                                FOREIGN KEY(uplink_cert_id) REFERENCES certificate(cert_id)
                                        );
CREATE TABLE order_archive_ (
                                                satellite_id  BLOB NOT NULL,
                                                serial_number BLOB NOT NULL,

                                                order_limit_serialized BLOB NOT NULL,
                                                order_serialized       BLOB NOT NULL,

                                                uplink_cert_id INTEGER NOT NULL,

                                                status      INTEGER   NOT NULL,
                                                archived_at TIMESTAMP NOT NULL,

                                                FOREIGN KEY(uplink_cert_id) REFERENCES certificate(cert_id)
                                        );
CREATE UNIQUE INDEX idx_orders ON unsent_order(satellite_id, serial_number);
CREATE TABLE versions (version int, commited_at text);
CREATE INDEX idx_order_archived_at ON order_archive_(archived_at);
.exit

I have successful execute the command but the problem is still there:

fabrizio@storenode : /mnt/storage $ docker logs --tail 20 storagenode

2020-04-07T23:45:03.139Z INFO operator wallet: 0xB8f8a2A65916b1442B824833Cfd6101910f8Ab7b

2020-04-07T23:45:03.758Z INFO version running on version v1.0.1

2020-04-07T23:45:03.769Z INFO db.migration.27 Add index archived_at to ordersDB

Error: Error creating tables for master database on storagenode: migrate: no such table: main.order_archive_

storj.io/storj/private/migrate.SQL.Run:251

storj.io/storj/private/migrate.(*Migration).Run.func1:171

storj.io/storj/private/dbutil/txutil.withTxOnce:67

storj.io/storj/private/dbutil/txutil.WithTx:36

storj.io/storj/private/migrate.(*Migration).Run:170

storj.io/storj/storagenode/storagenodedb.(*DB).CreateTables:285

main.cmdRun:188

storj.io/storj/pkg/process.cleanup.func1.2:312

storj.io/storj/pkg/process.cleanup.func1:330

github.com/spf13/cobra.(*Command).execute:826

github.com/spf13/cobra.(*Command).ExecuteC:914

github.com/spf13/cobra.(*Command).Execute:864

storj.io/storj/pkg/process.ExecWithCustomConfig:84

storj.io/storj/pkg/process.ExecCustomDebug:66

main.main:328

runtime.main:203

what’s happen if i restore my .db files backuped 2 days ago ?

Why aren’t nodes fixing their databases automatically?

Please, stop the storagenode, remove the database orders.db and try again. Also, make sure that you created the orders.db in the actual storagenode storage’s location.

You will start from the beginning of fixing them.

Too many different cases, how SNO could break a database. It’s much better to configure node correctly and such problems will not happen. We have thousands SNOs, but only dozen have different problems because of non optimal configuration.

We trying to help people to fix their configuration, otherwise their node will be disqualified sooner or later because of mistake in configuration.

3 Likes

I have try again:

This is the starting point on storagenode storage’s location:

fabrizio@storenode:/mnt/storage$ ll
total 2350864
drwxr-xr-x 7 root root 4096 Apr 8 10:04 ./
drwxr-xr-x 8 root root 4096 Apr 7 10:18 …/
-rw-r–r-- 1 root root 7905280 Apr 7 12:48 bandwidth.db
drwxr-xr-x 2 root root 4096 Jul 16 2019 blob/
drwxr-xr-x 7 root root 4096 Feb 11 21:36 blobs/
-rw-r–r-- 1 root root 29556308 Apr 8 00:10 dump_all.sql
-rw-r–r-- 1 root root 29556281 Apr 8 00:10 dump_all_notrans.sql
drwxr-xr-x 2 root root 4096 Apr 7 09:05 garbage/
-rw-r–r-- 1 root root 32768 Apr 7 12:00 heldamount.db
-rw-r–r-- 1 root root 24576 Apr 7 11:58 info.db
-rw-r–r-- 1 root root 24576 Apr 7 12:00 notifications.db
-rw-r–r-- 1 root root 8192 Apr 8 08:32 orders.db
-rw-r–r-- 1 root root 73728 Apr 7 12:00 piece_expiration.db
-rw-r–r-- 1 root root 24576 Apr 7 12:00 piece_spaced_used.db
-rw-r–r-- 1 root root 24576 Apr 7 11:58 pieceinfo.db
-rw-r–r-- 1 root root 20480 Apr 7 12:00 reputation.db
-rw-r–r-- 1 root root 32768 Apr 7 11:58 satellites.db
-rw-r–r-- 1 root root 118784 Apr 7 12:00 storage_usage.db
drwxr-xr-x 2 root root 65536 Apr 7 10:08 temp/
drwxr-xr-x 7 root root 4096 Feb 17 10:41 trash/
-rw-r–r-- 1 root root 40951808 Apr 8 01:14 used_serial.db

I have removed the old orders.db

fabrizio@storenode:/mnt/storage$ sudo rm orders.db
[sudo] password for fabrizio:
fabrizio@storenode:/mnt/storage$ ll
total 2350856
drwxr-xr-x 7 root root 4096 Apr 8 10:05 ./
drwxr-xr-x 8 root root 4096 Apr 7 10:18 …/
-rw-r–r-- 1 root root 7905280 Apr 7 12:48 bandwidth.db
drwxr-xr-x 2 root root 4096 Jul 16 2019 blob/
drwxr-xr-x 7 root root 4096 Feb 11 21:36 blobs/
-rw-r–r-- 1 root root 29556308 Apr 8 00:10 dump_all.sql
-rw-r–r-- 1 root root 29556281 Apr 8 00:10 dump_all_notrans.sql
drwxr-xr-x 2 root root 4096 Apr 7 09:05 garbage/
-rw-r–r-- 1 root root 32768 Apr 7 12:00 heldamount.db
-rw-r–r-- 1 root root 24576 Apr 7 11:58 info.db
-rw-r–r-- 1 root root 24576 Apr 7 12:00 notifications.db
-rw-r–r-- 1 root root 73728 Apr 7 12:00 piece_expiration.db
-rw-r–r-- 1 root root 24576 Apr 7 12:00 piece_spaced_used.db
-rw-r–r-- 1 root root 24576 Apr 7 11:58 pieceinfo.db
-rw-r–r-- 1 root root 20480 Apr 7 12:00 reputation.db
-rw-r–r-- 1 root root 32768 Apr 7 11:58 satellites.db
-rw-r–r-- 1 root root 118784 Apr 7 12:00 storage_usage.db
drwxr-xr-x 2 root root 65536 Apr 7 10:08 temp/
drwxr-xr-x 7 root root 4096 Feb 17 10:41 trash/
-rw-r–r-- 1 root root 40951808 Apr 8 01:14 used_serial.db

then run the procedure that you wrote me

fabrizio@storenode:/mnt/storage$ docker run --rm -it --mount type=bind,source="/mnt/storage",destination=/storage sstc/sqlite3 sh
/data # sqlite3 orders.db
SQLite version 3.30.1 2019-10-10 20:19:45
Enter “.help” for usage hints.
sqlite> CREATE TABLE unsent_order (
…> satellite_id BLOB NOT NULL,
…> serial_number BLOB NOT NULL,
…>
…> order_limit_serialized BLOB NOT NULL, – serialized pb.OrderLimit
…> order_serialized BLOB NOT NULL, – serialized pb.Order
…> order_limit_expiration TIMESTAMP NOT NULL, – when is the deadline for sending it
…>
…> uplink_cert_id INTEGER NOT NULL,
…>
…> FOREIGN KEY(uplink_cert_id) REFERENCES certificate(cert_id)
…> );
sqlite> CREATE TABLE order_archive_ (
…> satellite_id BLOB NOT NULL,
…> serial_number BLOB NOT NULL,
…>
…> order_limit_serialized BLOB NOT NULL,
…> order_serialized BLOB NOT NULL,
…>
…> uplink_cert_id INTEGER NOT NULL,
…>
…> status INTEGER NOT NULL,
…> archived_at TIMESTAMP NOT NULL,
…>
…> FOREIGN KEY(uplink_cert_id) REFERENCES certificate(cert_id)
…> );
sqlite> CREATE UNIQUE INDEX idx_orders ON unsent_order(satellite_id, serial_number);
sqlite> CREATE TABLE versions (version int, commited_at text);
sqlite> CREATE INDEX idx_order_archived_at ON order_archive_(archived_at);
sqlite> .exit
/data # exit

this is the final situation:

fabrizio@storenode:/mnt/storage$ ll
total 2350864
drwxr-xr-x 7 root root 4096 Apr 8 10:05 ./
drwxr-xr-x 8 root root 4096 Apr 7 10:18 …/
-rw-r–r-- 1 root root 7905280 Apr 7 12:48 bandwidth.db
drwxr-xr-x 2 root root 4096 Jul 16 2019 blob/
drwxr-xr-x 7 root root 4096 Feb 11 21:36 blobs/
-rw-r–r-- 1 root root 29556308 Apr 8 00:10 dump_all.sql
-rw-r–r-- 1 root root 29556281 Apr 8 00:10 dump_all_notrans.sql
drwxr-xr-x 2 root root 4096 Apr 7 09:05 garbage/
-rw-r–r-- 1 root root 32768 Apr 7 12:00 heldamount.db
-rw-r–r-- 1 root root 24576 Apr 7 11:58 info.db
-rw-r–r-- 1 root root 24576 Apr 7 12:00 notifications.db
-rw-r–r-- 1 root root 8192 Apr 8 10:05 orders.db
-rw-r–r-- 1 root root 73728 Apr 7 12:00 piece_expiration.db
-rw-r–r-- 1 root root 24576 Apr 7 12:00 piece_spaced_used.db
-rw-r–r-- 1 root root 24576 Apr 7 11:58 pieceinfo.db
-rw-r–r-- 1 root root 20480 Apr 7 12:00 reputation.db
-rw-r–r-- 1 root root 32768 Apr 7 11:58 satellites.db
-rw-r–r-- 1 root root 118784 Apr 7 12:00 storage_usage.db
drwxr-xr-x 2 root root 65536 Apr 7 10:08 temp/
drwxr-xr-x 7 root root 4096 Feb 17 10:41 trash/
-rw-r–r-- 1 root root 40951808 Apr 8 01:14 used_serial.db
fabrizio@storenode:/mnt/storage$

then I have run the storagenode

fabrizio@storenode: /mnt/storage $ docker start storagenode
storagenode

this is the log with the same error

fabrizio@storenode : /mnt/storage $ docker logs --tail 20 storagenode
2020-04-08T08:12:04.334Z INFO operator wallet: 0xB8f8a2A65916b1442B824833Cfd6101910f8Ab7b
2020-04-08T08:12:04.999Z INFO version running on version v1.1.1
2020-04-08T08:12:05.008Z INFO db.migration.27 Add index archived_at to ordersDB
Error: Error creating tables for master database on storagenode: migrate: no such table: main.order_archive_
storj.io/storj/private/migrate.SQL.Run:251
storj.io/storj/private/migrate.(*Migration).Run.func1:171
storj.io/storj/private/dbutil/txutil.withTxOnce:67
storj.io/storj/private/dbutil/txutil.WithTx:36
storj.io/storj/private/migrate.(*Migration).Run:170
storj.io/storj/storagenode/storagenodedb.(*DB).CreateTables:285
main.cmdRun:189
storj.io/private/process.cleanup.func1.2:312
storj.io/private/process.cleanup.func1:330
github.com/spf13/cobra.(*Command).execute:840
github.com/spf13/cobra.(*Command).ExecuteC:945
github.com/spf13/cobra.(*Command).Execute:885
storj.io/private/process.ExecWithCustomConfig:84
storj.io/private/process.ExecCustomDebug:66
main.main:329
runtime.main:203

@Alexey Very good spirit, you rock :+1:

1 Like

could you please, show the result of the command:

docker inspect storagenode -f '{{.Mounts}}'

fabrizio@storenode : ~ $ docker inspect storagenode -f ‘{{.Mounts}}’

[{bind /mnt/certs /app/identity true rprivate} {bind /mnt /app/config true rprivate} {volume c5a27f4ea65663ccc1117a71abd663a4ba68d4757da5f8427314becc07ae26be /var/lib/docker/volumes/c5a27f4ea65663ccc1117a71abd663a4ba68d4757da5f8427314becc07ae26be/_data /root/.local/share/storj/storagenode local true }]

Please,

  1. Stop the storagenode
  2. execute
sqlite3 orders.db
drop INDEX idx_order_archived_at;
.exit
  1. Try to run storagenode
  2. Look into logs

I have solve 2 days ago using a *.db files of previous backup.

1 Like