"satellites" table missing from the database after unexpected shutdown

Step 42 (last one?):

“no such table: used_serial_”

2020-08-04T08:04:38.396Z INFO Telemetry enabled
2020-08-04T08:04:38.403Z INFO db.migration.34 Add suspended field to satellites db
2020-08-04T08:04:38.413Z INFO db.migration.36 Add joined_at field to satellites db
2020-08-04T08:04:38.414Z INFO db.migration.37 Drop payments table as unused
2020-08-04T08:04:38.418Z INFO db.migration.38 Backfill joined_at column
2020-08-04T08:04:38.419Z INFO db.migration.39 Add unknown_audit_reputation_alpha and unknown_audit_reputation_beta fields to satellites db
2020-08-04T08:04:38.420Z INFO db.migration.40 Add unknown_audit_reputation_score field to satellites db
2020-08-04T08:04:38.421Z INFO db.migration.41 Make satellite_id foreign key in satellite_exit_progress table
2020-08-04T08:04:38.455Z INFO db.migration.42 Drop used serials table
Error: Error creating tables for master database on storagenode: migrate: no such table: used_serial_
storj.io/storj/storagenode/storagenodedb.(*DB).Migration.func10:1484
storj.io/storj/private/migrate.Func.Run:285
storj.io/storj/private/migrate.(*Migration).Run.func1:179
storj.io/storj/private/dbutil/txutil.withTxOnce:75
storj.io/storj/private/dbutil/txutil.WithTx:36
storj.io/storj/private/migrate.(*Migration).Run:178
storj.io/storj/storagenode/storagenodedb.(*DB).MigrateToLatest:295
main.cmdRun:183
storj.io/private/process.cleanup.func1.4:353
storj.io/private/process.cleanup.func1:371
github.com/spf13/cobra.(*Command).execute:840
github.com/spf13/cobra.(*Command).ExecuteC:945
github.com/spf13/cobra.(*Command).Execute:885
storj.io/private/process.ExecWithCustomConfig:88
storj.io/private/process.ExecCustomDebug:70
main.main:323
runtime.main:203

Thank you very much.

That one is a bit more tricky, as my node already doesn’t have that table anymore. Let’s hope it’ll just accept a dummy table.

Rename used_serial.db

sqlite3 used_serial.db

CREATE TABLE versions (version int, commited_at text);
CREATE TABLE used_serial_ (col int);

.quit
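Put together as a script, the steps above look roughly like this. This sketch runs against a scratch directory for safety — on a real node you would stop the node first and do this inside the storage folder instead:

```shell
# Sketch of the dummy-table workaround, run against a scratch directory.
cd "$(mktemp -d)"
touch used_serial.db                   # stand-in for the damaged file
mv used_serial.db used_serial.db.bak   # keep the original around

# Recreate a minimal used_serial.db so migration step 42 has a table to drop.
sqlite3 used_serial.db <<'SQL'
CREATE TABLE versions (version int, commited_at text);
CREATE TABLE used_serial_ (col int);
SQL

# Confirm both tables exist.
sqlite3 used_serial.db '.tables'
```

Keeping the renamed original means you can always put it back if the migration still fails.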

Here comes the big one:

2020-08-04T08:23:15.115Z INFO Telemetry enabled
2020-08-04T08:23:15.124Z INFO db.migration.42 Drop used serials table
2020-08-04T08:23:15.148Z INFO db.migration.43 Add table payments
2020-08-04T08:23:15.167Z INFO db.migration Database Version {"version": 43}
Error: Error during preflight check for storagenode databases: storage node preflight database error: pieceinfo: expected schema does not match actual:   &dbschema.Schema{
-  Tables: []*dbschema.Table{
-  s"Name: pieceinfo_\nColumns:\n\tName: deletion_failed_at\n\tType: TIMESTAMP\n\tNullable: true\n\tDefault: \"\"\n\tReference: nil\n\tName: order_limit\n\tType: BLOB\n\tNullable: false\n\tDefault: \"\"\n\tReference: nil\n\tName: piece_creation\n\tType: TIMESTAMP\n\tNullable: false\n\tDefault: \"\"\n\tReference: nil\n\tName: piece_expiration\n\tType: TIMESTAMP\n\tNullable: true\n\tDefault: \"\"\n\tReference: nil\n\tName: piece_id\n\tType: BLOB\n\tNullable: false\n\tDefault: \"\"\n\tReference: nil\n\tName: piece_size\n\tType: BIGINT\n\tNullable: false\n\tDefault: \"\"\n\tReference: nil\n\tName: satellite_id\n\tType: BLOB\n\tNullable: false\n\tDefault: \"\"\n\tReference: nil\n\tName: uplink_cert_id\n\tType: INTEGER\n\tNullable: false\n\tDefault: \"\"\n\tReference: Reference<Table: certificate, Column: cert_id, OnDelete: , OnUpdate: >\n\tName: uplink_piece_hash\n\tType: BLOB\n\tNullable: false\n\tDefault: \"\"\n\tReference: nil\nPrimaryKey: \nUniques:\n\t",
-  },
+  Tables: nil,
-  Indexes: []*dbschema.Index{
-  s`Index<Table: pieceinfo_, Name: idx_pieceinfo__expiration, Columns: piece_expiration, Unique: false, Partial: "piece_expiration IS NOT NULL">`,
-  s`Index<Table: pieceinfo_, Name: pk_pieceinfo_, Columns: satellite_id piece_id, Unique: true, Partial: "">`,
-  },
+  Indexes: nil,
  }

storj.io/storj/storagenode/storagenodedb.(*DB).Preflight:346
main.cmdRun:193
storj.io/private/process.cleanup.func1.4:353
storj.io/private/process.cleanup.func1:371
github.com/spf13/cobra.(*Command).execute:840
github.com/spf13/cobra.(*Command).ExecuteC:945
github.com/spf13/cobra.(*Command).Execute:885
storj.io/private/process.ExecWithCustomConfig:88
storj.io/private/process.ExecCustomDebug:70
main.main:323
runtime.main:203

Let’s hope this does the trick. Though I worry with such significant damage to databases, there may be other issues.

Rename pieceinfo.db

sqlite3 pieceinfo.db

CREATE TABLE pieceinfo_ (
                                                satellite_id     BLOB      NOT NULL,
                                                piece_id         BLOB      NOT NULL,
                                                piece_size       BIGINT    NOT NULL,
                                                piece_expiration TIMESTAMP,

                                                order_limit       BLOB    NOT NULL,
                                                uplink_piece_hash BLOB    NOT NULL,
                                                uplink_cert_id    INTEGER NOT NULL,

                                                deletion_failed_at TIMESTAMP,
                                                piece_creation TIMESTAMP NOT NULL,

                                                FOREIGN KEY(uplink_cert_id) REFERENCES certificate(cert_id)
                                        );
CREATE UNIQUE INDEX pk_pieceinfo_ ON pieceinfo_(satellite_id, piece_id);
CREATE INDEX idx_pieceinfo__expiration ON pieceinfo_(piece_expiration) WHERE piece_expiration IS NOT NULL;

.quit

I think you’re right, something significant has happened to the database. Another schema error:

2020-08-04T09:00:29.204Z INFO Telemetry enabled
2020-08-04T09:00:29.210Z INFO db.migration Database Version {"version": 43}
Error: Error during preflight check for storagenode databases: storage node preflight database error: storage_usage: expected schema does not match actual:   &dbschema.Schema{
-  Tables: []*dbschema.Table{
-  s"Name: storage_usage\nColumns:\n\tName: at_rest_total\n\tType: REAL\n\tNullable: false\n\tDefault: \"\"\n\tReference: nil\n\tName: interval_start\n\tType: TIMESTAMP\n\tNullable: false\n\tDefault: \"\"\n\tReference: nil\n\tName: satellite_id\n\tType: BLOB\n\tNullable: false\n\tDefault: \"\"\n\tReference: nil\nPrimaryKey: interval_start satellite_id\nUniques:\n\t",
-  },
+  Tables: nil,
   Indexes: nil,
  }

storj.io/storj/storagenode/storagenodedb.(*DB).Preflight:346
main.cmdRun:193
storj.io/private/process.cleanup.func1.4:353
storj.io/private/process.cleanup.func1:371
github.com/spf13/cobra.(*Command).execute:840
github.com/spf13/cobra.(*Command).ExecuteC:945
github.com/spf13/cobra.(*Command).Execute:885
storj.io/private/process.ExecWithCustomConfig:88
storj.io/private/process.ExecCustomDebug:70
main.main:323
runtime.main:203

The schemas I use are from v1.9.5, are you already on that version? It could help to update if not.

I think I’m on the same version because I downloaded the latest Docker image from the official Docker Hub repository yesterday. Is there any way to know exactly which version I’m running when using a Docker container?

The Docker image has been updated, so running docker pull storjlabs/storagenode:latest would do it. Make sure your run command also uses the :latest tag.

I removed the container and the image. Then I pulled the “latest” tagged image (Image ID: 18b91207b1ab) using the command you typed. After that I ran the container and got the same schema error.

Oh whoops, I overlooked that it’s actually a different schema. Let’s just keep going then, I guess.

Rename storage_usage.db

sqlite3 storage_usage.db

CREATE TABLE storage_usage (
                                                satellite_id BLOB NOT NULL,
                                                at_rest_total REAL NOT NULL,
                                                interval_start TIMESTAMP NOT NULL,
                                                PRIMARY KEY (satellite_id, interval_start)
                                        );

.quit

In the meantime, I just want to double-check. Is it possible that you changed the path to storage and perhaps created new empty db files? It seems every db file is broken. That is highly unlikely to happen on a sudden shutdown, as a lot of these dbs are very infrequently written to.

I’m thinking you may have created empty db files by trying to do a consistency check on them. With that in mind, you should probably check all paths and have a look at file sizes, as well as the size of the corresponding blobs folder. If we start the node with a wrong path or no data after finally recreating all dbs, your node will be disqualified within minutes. So let’s make sure to check that first.

I checked the paths and they are fine. The blobs folder has the expected size, so I tried something.

I moved all the db files to a temporary folder, and then ran the node without them. This is what I got:

2020-08-04T11:05:26.567Z INFO Telemetry enabled
2020-08-04T11:05:27.086Z INFO db.migration Database Created {"version": 43}
2020-08-04T11:05:27.662Z INFO preflight:localtime start checking local system clock with trusted satellites' system clock.
2020-08-04T11:05:28.547Z INFO preflight:localtime local system clock is in sync with trusted satellites' system clock.
2020-08-04T11:05:28.547Z INFO bandwidth Performing bandwidth usage rollups
2020-08-04T11:05:28.547Z INFO trust Scheduling next refresh {"after": "6h56m50.354746389s"}
2020-08-04T11:05:28.548Z INFO Node 1v7Zw-------------------------------------------eqY1 started
2020-08-04T11:05:28.548Z INFO Public server started on [::]:28967
2020-08-04T11:05:28.548Z INFO Private server started on 127.0.0.1:7778
2020-08-04T11:05:29.528Z INFO piecestore download started {"Piece ID": "QYEXVYV4OZOHDQBFCL3EV2Z2AAHUNRLUAHFIOCPB4VLRUEY45YPA", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET"}
2020-08-04T11:05:30.019Z INFO piecestore downloaded {"Piece ID": "QYEXVYV4OZOHDQBFCL3EV2Z2AAHUNRLUAHFIOCPB4VLRUEY45YPA", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET"}
2020-08-04T11:05:30.019Z INFO piecestore downloaded {"Piece ID": "QYEXVYV4OZOHDQBFCL3EV2Z2AAHUNRLUAHFIOCPB4VLRUEY45YPA", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET"}
2020-08-04T11:05:32.251Z INFO piecestore download started {"Piece ID": "3Z5IOFWZYM2XLDGZGQW2CFV43UFYDN22XQZ3S63JSYXHE5PDAGQA", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET"}
2020-08-04T11:05:32.418Z INFO piecestore downloaded {"Piece ID": "3Z5IOFWZYM2XLDGZGQW2CFV43UFYDN22XQZ3S63JSYXHE5PDAGQA", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET"}
2020-08-04T11:05:34.701Z INFO piecestore download started {"Piece ID": "YATRQLVNS25AWAQVHUIAT7JPRMJIMBKBZW6GXTINXBSAHSA7MVNQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET"}
2020-08-04T11:05:34.936Z INFO piecestore download started {"Piece ID": "OOOZ7ECRKBKXSPHZNUIUOGJ7I3QAWOPLNAR4H6VAXOOZ6TOR5IHA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET"}
2020-08-04T11:05:36.158Z INFO piecestore download started {"Piece ID": "F4BD3ZNYC67TZIR3GTOEUBTRFZIARS4PZ5NNPP4KKTN6MVMH7EQQ", "Satellite ID":
..... 

After a few seconds I stopped the node, fearing disqualification. I checked the folders again and the db files had been regenerated. What conclusion can we draw from this? Can the node regenerate the database files and run normally? Or will the node be disqualified after some internal checks?

This may actually be your simplest solution. You can have a look at the original db files that you backed up and the ones you hadn’t yet changed. I’m guessing they are all very small.

We were basically one by one doing the same thing the node now did automatically. When there are no db files, the node recreates them as empty. You’d be missing all your old stats so the dashboard will likely show wrong data. But we were kind of heading that way anyway.

The upside is that the db data is all non-essential. The most important part is that the actual customer data is still there. And from the successful downloads in the logs you showed, they are.

So we could continue going one by one, but it seems we would effectively end up with empty new databases one way or the other. You might as well keep it running with the new automatically created databases.
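For reference, the manual equivalent of what the node just did automatically can be sketched like this. The sketch is demonstrated on a scratch directory; on a real node you would stop the container, point STORAGE at your actual storage folder, and keep the backup until the node has run cleanly for a while:

```shell
# Move the db files aside; the node recreates any missing dbs on next start.
STORAGE=$(mktemp -d)
BACKUP=$(mktemp -d)
touch "$STORAGE/orders.db" "$STORAGE/pieceinfo.db"   # stand-in db files for the demo

mv "$STORAGE"/*.db "$BACKUP"/

# The backup now holds the old files, and the storage folder has none,
# so the node will create fresh (empty) databases when it starts.
ls "$BACKUP"
```

Keeping the moved files around costs nothing and preserves the option of recovering individual databases later.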

The node is running and everything looks all right after watching the logs. There are no errors, so I am happy with it. The statistics data has been lost and doesn’t show on the dashboard, but that doesn’t matter.

I will keep an eye on it for 24 hours and if everything is right I will close the topic as solved.

Thank you @BrightSilence and @alexey for your time. I appreciate your job on this community. This would not be the same without people like you.

Thank you very much.


Hi all,
I have the same problem after an auto update. What can be done?

2020-08-04T13:37:49.950Z INFO Telemetry enabled
2020-08-04T13:37:49.957Z INFO db.migration Database Version {"version": 43}
Error: Error during preflight check for storagenode databases: storage node preflight database error: orders: expected schema does not match actual: &dbschema.Schema{
  Tables: []*dbschema.Table{&{Name: "order_archive_", Columns: []*dbschema.Column{&{Name: "archived_at", Type: "TIMESTAMP"}, &{Name: "order_limit_serialized", Type: "BLOB"}, &{Name: "order_serialized", Type: "BLOB"}, &{Name: "satellite_id", Type: "BLOB"}, &{Name: "serial_number", Type: "BLOB"}, &{Name: "status", Type: "INTEGER"}, &{Name: "uplink_cert_id", Type: "INTEGER", Reference: &dbschema.Reference{Table: "certificate", Column: "cert_id"}}}}, &{Name: "unsent_order", Columns: []*dbschema.Column{&{Name: "order_limit_expiration", Type: "TIMESTAMP"}, &{Name: "order_limit_serialized", Type: "BLOB"}, &{Name: "order_serialized", Type: "BLOB"}, &{Name: "satellite_id", Type: "BLOB"}, &{Name: "serial_number", Type: "BLOB"}, &{Name: "uplink_cert_id", Type: "INTEGER", Reference: &dbschema.Reference{Table: "certificate", Column: "cert_id"}}}}},
  Indexes: []*dbschema.Index{
    &{Name: "idx_order_archived_at", Table: "order_archive_", Columns: []string{"archived_at"}},
    &{
      Name:    "idx_orders",
      Table:   "unsent_order",
      Columns: []string{"satellite_id", "serial_number"},
-     Unique:  true,
+     Unique:  false,
      Partial: "",
    },
  },
}

Let’s investigate first instead of going down that rabbit hole again.

Could you show the result of:

sqlite3 /path/to/orders.db .schema
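In case you haven’t used the sqlite3 shell before, here’s what that looks like against a small demo database (the demo db and its table are made up for illustration; on your node you’d point sqlite3 at the real orders.db). PRAGMA integrity_check is also worth running, since it reports corruption directly:

```shell
# Demo: inspect a database's schema and check its integrity.
# demo.db and its single table are placeholders, not the real node schema.
cd "$(mktemp -d)"
sqlite3 demo.db 'CREATE TABLE unsent_order (satellite_id BLOB, serial_number BLOB);'

sqlite3 demo.db .schema                    # prints the CREATE statements
sqlite3 demo.db 'PRAGMA integrity_check;'  # prints "ok" for a healthy db
```

If integrity_check reports errors rather than "ok", the file itself is damaged and recreating the schema by hand won’t be enough.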

I have already solved the problem, thanks)

Could you please share how so others may use the same method?

Just to be sure, I don’t recommend just starting over with new db files unless you have to. I don’t want to give people the idea that that is usually a good solution.

Hey @FoRTu,

I understand that this post was indeed the solution for your situation, but if you mark this one as the solution could you add something to the post to caution people that this is only a last resort if the data from the databases is already lost? In most cases it’s much better to fix the specific db file that’s having an issue and I wouldn’t want other SNOs just throwing away their db’s.

You are right @BrightSilence.

In my case the solution was to stop the node and remove all .db files from the storage folder. Then I started the node again and it regenerated all the .db files and started working fine like before.

The downside of removing the .db files is that the statistics and dashboard information is lost. It is preferable to restore the .db files one by one, as @BrightSilence said, instead of removing them all.

This worked in my case, but not all issues with the .db files are the same. Use this topic as useful information, not a definitive solution, and before following the steps I followed, explain your case to the community.
