When I try to start the container again I get the message: "Error: Error starting master database on storagenode: database: database heldamount does not exist stat config/storage/heldamount.db: no such file or directory"
Does anyone know what I need to do to get the nodes running again?
With the most recent release: create a config backup with mv config.yaml config.yaml.bak, run the container so it goes through the setup process and creates the missing dbs, stop the container, restore the config with mv config.yaml.bak config.yaml, and start the container again.
That should help!
So what command do I need to use to start the container again? When I follow your recipe and use docker run to start the container, I get an error: “container name already in use”.
Can you point me to step by step documentation on how to get this right? I had configured automatic upgrades and didn’t look back. Did I miss an announcement that automatic updates were broken or not supported anymore?
I now have more than 24 hours downtime and I feel this is through no fault of my own. This is quite frustrating.
You should be able to start the container by using docker start storagenode
You only need to use the docker run command when you have removed the container. Afterward, can you please show the result of docker ps -a so we can try to see why your automatic updates failed?
—edit— Step by step, following Yarolav’s procedure:
docker stop -t 300 storagenode
create config backup
mv config.yaml config.yaml.bak
run the container
docker start storagenode
Watch the logs to see when the container is done with the setup process
docker logs -f --tail 40 storagenode
Stop storagenode when (/if) it has finished the migration. Look for the startup uploads/downloads in the log.
docker stop -t 300 storagenode
If you never modified any of the config.yaml options, you do not need to restore the old config. If you did modify config.yaml options, restore the old config.yaml, or modify the new one.
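The config shuffle in the steps above is just two renames; here is a minimal sketch, assuming config.yaml sits directly in the node's config directory (the helper names are mine, not from the docs):

```shell
# Hedged sketch of the config backup/restore around the setup run.
# The function names are illustrative; the directory argument is a placeholder.
backup_config()  { mv "$1/config.yaml" "$1/config.yaml.bak"; }
restore_config() { mv "$1/config.yaml.bak" "$1/config.yaml"; }
# Usage:
# backup_config /path/to   # before letting the container re-run setup
# restore_config /path/to  # after stopping the container again
```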
Let’s start with one of the nodes. I am assuming they are all running on different machines. Can you please run docker ps -a
on the node you are currently working on?
When you see a sqlite> prompt execute this script:
CREATE TABLE versions (version int, commited_at text);
CREATE TABLE paystubs (
    period text NOT NULL,
    satellite_id bytea NOT NULL,
    created_at timestamp NOT NULL,
    codes text NOT NULL,
    usage_at_rest double precision NOT NULL,
    usage_get bigint NOT NULL,
    usage_put bigint NOT NULL,
    usage_get_repair bigint NOT NULL,
    usage_put_repair bigint NOT NULL,
    usage_get_audit bigint NOT NULL,
    comp_at_rest bigint NOT NULL,
    comp_get bigint NOT NULL,
    comp_put bigint NOT NULL,
    comp_get_repair bigint NOT NULL,
    comp_put_repair bigint NOT NULL,
    comp_get_audit bigint NOT NULL,
    surge_percent bigint NOT NULL,
    held bigint NOT NULL,
    owed bigint NOT NULL,
    disposed bigint NOT NULL,
    paid bigint NOT NULL,
    PRIMARY KEY ( period, satellite_id )
);
CREATE TABLE payments (
    id bigserial NOT NULL,
    created_at timestamp NOT NULL,
    satellite_id bytea NOT NULL,
    period text,
    amount bigint NOT NULL,
    receipt text,
    notes text,
    PRIMARY KEY ( id )
);
.exit
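If you prefer not to type at the sqlite> prompt, the same script can be applied non-interactively with the sqlite3 CLI. A minimal sketch, assuming sqlite3 is installed on the host; it is shown with only the versions table, so append the paystubs and payments statements from the script above to the heredoc in the same way:

```shell
# Hedged sketch: apply the schema non-interactively instead of using the
# interactive sqlite> prompt. The function name is illustrative.
create_heldamount() {
  sqlite3 "$1" <<'SQL'
CREATE TABLE versions (version int, commited_at text);
SQL
}
# Usage (placeholder path -- point it at your node's storage folder):
# create_heldamount /path/to/storage/heldamount.db
```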
Following your suggestions, I seem to get one step further. The error message is now:
Error: Error starting master database on storagenode: database: database pricing does not exist stat config/storage/pricing.db: no such file or directory
So heldamount.db now seems to be opened correctly. The issue is now pricing.db.
Could you please share the scripts needed to create this database and any others that I’ll probably be needing?
You should back up only the databases, not the blobs folder (because it holds almost all of your node’s data).
Moving the 11 databases should not take more than a few seconds.
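Moving the databases into a backup subfolder can be sketched like this (the function name and the "backup" subfolder layout are assumptions from this thread, not official docs):

```shell
# Hedged sketch: move every *.db file in the storage directory into a
# "backup" subfolder. Run only while the storagenode container is stopped.
backup_dbs() {
  storage="$1"
  mkdir -p "$storage/backup"
  for db in "$storage"/*.db; do
    [ -e "$db" ] && mv "$db" "$storage/backup/"
  done
}
# Usage: backup_dbs /path/to/storage
```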
I’ve backed up the databases for node-05 and node-06, and then I get my original error message: Error: Error starting master database on storagenode: database: database heldamount does not exist stat config/storage/heldamount.db: no such file or directory
So it appears that for some reason, the databases I need are not being created.
As I would have expected, docker ps -a gives:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
8f7740a5fabd storjlabs/storagenode:latest "/entrypoint" 4 minutes ago Exited (1) 2 minutes ago node-05
88b7ae6727df storjlabs/storagenode:latest "/entrypoint" 12 minutes ago Exited (1) 4 minutes ago node-06
bca89f7088e1 storjlabs/storagenode:beta "/entrypoint" 6 months ago Exited (1) 45 hours ago node-07
931ef584cfb9 storjlabs/storagenode:beta "/entrypoint" 6 months ago Exited (1) 45 hours ago node-04
4aa3de4ca0ad storjlabs/storagenode:beta "/entrypoint" 6 months ago Exited (1) 45 hours ago node-03
acd7e07a9698 storjlabs/storagenode:beta "/entrypoint" 6 months ago Exited (1) 45 hours ago node-02
e8de23cc398d storjlabs/storagenode:beta "/entrypoint" 6 months ago Exited (1) 45 hours ago node-01
This is what I would have expected, as I’ve been trying to solve the problem for node-05 and node-06 first before I mess with the other nodes. All the nodes have their own public IP address, but I moved them all to the same machine to save power when I noticed there was very light load on the machines.
If you follow the instructions, they will create the empty databases.
Stop the storagenode (only one)
Move its databases to backup subfolder
Move config.yaml to config.yaml.bak
Run a new temporary container with the command below (it takes a limited number of arguments, this is important!). Replace /path/to with the actual data path for that node and run the container:
docker run -it --rm -v /path/to:/app/config storjlabs/storagenode:latest
It will throw an error and exit.
Copy databases and config back from the backup to their places (use your paths!) with overwrite:
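The copy-back step can be sketched like this; a hedged example, assuming the backup from earlier sits in a "backup" subfolder next to the databases (the function name is mine):

```shell
# Hedged sketch: copy the backed-up databases back over the freshly created
# ones. The "backup" subfolder layout is an assumption from this thread.
restore_dbs() {
  storage="$1"
  cp -f "$storage"/backup/*.db "$storage"/
}
# Usage: restore_dbs /path/to/storage   # then also restore your config.yaml
```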
You should be able to see your audit score and suspension score on the dashboard.
Corrupted or lost databases do not prevent your node from passing audits. So, if your data is intact, there is nothing to worry about.
In the worst case, you only lose some local stats.