This node has 8 TB of data. I moved it into a new case and swapped some drives around, and now when starting I get this error:
Error starting master database on storagenode: database: database piece_spaced_used does not exist stat config/storage/piece_spaced_used.db: no such file or directory
Boot log:
storagenode | 2020-09-13T20:26:18.453Z INFO Configuration loaded {"Location": "/app/config/config.yaml"}
storagenode | 2020-09-13T20:26:18.467Z INFO Operator email {"Address": "CENSOR"}
storagenode | 2020-09-13T20:26:18.467Z INFO Operator wallet {"Address": "0xCENSOR"}
storagenode | Error: Error starting master database on storagenode: database: database piece_spaced_used does not exist stat config/storage/piece_spaced_used.db: no such file or directory
storagenode | storj.io/storj/storagenode/storagenodedb.(*DB).openExistingDatabase:344
storagenode | storj.io/storj/storagenode/storagenodedb.(*DB).openDatabases:313
storagenode | storj.io/storj/storagenode/storagenodedb.Open:245
storagenode | main.cmdRun:151
storagenode | storj.io/private/process.cleanup.func1.4:353
storagenode | storj.io/private/process.cleanup.func1:371
storagenode | github.com/spf13/cobra.(*Command).execute:840
storagenode | github.com/spf13/cobra.(*Command).ExecuteC:945
storagenode | github.com/spf13/cobra.(*Command).Execute:885
storagenode | storj.io/private/process.ExecWithCustomConfig:88
storagenode | storj.io/private/process.ExecCustomDebug:70
storagenode | main.main:330
storagenode | runtime.main:203
But this really shouldn’t happen in normal operation. Fixing such things can hide underlying problems with the node’s setup or operation, which means bigger problems can go unnoticed for longer.
In general it’s probably not a good idea to automatically fix things when the cause is unknown.
Thanks, I ran that command to create the database. It seems I’m missing other databases as well.
When I compare the files in storage to what’s in my other nodes, it seems I’m missing many databases.
Here’s a list of what’s missing:
piece_spaced_used.db (fixed with the above query)
notifications.db
pricing.db
reputation.db
satellites.db
storage_usage.db
Is it even recoverable at this point?
Maybe there should be a place in the docs where these queries are stored, so all the databases can be recreated.
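For reference, the creation step is essentially just making a valid empty SQLite file for the node to fill in. A minimal sketch, assuming the databases live under config/storage as in my setup:

```sh
# Running VACUUM against a nonexistent file makes sqlite3
# write out a new, valid (but empty) database in its place.
sqlite3 config/storage/piece_spaced_used.db "VACUUM;"
```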
If you are missing more than one database, my guess is that you have entered the wrong path in your docker run command. Please verify the contents of the path exactly as it is written in that command.
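For example, something like this (the host path here is only a placeholder; use exactly what appears on the left side of the -v mount in your own command):

```sh
# /mnt/storagenode is a placeholder -- substitute your mounted host path.
# The databases should sit in the storage/ subdirectory of it.
ls -l /mnt/storagenode/storage/*.db
```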
My path is correct. I’m using docker-compose, and it’s the same file I’ve used since before the upgrade. The drive mount path hasn’t changed either. The only causes I can think of are corruption from an improper shutdown, or the node attempting to start on the wrong drive at some point during the upgrade (the latter is unlikely).
I then dropped all the tables it complained about until the node started running again: sqlite3> DROP TABLE table_name;
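Spelled out, each drop looked roughly like this (a sketch; the database file and table name are placeholders taken from whatever the node’s error message mentions):

```sh
# <database> and <table_name> come from the node's error output
sqlite3 config/storage/<database>.db "DROP TABLE <table_name>;"
```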
As expected, all the information in the dashboard is empty. Can I expect my node’s reputation to suffer? Will I be disqualified for having empty databases?
No, that’s a fine approach if the data was already missing. It’s obviously not ideal, since you’re definitely missing stats for earnings calculators, and the dashboard will show incorrect information. Some of that will be corrected automatically, like current space usage and reputation information, which will either be recalculated or retrieved from the satellites. But historic space usage and notifications will likely not be recovered.
Luckily, all data in your databases is non-essential, and you could even delete all databases and your node would still operate just fine. (If there are no databases at all, the node will recreate new empty ones automatically, but this is obviously something you want to avoid.)
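If anyone does go that route, a cautious sketch (container name and host path are assumptions from a default docker setup; moving the files instead of deleting them keeps a way back):

```sh
docker stop -t 300 storagenode            # give the node time to shut down cleanly
mkdir -p /mnt/storagenode/db-backup       # placeholder backup location
mv /mnt/storagenode/storage/*.db /mnt/storagenode/db-backup/
docker start storagenode                  # empty databases are recreated on startup
```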
Edit: I just saw your update that it was disqualified. This is not due to the databases being gone, as the node doesn’t need them to succeed at audits. Most likely whatever caused half your databases to disappear also impacted the piece data on your node. So I guess you have some work to do to figure out why data is disappearing.
That’s good to know. So far it’s only been disqualified on one satellite. There are several other satellites, so I’ll just wait and see what’s up. The drive is still full of data pieces, so they’re not completely gone. I’ll report back if any more satellites disqualify me.
Edit: the node was offline for over 24 hours, maybe that was enough to disqualify it?
Lots of GET and GET_REPAIR actions. About 1 in 30 GET actions is failing with “file does not exist”. The failed actions are happening with many satellites, but those same satellites are also getting successful actions. Hopefully it will sort itself out without too much of a reputation ding.
Only the GET_AUDIT lines count towards your reputation, but it looks like you have a lot of missing files. I’m afraid your node will likely not survive that; it’s probably just a matter of time for the other satellites.
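A quick way to check, assuming the default container name and log format (failed GET_AUDIT lines point at pieces the node could not serve):

```sh
# Show the most recent failed audit lines from the node's logs
docker logs storagenode 2>&1 | grep GET_AUDIT | grep failed | tail -n 20
```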
So it’s been about a month; here’s an update. My node was disqualified on 2 satellites because it failed audits. However, the audit score is still 100% on the other 4 satellites.
My question is how I should proceed. Will the 2 satellites I’ve been disqualified on eventually take me back? Will they be replaced by other satellites? Am I better off killing the node and starting fresh, or should I keep it in service for the 4 satellites that still trust my node?
There may be new satellites at some point, but they will not replace the disqualified ones; those disqualifications will remain.
That is up to you. However, if you leave it running, you will still be paid for the data and traffic of customers on the remaining satellites.
Also, you can invoke a Graceful Exit on the remaining satellites once your node is eligible to do so (currently after 6 months in the network; under the initial terms it was 15 months).
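For reference, Graceful Exit is started from inside the container, roughly like this (a sketch based on the documented command; note that it is irreversible and will prompt for which satellites to exit):

```sh
# Run the exit-satellite command inside the running storagenode container
docker exec -it storagenode /app/storagenode exit-satellite \
  --config-dir /app/config --identity-dir /app/identity
```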