Is it possible to merge Databases?

Hi,

I have had a few instances where my DBs got currupted and then had to be repaired. During that time the StorJ node has to be offline to prevent any loss of statistics.

Is there a way to just combine 2 Databases at the end to save downtime?

Lets take bandwidth.db as an example:

  1. bandwidth.db is currupted.
  2. Storagenode is shut down and Database saved somewhere else to be repaired.
  3. New bandwidth.db ist created and node started back up
  4. Repair is slowly performend over many many hours while node is running
  5. Once Repair is done shutdown node. Take both bandwidth.dbs and run a merge command
  6. Put the resulting bandwidth.db in the correct place and start node back up.

Due to the way smaller size of the new database as its just a few hours or maybe 1-2 days old it should decrease downtime by a lot.
my currupt database is currently more than 3GB big and repairing it is taking many many hours even in a ramdisk that I have set up.

I currently have the following Idea but am no expert on this topic and don’t just wanna experiment all willy nilly and destroy something:

use sqlite3 for both the repaired database and the new one to create a dump_all_1 and dump_all_2:

sqlite3 ./bandwidth_repaired.db
.mode insert
.output /storage/dump_all_1.sql
.dump
.exit

sqlite3 ./bandwidth_new.db
.mode insert
.output /storage/dump_all_2.sql
.dump
.exit

then use:

sqlite3 /storage/bandwidth.db ".read /storage/dump_all_1.sql"

and

sqlite3 /storage/bandwidth.db ".read /storage/dump_all_2.sql"

that should in theory just add both together right? Or is there something I am missing?

AFAIU technical possibility exist, but it’s not going to be as simple as what you suggest—you will end up with duplicate entries for the rollup table, which you’d need to sum up together from both instances of the database instead of just having side by side.

Other database files will probably also need some special treatment.

In the end, it’s not rocket science as long as you know the schema.

1 Like

Aren’t the databases disposable? Like you can delete them and they’ll be recreated and your local stats will just be off until the end of the month (when they reset)?

2 Likes

There is also the bandwidth shown in the “payout” tab under “Download” / “Repair/Audit”.
You can look up those values for older months.

If this “bandwidth.db” really only saves the info for that bar of bandwidth on the main screen for the current month then I wouldn’t mind. But I would like to have accurate values on the payout screen.

Could you maybe tell me where the values from the payout screen come from?

If you mean the Payout History - it’s received from the satellites as a receipt after the payout is completed.
The detailed payout screen with a storage and bandwidth usage in the selected period is taken from your local databases, and they unlikely will be back ported from the satellites.

Answering on your initial question - yes, it’s possible, but you need to import data with skipping duplicates. If you want to - you may try, for example: SQLite Forum: Avoiding adding duplicate entries

1 Like

Thanks for the explanation!

1 Like