Suggestion: script to rebuild info.db -- somewhat

I found myself in a situation where my node was broken due to a corrupt info.db file, and the instructions here put me in that last category: your node is lost. The suggested course of action is to abandon the node.

The problem is that by starting over you lose your reputation and any tokens being held in escrow, since the NEW node you set up will be starting from scratch.

Working with @Alexey, I discovered that if you remove your info.db (and corresponding WAL file, etc), the storagenode will create a new one and continue to use the existing blob storage directory.

Audits appear to succeed, but the ingress/egress numbers as well as the total storage are lost (they start from 0). This means you could go OVER your storage/bandwidth allotment.

Over time, I suspect, the info.db could rebuild itself (maybe?) as audits and GETs are performed, but I know I’m reaching here.

What I’m suggesting instead is that if your metadata SQLite DB is broken beyond repair, there be a script which could scan the blob directory and build a new metadata store (rather than starting from zero). That way at least the storage used vs. total would be accurate (and the ingress/egress would fix itself when the month rolled over). Otherwise you have to guess and lower your total storage number to account for the uncounted pieces you had stored up until now, and it never self-corrects.
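To make the idea concrete, here is a minimal sketch of what such a rebuild script could look like. The table name, columns, and directory layout below are assumptions for illustration, not the real storagenode schema, so treat it as a starting point rather than a working recovery tool:

```python
#!/usr/bin/env python3
"""Sketch: rebuild a minimal piece-metadata table by walking the blob directory.

Assumptions (not the real storagenode schema): one top-level folder per
satellite, one file per piece, and a simplified table holding only what is
needed to get the used-space total right again.
"""
import os
import sqlite3
import sys


def rebuild(blob_dir: str, db_path: str) -> None:
    con = sqlite3.connect(db_path)
    con.execute(
        """CREATE TABLE IF NOT EXISTS pieceinfo_rebuilt (
               satellite_folder TEXT,     -- assumed layout: blobs/<satellite>/...
               piece_file       TEXT,     -- file name derived from the piece ID
               piece_size       INTEGER,  -- size on disk, feeds the storage total
               PRIMARY KEY (satellite_folder, piece_file)
           )"""
    )
    total = 0
    for root, _dirs, files in os.walk(blob_dir):
        satellite = os.path.relpath(root, blob_dir).split(os.sep)[0]
        for name in files:
            size = os.path.getsize(os.path.join(root, name))
            total += size
            con.execute(
                "INSERT OR REPLACE INTO pieceinfo_rebuilt VALUES (?, ?, ?)",
                (satellite, name, size),
            )
    con.commit()
    con.close()
    print(f"Indexed {total} bytes of pieces into {db_path}")


if __name__ == "__main__":
    rebuild(sys.argv[1], sys.argv[2])  # e.g. rebuild <blobs dir> <new db path>
```

The important part is only that used space gets recomputed from what is actually on disk; anything that can’t be reconstructed from the blobs (ingress/egress history) would still start from zero.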

Thoughts? Seems possible…

1 Like

Definitely possible, and certainly worthwhile. We will probably wait to start work on that until the storage system is a little less in flux (otherwise we might have to design the rebuild script all over again). Or, an even better potential outcome would be for us to rework the SN software so that no database file is necessary.

5 Likes

tagging @brandon on this suggestion

Just a short preview of what the developers are planning right now.

Why do we store the metadata in a database? In the end the table will still contain over one million rows. Instead we should just add the metadata to the file we write on disk. If a download request comes in, we only have to search for the pieceID on disk. If we find the file, we can read the metadata and process everything else as before.
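As a hedged illustration of that idea (not the actual storagenode implementation), the metadata could live in a small sidecar file next to each piece, so serving a download becomes a path lookup plus a read instead of a database query:

```python
import json
import os


def store_piece(storage_dir: str, piece_id: str, data: bytes, metadata: dict) -> None:
    """Write the piece and keep its metadata right next to it on disk."""
    path = os.path.join(storage_dir, piece_id)
    with open(path, "wb") as f:
        f.write(data)
    with open(path + ".meta", "w") as f:   # assumed sidecar naming, for illustration
        json.dump(metadata, f)


def serve_piece(storage_dir: str, piece_id: str):
    """'Searching' for the pieceID is just checking whether the file exists."""
    path = os.path.join(storage_dir, piece_id)
    if not os.path.exists(path):
        return None                        # piece is not stored here
    with open(path + ".meta") as f:
        metadata = json.load(f)
    with open(path, "rb") as f:
        return metadata, f.read()
```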

Do we get rid of the info.db? No. We still need to track orders and stuff like that.

What happens if the info.db gets corrupted? The storage node will not submit the orders of the last hour. Storage nodes with a bandwidth cap will lose the information about how much bandwidth was used so far. Besides that, the storage node would still find the data and metadata on disk. No audit failures. Great idea, isn’t it?

This is only a design draft. I can’t guarantee that it will be implemented and how long that will take.

4 Likes

I like the idea. At the moment the info.db is used for more than just that. For example, an SNO has a 1TB disk and 600GB is filled up, but his info.db got corrupted (“is not a database”, for example). Now, if he removes it, he will get another problem: not enough space on the disk (because a storagenode with an empty info.db can’t recognize its “own” data and “thinks” there is only 400GB available).
Can this solution solve this problem too?

1 Like

Mismatches between actual disk use and the disk use reported by info.db have already been noticed by SNOs. Would it not be better to just look at du on node start and count from there in RAM? That would save a write to info.db as well. The upside would be that the node is better at preventing itself from using more space than is available. The downside is that the numbers on the dashboard may include unpaid garbage data.
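A hedged sketch of that accounting idea (hypothetical code, not how the storagenode does it today): walk the blob directory once at startup, keep the running total in memory, and adjust it on every upload and delete instead of persisting usage to info.db:

```python
import os
import threading


class SpaceCounter:
    """Walk the blob directory once at start (the 'du' step), then track
    used space in RAM. Purely illustrative; names and behaviour are assumed."""

    def __init__(self, blob_dir: str, allotted_bytes: int):
        self._lock = threading.Lock()
        self.allotted = allotted_bytes
        self.used = sum(
            os.path.getsize(os.path.join(root, name))
            for root, _dirs, files in os.walk(blob_dir)
            for name in files
        )

    def try_reserve(self, size: int) -> bool:
        """Accept an upload only if it fits within the allotment."""
        with self._lock:
            if self.used + size > self.allotted:
                return False
            self.used += size
            return True

    def release(self, size: int) -> None:
        """Called when a piece is deleted or an upload is aborted."""
        with self._lock:
            self.used -= size
```

As the reply below points out, though, that one-time startup walk can itself be expensive on spinning disks.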

Of course these changes would throw a wrench into how my earnings calculator works as well, but that’s not a primary concern. I assume better alternatives will be available natively soon anyway.

Firstly, du would only work while Docker is used; Windows doesn’t have du either (of course, there is an equivalent system function).
We used du in the past (in v2). It leads to long waits while the system calculates disk usage; on a Raspberry Pi 3 it took more than a few hours. We ended up not calculating it on start.

1 Like

Oof, I’ve been schooled. :smile:
Having a large SSD cache is blinding me from what might take long on spinning disks.

Perhaps it could be a fallback to rebuild the pieces table from the file metadata. It would take a long time once, but that’s a lot better than losing that data altogether.

I think the number of orders a node can generate in an hour or a few hours is limited too. Why not generate a file for each order and delete it once it is sent? Then there would be no need for a database at all.
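A minimal sketch of that file-per-order idea (hypothetical directory name and format, not what the storagenode actually does):

```python
import json
import os
import uuid

ORDERS_DIR = "unsent-orders"   # assumed location, for illustration only


def save_order(order: dict) -> str:
    """Write each unsent order as its own small file instead of a DB row."""
    os.makedirs(ORDERS_DIR, exist_ok=True)
    path = os.path.join(ORDERS_DIR, f"{uuid.uuid4()}.json")
    with open(path, "w") as f:
        json.dump(order, f)
    return path


def submit_pending(send) -> None:
    """Send every pending order; delete its file only after a successful send."""
    for name in os.listdir(ORDERS_DIR):
        path = os.path.join(ORDERS_DIR, name)
        with open(path) as f:
            order = json.load(f)
        send(order)       # caller-provided submit function
        os.remove(path)
```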

A database is great for keeping things together and making many queries over the same dataset fast, but is this really needed?

If the piece is saved under its hash name, there is no need to search for it. If the structure is clear, there is no costly search: just access it, and if that fails, it is not there.

The benefit would be that the OS takes care of performance; almost all spinning disks do NCQ. I would prefer to handle all SN data as files; maybe a DB makes sense for some statistics.

3 Likes

I don’t know. That is a different topic. Let’s focus on the pieceinfo table first, and later we can talk about what’s left in the database and how much impact it has.

The structure will be clear. I just don’t know yet how it will look.

Put all the metadata in files near the data. The table only needs a pieceID for pieces with an expiration date, so they can be found quickly for deletion without going over all the metadata files.
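A sketch of that split, assuming the sidecar layout from earlier in the thread and made-up table/column names: the database keeps only (pieceID, expiration) for pieces that actually expire, so cleanup never has to read every metadata file.

```python
import sqlite3
import time


def record_expiration(con: sqlite3.Connection, piece_id: str, expires_at: int) -> None:
    """Index only the pieces that have an expiration date."""
    con.execute(
        "CREATE TABLE IF NOT EXISTS piece_expirations ("
        "piece_id TEXT PRIMARY KEY, expires_at INTEGER)"
    )
    con.execute(
        "INSERT OR REPLACE INTO piece_expirations VALUES (?, ?)",
        (piece_id, expires_at),
    )
    con.commit()


def expired_piece_ids(con: sqlite3.Connection) -> list:
    """Find pieces ready for deletion without touching the metadata files."""
    now = int(time.time())
    rows = con.execute(
        "SELECT piece_id FROM piece_expirations WHERE expires_at <= ?", (now,)
    ).fetchall()
    return [row[0] for row in rows]
```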

That is the current plan…

The town hall indicated the upcoming Q3 release will remove the ability to do a network wipe. If this is true, then a change to the file structure to add this metadata needs to happen soon. Otherwise you’ll need to engineer a local migration, which I guess is possible. Just thinking out loud.

Anyway, anything to make it so info.db can be ephemeral/recoverable is welcome. Looking forward to an update!

There is a third option: store new data in the new format but keep supporting the old format. Keep that for a few months and at some point migrate the remaining files.
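One way the read path could look during such a transition (purely a sketch; the formats, the `pieceinfo_old` table, and the sidecar naming are all assumptions): new writes go to the new layout, reads try the new layout first and fall back to the old one, and a background migration cleans up whatever remains.

```python
import json
import os
import sqlite3


def read_piece(storage_dir: str, old_db: sqlite3.Connection, piece_id: str):
    """Dual-format read path: prefer the new on-disk metadata, fall back to
    the old database-backed lookup. Names and layouts are illustrative only."""
    path = os.path.join(storage_dir, piece_id)
    meta_path = path + ".meta"
    if os.path.exists(meta_path):          # new format: metadata lives next to the piece
        with open(meta_path) as f:
            metadata = json.load(f)
    else:                                  # old format: metadata still in the database
        row = old_db.execute(
            "SELECT metadata FROM pieceinfo_old WHERE piece_id = ?", (piece_id,)
        ).fetchone()
        if row is None:
            return None
        metadata = json.loads(row[0])
    with open(path, "rb") as f:
        return metadata, f.read()
```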