Storagenode Recovery Mode

jammerdan · May 17, 2020, 8:35am

Right. This way it makes sense at least in a technical way.

Alexey · May 17, 2020, 4:10pm

You are right, the current system will download a piece to check its hash.
I tried to think about how it could be extended to a block of data.

LarsOS · May 17, 2020, 5:39pm

What if the satellite, instead of storing a hash for each data block, stored a list of salted hashes for every data block, keeping the salts secret?

When a node needs to reconfirm a piece of data, the satellite reveals one of the salts to the node. The node then must use the salt with the data block, compute a hash out of that, and return it to the satellite, which then can compare the salted hash from the node with its own salted hash.

That way, the node cannot just pre-compute a hash and delete the data; it will need to have the actual data in order to combine it with the hitherto unknown salt and create a valid hash.
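The salted-hash idea above can be sketched roughly as follows. This is only an illustration of the proposal, not how Storj audits actually work; the `Satellite` class, the `salted_hash` helper, the choice of SHA-256, and the 16-byte salt size are all assumptions made for the example.

```python
import hashlib
import secrets


def salted_hash(salt: bytes, data: bytes) -> str:
    """Hash the salt followed by the data block (SHA-256 is an assumption)."""
    return hashlib.sha256(salt + data).hexdigest()


class Satellite:
    """Hypothetical satellite that keeps secret salts and their expected hashes."""

    def __init__(self, data: bytes, num_challenges: int = 4):
        # Precompute (salt, expected hash) pairs while the data is still at hand.
        self._unused = []
        for _ in range(num_challenges):
            salt = secrets.token_bytes(16)
            self._unused.append((salt, salted_hash(salt, data)))
        self._pending = {}

    def next_challenge(self) -> bytes:
        # Reveal one previously secret salt; each salt is used only once.
        salt, expected = self._unused.pop()
        self._pending[salt] = expected
        return salt

    def verify(self, salt: bytes, response: str) -> bool:
        # Compare the node's answer with the hash stored for this salt.
        expected = self._pending.pop(salt)
        return secrets.compare_digest(expected, response)


block = b"example data block"
satellite = Satellite(block)

# A node that still holds the data can answer the challenge.
salt = satellite.next_challenge()
honest_ok = satellite.verify(salt, salted_hash(salt, block))

# A node that deleted the data has nothing valid to hash with the new salt.
salt = satellite.next_challenge()
cheater_ok = satellite.verify(salt, salted_hash(salt, b""))
```

One consequence of this sketch is that the satellite can only issue as many challenges per block as it has stored salts, so the storage cost grows with the number of audits it wants to be able to perform.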

Alexey · May 17, 2020, 5:47pm

You can read details regarding audits here:

LarsOS · May 17, 2020, 5:58pm

Wow much clever. Thanks for the link.

Pentium100 · May 17, 2020, 6:57pm

If I got it right, an audit means the satellite downloads all pieces of the segment and checks to see if any of them are corrupted.

What if a hash of every piece was stored? The satellite could download a piece from my node, check its hash against the database and know if my node has corrupted the data or not.
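A minimal sketch of that per-piece check, assuming the satellite keeps a mapping from piece ID to hash. The names `piece_hash` and `audit_piece`, the dictionary layout, and the use of SHA-256 are hypothetical and only illustrate the suggestion.

```python
import hashlib


def piece_hash(piece: bytes) -> str:
    """Hash of a single piece (SHA-256 chosen for illustration)."""
    return hashlib.sha256(piece).hexdigest()


def audit_piece(expected_hashes: dict, piece_id: str, downloaded: bytes) -> bool:
    """Return True if the downloaded piece matches the hash recorded at upload."""
    return piece_hash(downloaded) == expected_hashes[piece_id]


# The satellite records the hash when the piece is uploaded...
database = {"piece-1": piece_hash(b"original piece contents")}

# ...and later compares it with whatever the node returns.
intact = audit_piece(database, "piece-1", b"original piece contents")
corrupted = audit_piece(database, "piece-1", b"original piece c0ntents")
```

The check involves only the audited node and the satellite's database, which is the point of the suggestion: no other pieces of the segment need to be downloaded.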

jammerdan · February 22, 2021, 8:21am

I’d still support the idea of node recovery, for two reasons:

1. If a SNO has to start over, he must decide whether it is worth it, and he may leave forever. That is not desirable.
2. The life of a node depends on an HDD, and any HDD will fail at some point. So in a sense we are all in the same boat, and sooner or later everyone will have to deal with a lost node.

These are my main reasons for believing there should be a way to limit the damage to a SNO who has done nothing wrong but whose hard disk fails, so that he does not have to start over again at 0.

Maybe once repair jobs get cheaper, a way can be found, or there could be a way for nodes to help each other out directly.

Alexey · February 22, 2021, 10:09am

The node cannot be a repair worker at the moment, because that requires trust between the satellite and the node, and we are not there yet.
I think the repair worker will not be combined with the storagenode in the foreseeable future, but it could be a separate service and maybe rewarded, just not now.

jammerdan · February 22, 2021, 10:25am

Yes, I see that. The problem is cost and trust. But costs could be reduced:

1. A new traffic type, something like graceful “recover”, with lower or no egress costs for Storj Labs.
2. As @littleskunk has mentioned, moving the repair service to hosts with lower upload costs.
[3. Direct repair between nodes (maybe some day, maybe never)]

But with 1 and 2 alone, the price for repair could be set much lower than today, and a SNO who needs to recover could decide whether it is worth it for him to pay that price.

(Again: such a service should only be offered to help SNOs recover who have done nothing wrong and experienced an HDD crash, which can happen to any of us at any time. So some kind of deterrent against intentionally crashing a node should apply.)

And to avoid the hassles that @BrightSilence has laid out with partial recovery, maybe only full recovery should be allowed?

Alexey · February 22, 2021, 10:57am

There is only one problem remaining: neither the node nor the satellite knows which pieces are lost. To figure that out, the satellite must do a full audit of all data on your node, for the same cost as repair ($10/TB).
But in the case of an audit there is no guarantee that the data is intact, and not a single piece will be repaired.
So for the network it is much cheaper to just repair missing pieces when their number falls below the threshold.

jammerdan · February 22, 2021, 11:19am

I agree. If neither node nor satellite nor anybody else knows what pieces have been lost, then it is difficult to recover them.

Pentium100 · February 22, 2021, 11:30am

There could be a way to know that:

1. If the drive fails or something similar happens, you can assume that all pieces on that node have been lost.
2. If only some pieces were lost (bad sectors, or the node was restored from a backup), then it could go something like this: the satellite sends the node a list of the files the node should have, the node checks whether those files exist and reports back a list of the files it does not have. After that, the satellite repairs those files and audits the rest.
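The reconciliation step in the second case can be sketched as a simple set comparison. The function names and the shape of the result are hypothetical; note that the "missing" list is self-reported by the node, so in this sketch the audit of the remaining pieces is the only thing that checks the report.

```python
def missing_pieces(expected_ids, stored_ids):
    """Piece IDs the satellite expects but the node cannot find on disk."""
    return sorted(set(expected_ids) - set(stored_ids))


def plan_recovery(expected_ids, stored_ids):
    """Split the expected pieces into those to repair and those to audit."""
    missing = missing_pieces(expected_ids, stored_ids)
    present = sorted(set(expected_ids) & set(stored_ids))
    return {"repair": missing, "audit": present}


# The satellite's list versus what the node actually finds on its disk.
expected = ["piece-a", "piece-b", "piece-c", "piece-d"]
on_disk = ["piece-a", "piece-c"]

plan = plan_recovery(expected, on_disk)
```

Here `plan["repair"]` holds the pieces the node reported as missing and `plan["audit"]` holds the pieces it claims to still have.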

However, it also looks to me that if the satellite stored the hash of every piece, it could audit a node without involving other nodes, thus saving on bandwidth in these cases.

Alexey · February 22, 2021, 11:45am

The satellite does not know which pieces a specific node should have. Extracting them from the segments (a segment is the minimum unit of data used by the satellite) is a heavy request. And if the satellite sends me a list of the pieces I should have, I’ll modify the storagenode software to say “yes, I have all of them”.

It can’t save costs. To audit the node, it must download all the pieces.

Pentium100 · February 22, 2021, 12:34pm

And then the result would be exactly the same as right now: the node will continue operating until it fails enough audits and is disqualified.

The bigger problem, IMO, is if it is not possible to get a list of the pieces a node should have (and their hashes, so that a single piece can be verified without reassembling the segment).

jammerdan · February 22, 2021, 12:41pm

Is it not in the databases on the node?

Pentium100 · February 22, 2021, 12:43pm

We are talking about a situation where the database could be corrupted or outdated.

jammerdan · February 22, 2021, 1:00pm

I wonder if this is necessarily the case?
My scenario was one where, for example, the database is on another disk, like an SSD, and only the data on the data disk dies. Would the database remain good in such a case, and would a good database then make it possible to retrieve the required list of pieces for recovery?
If so, then better protecting the database from becoming corrupted or outdated might be a good step forward.

Alexey · February 22, 2021, 2:47pm

No. The satellite will not trust a node that managed to lose pieces. So this information is useless for audit and repair purposes.

sembeth · February 22, 2021, 6:09pm

I am glad that my old idea got some attention again.

First of all, I hope you are all well.

May I ask what the lifespan is of a storagenode that does not run RAID? Like the many Raspberry Pis and others.

Shall we buy a Raspberry Pi and an HDD and run it until:

1. The HDD fails completely. Do we buy another HDD and start from 0?
2. The HDD gets many damaged sectors and we fail the audits. Do we wait until the node is disqualified and then start from 0?

If I have a node that brings in 20 every month and it fails partly or completely, I would like to buy a new HDD, transfer all readable files to the new drive, do a full scan of all files, and re-download the missing or corrupt ones. I would even pay for it. Because if I have to start from 0 with the node, it takes time to get vetted and to receive data again. Getting back to 20 would take months.

Yes, you can monitor your hard drive’s stats and read SMART data. But hard drives fail at random. You can’t predict when one will fail completely. Otherwise, you wouldn’t need RAID and backups.

Toyoo · February 22, 2021, 8:24pm

(emphasis mine) So, are there plans to introduce a protocol to maintain this kind of trust? Maybe instead of the satellite auditing chunks at random, the satellite would audit whether nodes are correctly auditing. Like:

Satellite tells a node to audit these 100 chunks over the next day. The node gives the information back as soon as it’s completed.
Satellite then selects 5 of those chunks at random, audits them, and if there is mismatch in results between satellite audit and node’s report, the node is considered untrustworthy.

20× reduction of satellite audit traffic. Same for repairs.
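A rough sketch of that spot-check, with the numbers from the post (100 chunks audited by the node, 5 re-audited by the satellite). The function names and the way results are represented are hypothetical. The second function is an addition for illustration: it computes the chance that a sample catches at least one false entry, using the standard formula for sampling without replacement.

```python
import math
import random


def spot_check(node_report, satellite_audit, sample_size=5, rng=None):
    """Re-audit a random sample and compare it with the node's own report.

    node_report: dict mapping chunk ID -> audit result claimed by the node.
    satellite_audit: callable returning the satellite's own result for a chunk.
    Returns True if every sampled result matches the node's report.
    """
    rng = rng or random.Random()
    sample = rng.sample(sorted(node_report), sample_size)
    return all(node_report[chunk] == satellite_audit(chunk) for chunk in sample)


def detection_probability(total, sampled, falsified):
    """Chance that a random sample contains at least one falsified result."""
    clean_samples = math.comb(total - falsified, sampled)
    return 1 - clean_samples / math.comb(total, sampled)


# The satellite audits 5 chunks instead of 100.
traffic_reduction = 100 / 5

# Chance of catching a node that falsified 1, 10, or 50 of its 100 results.
caught = [detection_probability(100, 5, k) for k in (1, 10, 50)]
```

With these numbers the traffic reduction is the 20× from the post, while a node that falsifies a single result out of 100 is caught in a given round with probability 5/100. Detection of small amounts of cheating therefore relies on repeated rounds.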

Judging from Backblaze data, if a drive is kept in proper conditions, its average lifespan will be longer than the period over which it is worth maintaining.