Storagenode Recovery Mode

This is very interesting to read. So I want to add my thoughts:

  1. From Storj's perspective, a node that has proven to be unreliable can no longer be trusted. The SNO can still fire up a new node, and its environment can be totally different and reliable in the future. Starting over is also a form of long-term commitment, which makes it a self-selection process: an SNO who chooses to enter this commitment again is rather trustworthy. It is the node that is not trusted, not the SNO.

  2. From the perspective of other nodes, especially new ones, I can understand if they favor a drop-out of unreliable nodes. If a node loses 10 TB of data, this data will get redistributed, so they might get a piece of this cake. I am also thinking of the case where the satellite has already started to redistribute data to other nodes and the failed node comes back online with “his” data pieces. As I understand it, there are then suddenly more pieces of a file in the network than needed. What does that mean? It means a lower chance for other (reliable) SNOs to get a download. And maybe (but that I don’t know) the satellite even deletes excess pieces. So I don’t know if this is a fair procedure.

  3. From the perspective of the SNO, I can fully understand the desire to recover data. But I am not sure if local backups are the solution. First, they require space that could be used for a node. Second, they require maintenance and resources to verify data integrity whenever data gets recovered from a local backup. Third, it sounds really silly: Tardigrade is advertised as a secure, redundant online data store, and its nodes keep offline backups. They don’t trust Tardigrade?

That said, I believe the only way to recover a node would be to treat the node as a customer: a node that wants to recover data must download it from the Tardigrade network like a regular customer. And for data integrity, he must download his entire data. So if he wants to restore a 10 TB node, he must download “his” complete 10 TB and of course pay for it like a regular customer. (Consider it the opposite of a graceful exit: a graceful entry.)
This solves many problems:

  1. Recovery does not put cost on Storj
  2. Other SNOs will be happy for every recovery because they get paid for downloads
  3. The SNO can recover a node and get his amount of data back instead of waiting a long time
  4. The high cost prevents SNOs from abusing the system through multiple cheap recoveries, so it is still in their interest to run a reliable node.
  5. Node data can be trusted.
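
To put a rough number on the cost argument, here is a minimal sketch of what a "graceful entry" would cost the SNO. The function name and the egress price are assumptions made up for illustration; the price is a placeholder, not an official Tardigrade rate.

```python
def recovery_cost_usd(node_size_tb: float, egress_price_per_tb_usd: float) -> float:
    """Cost of re-downloading a node's full data set at customer egress rates."""
    if node_size_tb < 0 or egress_price_per_tb_usd < 0:
        raise ValueError("size and price must be non-negative")
    return node_size_tb * egress_price_per_tb_usd


# Hypothetical egress price, chosen only for this example.
ASSUMED_EGRESS_PRICE_PER_TB_USD = 45.0

if __name__ == "__main__":
    cost = recovery_cost_usd(10, ASSUMED_EGRESS_PRICE_PER_TB_USD)
    print(f"Restoring a 10 TB node would cost about ${cost:.2f}")
```

The cost scales linearly with node size, so under this assumption a large node is expensive to restore, which is exactly what would discourage repeated cheap recoveries.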

Maybe the satellites must do some computations to account for pieces that have already been redistributed to other reliable nodes. (Meaning that pieces which have already been redistributed might not get restored on the node being recovered.)
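
The computation described above could be as simple as a set difference. This is only a sketch: the piece IDs and the function name are hypothetical, and it assumes the satellite knows both which pieces the node held before the failure and which of them have since been repaired onto other nodes.

```python
from typing import Set


def pieces_to_restore(held_before_failure: Set[str], already_repaired: Set[str]) -> Set[str]:
    """Return the pieces the recovering node should download again.

    Pieces that were already repaired onto other nodes are skipped, so the
    network does not end up with more pieces of a file than needed.
    """
    return held_before_failure - already_repaired


if __name__ == "__main__":
    held = {"piece-a", "piece-b", "piece-c", "piece-d"}  # hypothetical piece IDs
    repaired = {"piece-b", "piece-d"}
    print(sorted(pieces_to_restore(held, repaired)))
```

With this approach the recovered node ends up smaller than before the failure, but the reliable nodes that received the repaired pieces keep them.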

The same concept could be used for moving nodes. I always thought it a bit silly that when you move a node from one place to another, you have to move the data as well. With a recovery mode, you would simply move the identity and metadata and download the pieces from the network. At least you could choose to do so.

There is also a final thought: I am aware that data loss can happen at any time, so every SNO can be affected. It could happen right now that my HD crashes and dies. Given that, I wonder whether repair traffic must have a price at all. As an SNO, one day I profit from being paid for repair traffic, but another day I might profit from being able to restore my node for free or at low cost. So maybe recovery downloads should be free of charge for Storj and seen as a shared risk among all SNOs? Basically this works like any everyday insurance and could be treated as such.
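
The insurance idea can be sketched with the usual expected-value calculation: the fair share each SNO carries per year is the probability of a failure times the cost of one recovery. Both input values below are made-up assumptions for illustration, not measured figures.

```python
def fair_premium_usd(annual_failure_probability: float, cost_per_recovery_usd: float) -> float:
    """Expected yearly recovery cost per node if the risk is shared among all SNOs."""
    if not 0.0 <= annual_failure_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if cost_per_recovery_usd < 0:
        raise ValueError("cost must be non-negative")
    return annual_failure_probability * cost_per_recovery_usd


if __name__ == "__main__":
    # Hypothetical inputs: 2% yearly chance of data loss, 450 USD per recovery.
    premium = fair_premium_usd(0.02, 450.0)
    print(f"Shared-risk contribution per node and year: ${premium:.2f}")
```

If the repair payouts an SNO forgoes are roughly of this size, free recovery would balance out over time, which is the shared-risk argument made above.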