Disqualification amendment suggestion

nommiiss · January 31, 2020, 10:36am

So I have been thinking about this and I really dont like the idea of a Discolification of a storj node due to bad performance and would suggest to the storj team an amendment to this idea. The Idea come to me during a RAID controller failer on my server. I retained the data on the disks however, my Node has been on and offline for a while over the last few days. At the moemnt I am aware that there is no discolification enabled however, If there was I would probably be taken offline and this got me thinking… Though i was having trouble I would not like to create the storj node from scratch and I would propose 3 levels of operation for a storj node with no disqualifications at all…This will allow the storj operator time to get his node online and working to a level that is specified…
This is what I thought…

Fully Operatinal state
Degradation State
Recovery State

Fully Operatinal state 100%-95% operation uptime per month… node fully working reciving and sending data.

Degradation State 94%-80% operation uptime per month… node not reciving any more data however is still able to send data.

Recovery State 79%- operation uptime per month… node starts to “safly sends its data back into” the network.

the Thrird state is almost a discolification state however, if the operator has some unforseen issues with hardware power etc and know they will recover the node they have the option of keeping the node online until it gets a seffitiant rating.

Kind Regards Simon

Vadim · January 31, 2020, 10:59am

what to do if you loosing data all the time, repear is expencive

Storgeez · January 31, 2020, 11:08am

Your “Recovery state” is basically graceful exit if I understood you correctly, this is implemented.

Your “Degradation state” can be containment mode or also maintenance mode, one exists, second is on the roadmap as far as I’m aware.

nerdatwork · January 31, 2020, 11:24am

You cannot opt graceful exit if your node has lost data. OP’s recovery state has 79% data left on node.

BrightSilence · January 31, 2020, 11:37am

The percentage is uptime, not data left. I think some tiers like suggested may be a good option. I’m not sure the force GE is really feasible though as it could be abused.

Also…

Sorry, couldn’t help myself

nommiiss · January 31, 2020, 11:45am

Its just an idea… as i said I just dont like the idea there are always flaws and more ideas and suggestions…

nerdatwork · January 31, 2020, 11:50am

Keep them coming

Pentium100 · January 31, 2020, 1:00pm

Yea, the 5 hours concern me too, I mean, I can keep the node with a downtime of 1 hour/year, but then there is a power outage 5 hours longer than my UPSs can handle or some problem that requires physical access while I am on vacation and my node is dead.

Unlike a datacenter, I do not have employees within easy reach of the servers all the time.

A bit similar with data loss, that is, at least allow partial graceful exit. Let’s say my HDD messed up or there was a power failure and I lost about 100MB of files out of 4TB. Is it really better for the network to repair all those 4TB after disqualifying my node?

Vadim · January 31, 2020, 1:07pm

Agree with @Pentium100 than repear 4TB insted of 100MB is very bad aproach. It should be some penalties for data lost, but loosing all escrow is too much, espesialy if repear cost of small ammount is 10$ and hole scrow is 200$ for example. But insted of make repear for 10$ Network will make overhead for all data.

BrightSilence · January 31, 2020, 1:40pm

Probably, yes.

The satellite doesn’t know exactly which files you lost. It only knows you failed a certain amount of audits recently. Based on that information alone it needs to determine how likely it is that your node will not be reliable into the future. If you missed quite a few audits, there is no way the satellite can trust your node to hold the other pieces reliably.

Now lets say your node can do some checking on what data was lost and send a list of lost pieces to the satellite. The satellite could repair those (perhaps at cost to you through escrow). It would no longer audit those pieces and your node would be fine…
But I would also definitely start abusing that system and mod my node to say it has lost pieces that are never downloaded and start to optimize my storage on my node to only keep data that is often downloaded. Probably not what you want.

LarsOS · January 31, 2020, 3:44pm

In my opinion, this behavior should ideally be balanced through the reward system, not discouraged through artificial technical limitations.

BrightSilence · January 31, 2020, 3:45pm

Sure, how do you propose the satellite would know the difference between me cheating and another node having lost some data?

LarsOS · January 31, 2020, 3:48pm

I don’t. Instead I propose your incentive to cheat is reduced by balancing the reward system. Through for example a higher storage payment and a lower egress payment.

I’m not saying that it’s going to be easy to balance this, mind you, hence my use of the word “ideally”.

If you give people incentive to cheat, and then you go out of your way to stop them cheating, you’re just creating a lot of work for yourself. Then it might be better to not give people an incentive to cheat in the first place.

BrightSilence · January 31, 2020, 3:53pm

We’re getting paid for what creates value. This is directly tied to what customers pay for tardigrade. You can’t change one side without changing the other and in order to remain competitive, you can’t change the other side.

I also really can’t think of many situations in which you would lose some data while still storing the rest in a reliable way.

Even better, don’t even give them the possibility to cheat. Audits are independently verified by the satellites and can’t be faked. If you keep a strict “if you receive data you have to reliably store that data” policy, there is no possibility to cheat at all. Any further addition just complicates things. So yes, best to not open that door at all and just expect from nodes that they never lose data.

LarsOS · January 31, 2020, 3:59pm

I have read several posts from SNOs suggesting that when their node is full and egress dwindles, they want to do a graceful exit and start a new node. Perhaps this won’t be an issue with v3. Perhaps it should not be counted as cheating. Perhaps the losses will outweigh the gains for the SNOs so they won’t bother. And perhaps even if they do, it’s totally OK with the network. I guess time will tell.

BrightSilence · January 31, 2020, 4:06pm

They will have to wait 15 months soon enough if they want to do that. This is exactly to eliminate abuse of graceful exit in order to do similar kinds of “cheating”.

Pentium100 · January 31, 2020, 6:21pm

What I meant was that when a node fails enough audits (which is probably a low number) or I notice that I may have lost some data, partial graceful exit gets triggered.

Essentially, the satellite saying “You lost some data, you cannot be trusted, give me all the data that you still have and go away”.

I’m pretty sure that this is due to how the devs test things and in production you will want the node to have a lot of data.