Is there such a thing as an "un-disqualify"?

JohnSmith · July 9, 2020, 10:31pm

After close to 15 months, my main node was disqualified and my node 2 has been suspended for some reason. Uptime for all 5 nodes (same server) has been close to 100% and audits have all been 100%. For the small reward, it isn’t worth my time to investigate unless there is a hope to un-disqualify.

It may make more economic and convenience sense to buy and HODL rather than earn and HODL.

Is there such a thing as an “un-disqualify”; or am I done here?

nerdatwork · July 10, 2020, 2:44am

Disqualifications are permanent. In my personal opinion you should hold on to those 4000+ Storj

You need to figure out why your node failed audits.

kevink · July 10, 2020, 5:11am

well I bought new HDDs from my 4000+ STORJ that now work in raidz1 so less risk of DQ because of HDD problems. Expected ROI 12-24 months, which is fine for me.

JohnSmith · July 10, 2020, 7:48am

Nah. I do not “need” to know the reason if there is no coming back from disqualification. Like, what would be the point of expending the effort?

It doesn’t seem like the cost of a graceful exit would be covered by the held back amounts of the remaining 3 nodes, so bump that, too.

It looks like I am done here. 10 TB going offline now. I am sure the entire network will crumble without it (sarcasm).

I’ll pickup about 46,000 STORJ on the next market crash and HODL for forever. It’s been real…inconvenient and a bit frustrating at times as a SNO, but it was also a pleasant way to murder extra time and put my aging server to use. Great project.

Good fortune and luck to all! I’m out.

Pac · July 10, 2020, 9:07am

Another example where a node should probably get suspended before being brutally disqualified without notification…

I believe it’s being worked on, but for now we’re still facing the risk of getting disqualified in a couple of hours.

So frustrating
And unfair as in the other hand being offline for weeks isn’t a problem.
Doesn’t make sense…

StoreMe2 · July 10, 2020, 9:08am

You can come back from disqualification. You have to set up a new node from scratch. That’s it. But its not efficient for SNO.

Yes buying storj would be much more efficient and that’s my plan for the future if I ever would be disqulified. I had some weeks ago offline time of 12 h. My ISP got offline. So i am a SNO since 3,5 years.

It’s just a matter of time the ISP goes for some hours offline for some reason and I think this can happen everywhere in the world, so in the long run I think storj goes offline on private SNO and then storj is hosted only by professional SNO.

I think this is the way storj wants to be or they find a better solution for disqualification and coming back on network again. They are always posting “just use your not used disk space and it’s done” but there so much more than only offer some disk space.

All in all a private SNO do all this just for fun and even when you use a professional and valide setup someday the ISP is fu… up all your work. When this happens i don’t configure a new node from scratch and lost for the network.

I think storj es getting much more centralized in the long run and only distributed but some big and valide SNO. Perhaps amazon distribute some of their space that is not used to storj in some years.

Pac · July 10, 2020, 9:11am

Lol, that’d be amazing news for StorjLabs I guess

donald.m.motsinger · July 10, 2020, 9:59am

Nobody gets disqualified atm for downtime, so that was not the reason for his disqualification.

I don’t get the OP, why he wants to delete 4 nodes, because 1 got disqualified. Try to find out, what the reason was and cut your losses. Then move the data of the oldest remaining node to the biggest hard disk and setup a new node on the now spare hard disk.

A good start for troubleshooting is to grep your logs for “GET_AUDIT” and “failed” in the same line.

JohnSmith · July 10, 2020, 5:16pm

The reward for effort is pennies per hour invested (if that). I know my bill rate. It was interesting, but never was it ever worth my time. All nodes ran on the same aging server which I otherwise only turn on for a few hours monthly. It is madness to invest more time figuring out the “why” if that effort would not change anything.

Cheers!

SGC · July 10, 2020, 7:56pm

This issue really needs to be solved, sure there are a near infinite amount of people willing to attempt to share storage space… but word of mouth will eventually kill the whole project if the majority of SNO’s doesn’t have a good experience…

the facts are simple… SNO’s need time and warning to be able to correct their storagenodes issues, a random bullet in the head isn’t very educational… i suppose that’s why soldiers wears helmets

your valor will be remembered…

kevink · July 10, 2020, 8:31pm

probably not after understanding that their server needs to be online 24/7 and they won’t get rich from sharing space and the rewards during the first months are definitely not worth the effort (if you want to count the spent time).

the majority of SNOs has a good experience. You just read more about the bad experience because people don’t come to the forum screaming “STORJ is so great!! Being a SNO is so easy and I set my node up 5 month ago and it still works and I actually don’t even check the forum!!”.
People come here to scream when they have a problem or their node got DQed for whatever fair or unfair reason.

Can’t disagree with that part though.

BrightSilence · July 10, 2020, 9:40pm

What makes you say that?

Which issue would that be?

Nobody has any idea what happened with this node as the owner doesn’t feel like looking at the logs. Which I would argue takes a lot less time then coming back here to post 3 times.

Clearly audits have not been 100% otherwise you can’t be disqualified. For all we know data was removed or the HDD failed. Either way, I don’t see why we should bother debating a situation we literally know nothing about.

If @JohnSmith wants help or a useful debate, he can look at the logs and let us know what’s going on. But if it’s not worth his time, then it’s definitely not worth ours.

Alexey · July 10, 2020, 9:45pm

Just put it here:

Pac · July 10, 2020, 10:08pm

I get your view @BrightSilence and I tend to agree with you.

However, as @SGC said:

(even though “random bullet” is a debatable notion here)

In my humble opinion, there is only one situation where disqualification could be fast without warning: if a node is clearly responding with the wrong data when audited. This would mean that it’s trying to cheat and is poisoning the network, then it needs to be killed fast.

All other problems should simply trigger the suspension mode, and notify the SNO that something’s wrong and that they should check what’s going on within a few weeks (or months… dunno) before getting disqualified for good.

This said, I do agree that @JohnSmith should (or could) have considered having a look at the logs anyway, to see what happened, especially because if it proved to be a new problem, it could have helped the team to make the software better.

SGC · July 10, 2020, 10:11pm

true… but maybe it should be made more clear what goes wrong then, some sort of system implemented so that the node will protect / disconnect itself upon high numbers of errors / audit failures… so that people actually have a chance to diagnose the issue…

it’s not nice to know that months and even years of maintenance and work could be gone to a random fluke that wasn’t actually anything serious… just because the software doesn’t know how to preserve life atleast long enough for humans to respond.

and the random bullet could very well be a very well considered one from a sniper… it just seems random to the guy in the trench

BrightSilence · July 10, 2020, 10:15pm

Well I think my previous posts show that I think some things can be improved. But I would say if data is removed, that should also lead to a hard disqualification. Given that the storage location is available. The same would go for unreadable data. If the storage location can be read and written to, but files are either unreadable, no longer there or return wrong information. There is nothing you can do to fix it anyway. In those scenarios, disqualification after failing too many audits is just the right thing to do.

@SGC I see you responded as well. But I think this message applies to what you said too. The node needs to be better at knowing whether it’s just that the data location isn’t available or that actual data is lost. Disqualification should only happen in the latter case. If that’s fixed, then there is no question about why the disqualification happened, because it could only be one thing.

For what it’s worth, I’m certain the dashboard will already show an audit score below 60% on the node that is disqualified. But you do have to bother to look.

Pac · July 10, 2020, 10:23pm

I disagree. The SNO could have tried to migrate their nodes and misconfigured the target path. It’d be better to let them know something is wrong, so they can face-palm themselves and solve the issue.

If the node does return data that proves to be invalid, then it’s different: considering each file is identified by some kind of UUID (if I’m not mistaking), I guess that if a node were to target a folder containing data from another node, it shouldn’t find a single file matching requests coming from satellites. Which means that a node returning wrong data probably is cheating. That’s my take anyway.

As @BrightSilence said, there is room for improvement indeed, some may want to upvote one of his suggestions to make the software more robust: Make the node crash with a fatal error if the storage path becomes unavailable

BrightSilence · July 10, 2020, 10:29pm

I would count that as the storage location not being available. The node could place a file in the storage location that it can poll for availability. If it’s not the node shouldn’t start or shut down. That would catch misconfigurations as well. Even better would be a file that stores the public identity of the node in the storage location to test whether the data linked to matches the identity used. That would even catch issues where node A would point to storage location of node B.

But if files are missing but that test file is there, then disqualifying is still the right thing to do.

Alexey · July 10, 2020, 10:30pm

I would like to recommend to read a whitepaper: https://storj.io/storjv3.pdf , section “4.8 Structured file storage” and “4.14.1 Piece hashes”
It’s not a GUID at all. From the storagenode perspective it’s indeed can’t find any audited piece and will answer with “file not found”, not with a wrong data. From the satellite point of view the node lost all the data and must be immediately disqualified, otherwise it will be offered to customers and they could receive the scaring message “file not found” too.

Pac · July 10, 2020, 10:36pm

Sounds like a pretty decent solution to me!

I’m not questionning that, that’s why I assumed that only a cheating node could reply with the wrong data.
If a mechanism like the one @BrightSilence is implemented, then yes, not finding files should lead to disqualification. But currently, a simple misconfiguration could cause that.

If we’re sure the node is cheating or lost all files only, then yes! Otherwise it could be suspended immediately to avoid sending scary ‘file not found’ messages to customer. Don’t you think?