Bug in new Vetted Date

So Storj should be able to see this? Like for every active node ID, compare the vetting status across the satellites: if any satellite still shows unvetted around 3 months past the vetting date on the other satellites… something has probably gone wrong?

Or passed 100 audits…

Not vetted on EU1

INFO    reputation:service      node scores updated     {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Total Audits": 141581, "Successful Audits": 140596, "Audit Score": 1, "Online Score": 0.9972973598461851, "Suspension Score": 1, "Audit Score Delta": 0, "Online Score Delta": 0, "Suspension Score Delta": 0}

I can’t run that script because it says jq is not installed. I don’t know what that is and I won’t install it.

Confirming the bug. Same on all my nodes too.

It is this: jq is like sed for JSON data - you can use it to slice and filter and map and transform structured data with the same ease that sed, awk, grep and friends let you play with text.
Nice tool if you need to do some JSON parsing in Bash for example.
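
For anyone who would rather not install jq: the same kind of slicing can be done with nothing but Python's standard library. This is only a minimal sketch, assuming the log line has the shape of the one posted above (free text, then one JSON object); the helper name `parse_scores` is made up for illustration, and the line is shortened to a few fields.

```python
import json

def parse_scores(line: str) -> dict:
    """Return the JSON payload of a 'node scores' log line as a dict."""
    start = line.index("{")  # the payload starts at the first brace
    return json.loads(line[start:])

# Log line from the post above, shortened to a few fields.
line = (
    'INFO    reputation:service      node scores updated     '
    '{"Process": "storagenode", '
    '"Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", '
    '"Total Audits": 141581, "Successful Audits": 140596, '
    '"Audit Score": 1, "Online Score": 0.9972973598461851, '
    '"Suspension Score": 1}'
)

scores = parse_scores(line)
failed = scores["Total Audits"] - scores["Successful Audits"]
print(scores["Satellite ID"], scores["Total Audits"], failed)
```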

Had a peek at the code. First the reputation table is updated, then the node table—but the latter only if a node was “recently” vetted. It is the node table that gets consulted on whether the node is vetted or not for the purposes of directing traffic.

Tried to follow the code, but not knowing the details of deployment, and having only a cursory understanding of the dbx tool, it’s difficult for me. But I have a feeling that the condition in hasReputationChanged might be the key. It seems that this function tries to avoid unnecessary updates to the node table, but this also means that if one update to the node table fails (e.g., the one made when the node was “recently” vetted), then subsequent positive audits will not get a chance to repeat it. And given that the node table may be behind a write cache that stores data asynchronously, ApplyAudit may finish with no error to report even though the change was not actually persisted.
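
To make the hypothesis easier to follow, here is a toy model of it. This is not the satellite code: `ToySatellite`, `Status`, `apply_audit` and the `node_write_lost` flag are all invented for illustration, and the `new != old` comparison is only a stand-in for whatever hasReputationChanged really checks. It just shows how a "skip the write if nothing changed" optimization can leave the node table stuck after one lost write.

```python
from dataclasses import dataclass, replace

@dataclass
class Status:
    vetted: bool = False
    suspended: bool = False

class ToySatellite:
    """Hypothetical model: two tables holding the same node status."""

    def __init__(self) -> None:
        self.reputation = Status()  # always updated
        self.node = Status()        # consulted when directing traffic

    def apply_audit(self, new: Status, node_write_lost: bool = False) -> None:
        old = self.reputation
        self.reputation = replace(new)
        # Stand-in for hasReputationChanged: only touch the node table
        # when the status differs from the previous one.
        if new != old and not node_write_lost:
            self.node = replace(new)

sat = ToySatellite()
# The node becomes vetted, but the write to the node table is lost.
sat.apply_audit(Status(vetted=True), node_write_lost=True)
# Further good audits change nothing, so the node table is never retried.
sat.apply_audit(Status(vetted=True))
stuck = sat.node.vetted
# A suspension is a status change, so it forces a fresh write...
sat.apply_audit(Status(vetted=True, suspended=True))
# ...and so does leaving suspension.
sat.apply_audit(Status(vetted=True, suspended=False))
print(stuck, sat.node)
```

In this model the node table only catches up once some other status change happens, which is why a suspension would act as a retrigger.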

Curiously, if this hypothesis is true, then a node operator could retrigger the update to the node table by having their node be suspended, whether through audits or being offline. Obviously that’s a rather drastic measure.

Being offline for how long?

If you started at 100%… I think it took around 12 days offline to hit Suspension (60%)?

So… you can put the unvetted sat on the ban list in config, until you get suspended. Got it!

# list of trust exclusions
storage2.trust.exclusions: "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs@eu1.storj.io:7777"

Please note this is just a hypothesis.

Nothing to lose, just 300GB of data. We will see after 2 weeks if it worked.

You should be able to tell pretty quickly: if your online score for that satellite doesn’t start dropping by around 3%/day… then your node is still passing its online checks (even if it isn’t accepting data connections).
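
As a quick sanity check on those numbers, here is the arithmetic as a small sketch. It assumes the online score falls linearly at the quoted ~3%/day, which is a simplification and not how the satellite actually computes the score; `days_until` is a made-up helper.

```python
def days_until(threshold: float, start: float = 1.0,
               drop_per_day: float = 0.03) -> float:
    """Days of downtime until the score reaches the threshold,
    assuming a constant (linear) drop per day."""
    if drop_per_day <= 0:
        raise ValueError("drop_per_day must be positive")
    return max(0.0, (start - threshold) / drop_per_day)

days = days_until(0.60)  # from 100% down to the 60% suspension mark
print(f"about {days:.1f} days offline")
```

With these assumptions it comes out at a bit over 13 days, the same ballpark as the "around 12 days" recalled above.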

But yeah if you can get suspended… and unsuspending fixes your vetting… that would be great to know!

If a satellite is blacklisted, how would the node receive score updates :rofl:

It could trigger a disqualification, not suspension. These events are independent of each other.
But theoretically, the rejection of audits should be interpreted as offline.

Well, to stop all suppositions…
After like 2-3 hours I got suspended, even before receiving the “Node offline” email. Something is wrong with these audits…
But… @Toyoo you are a Code Wizard level 50. The node is now VETTED. :partying_face: :face_blowing_a_kiss:

 WARN reputation:service node scores worsened {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Total Audits": 141722, "Successful Audits": 140684, "Audit Score": 1, "Online Score": 0.9971701333830808, "Suspension Score": 0.28895593993885077, "Audit Score Delta": 0, "Online Score Delta": -0.00012722646310425745, "Suspension Score Delta": -0.7110440600611492}
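
The Delta fields make it possible to reconstruct the scores from before this update. A minimal sketch, using only values copied from the line above (shortened to the relevant fields); the helper `previous` is invented for illustration.

```python
import json

# The WARN line above, shortened to the fields used here.
line = (
    'WARN reputation:service node scores worsened '
    '{"Online Score": 0.9971701333830808, '
    '"Online Score Delta": -0.00012722646310425745, '
    '"Suspension Score": 0.28895593993885077, '
    '"Suspension Score Delta": -0.7110440600611492}'
)
payload = json.loads(line[line.index("{"):])

def previous(payload: dict, name: str) -> float:
    """Score before the update: current value minus its delta."""
    return payload[name] - payload[name + " Delta"]

prev_suspension = previous(payload, "Suspension Score")
prev_online = previous(payload, "Online Score")
print(f"suspension: {prev_suspension:.3f} -> {payload['Suspension Score']:.3f}")
print(f"online:     {prev_online:.5f} -> {payload['Online Score']:.5f}")
```

So the suspension score went from about 1 to about 0.29 in a single reported update, while the online score barely moved.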

So, seeing that score, it could reach 0% in less than 24h. Could you be DQ-ed so quickly? Why?
What’s the difference between blacklisting the sat and being offline because of some blackout?

The suspension score drops if the node is available and answering audit requests, but returns some unknown error (not “file not found”, not a wrong hash and not an “I/O error”). If the error is known, then the audit score drops instead.
The offline score drops if the node is unavailable (“connection timeout”).
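
Put as a sketch, the rule described in this post looks roughly like the function below. It only restates the post: the function name, the error strings and the return labels are placeholders, not identifiers from the satellite code.

```python
from typing import Optional

# Errors named in the post as "known"; the exact strings are placeholders.
KNOWN_ERRORS = {"file not found", "wrong hash", "i/o error"}

def affected_score(reachable: bool, error: Optional[str]) -> Optional[str]:
    """Which score an audit attempt would lower, per the description above."""
    if not reachable:
        return "online"       # e.g. connection timeout
    if error is None:
        return None           # audit succeeded, nothing drops
    if error.lower() in KNOWN_ERRORS:
        return "audit"        # known failure
    return "suspension"       # reachable, but an unknown error

print(affected_score(False, None))
print(affected_score(True, "file not found"))
print(affected_score(True, "untrusted satellite"))
```

Under this reading, a node that is online but rejects audits from an excluded satellite lands in the last branch, which would match the suspension seen above.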

So, looks like it works as expected.

I do not think so, but maybe.

Oh, I got it… My sleepy head didn’t realise the obvious: the satellite can see if the node is online or not, independently of the blacklisting.

Exactly. But I was afraid that rejecting audits would be included in the known errors list.

By the way, it also confirmed that a suspension score below 60% does not disqualify the node.

So before everyone jumps into blacklisting the unvetted sat, maybe take note of these things and wait for my node to recover. I hope it will.
But if you decide to risk it, just use a real email address in your run command and keep an eye out for the “your node got suspended” email - it can take maybe 2-3 hours.
Then remove the blacklisting in config and restart the node.
You can edit the config while node is running, save it and then:

docker stop -t 300 storagenode
docker restart -t 300 storagenode

WAIT 5 min before checking the dashboard, to let it update the status.
I won’t mark Toyoo’s solution as the solution, because I want people to read what follows, but if you think otherwise, you can mark it.

Great. Let’s go offline everybody! :partying_face: