Restarted Server And Node Got Suspended?

We should be a bit clearer: if a suspension score doesn’t recover after a few hours and the node is still not getting any data, that means the issue still persists. Suspensions do not last longer than they have to once the issue no longer exists.
But if the issue still exists, the suspension score does not recover. And if it’s an unused sat, the score may not recover until that sat is used again.

Must be tough for the node to get suspended :/ Without logs it’s impossible to tell what went wrong, but judging by the logs you posted, it had something to do with a failed ping. I only have one node, and it is 12 months old. It has never failed an audit or been suspended due to a reboot.

Makes sense, so I guess I should give it even more time. I was told to make sure the node does not fail any more audits, and then the suspension score will recover.

Yeah, for some reason I thought it was US1 and I expected it to recover fast, but it’s US2, which is hardly used right now. I wouldn’t even worry about that sat, since all month my nodes have seen maybe 70 MB…


Suspension scores only change when an audit happens and the satellite sends the new score back to the node. Especially on new nodes, that can take a long time if they don’t have much data stored yet. If the issue persists, the score goes down; if not, it goes up. No change means no additional audits have happened, so we can’t draw any conclusions from that yet.
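To make that concrete, here is a minimal toy model in Python. It is only a sketch: the `SuspensionScore` class, the update rule and the `forgetting` value of 0.95 are illustrative assumptions, not the satellite's actual implementation or settings. The point it shows is that the score moves only when an audit is recorded, so a quiet satellite leaves it frozen.

```python
from dataclasses import dataclass


@dataclass
class SuspensionScore:
    """Toy reputation-style score; every parameter value here is illustrative."""

    alpha: float = 1.0        # accumulated weight of successful audits
    beta: float = 0.0         # accumulated weight of failed audits
    forgetting: float = 0.95  # assumed decay factor, NOT a real satellite setting

    @property
    def value(self) -> float:
        """Current score between 0 and 1."""
        return self.alpha / (self.alpha + self.beta)

    def record_audit(self, success: bool) -> float:
        """Apply one audit result; this is the only way the score changes."""
        self.alpha = self.forgetting * self.alpha + (1.0 if success else 0.0)
        self.beta = self.forgetting * self.beta + (0.0 if success else 1.0)
        return self.value


score = SuspensionScore()

for _ in range(5):  # five audits fail while the issue persists
    score.record_audit(success=False)
after_failures = score.value

# A quiet satellite sends no audits: nothing is recorded, so nothing changes.
while_idle = score.value

for _ in range(5):  # the issue is gone and audits succeed again
    score.record_audit(success=True)
after_recovery = score.value

print(f"{after_failures:.3f} {while_idle:.3f} {after_recovery:.3f}")
```

In this toy model the score drops with each failed audit, stays put while no audits arrive, and climbs again with each successful one, which matches the behaviour described above.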


According to your posts, your node is still increasing its earnings.

The specific error seems to be listed only once in the source code, in the VerifyOrderLimitSignature function. Start reading from the function comment on line 130:

https://github.com/storj/storj/blob/1c47163eeeba139670f690811b5ac27a159d590e/storagenode/piecestore/verification.go#L130

I don’t think this error has a direct relationship to the Suspension score. Perhaps the reboot of your system screwed up the signature process on one of your orders and at the same time caused a timing issue with the data flow.


If there was one wrong time out of millions to reboot a server, that was definitely it. :rofl:


It’s unlikely the reboot was the original cause. For a satellite to suddenly start auditing your node many times in that short a timeframe is basically impossible. I think the score just didn’t update until you rebooted, but the failures must have already happened prior to that. So don’t get too stuck on that reboot.

Any change in score yet? Did you see any additional audits for that satellite in the logs?
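If you want to check that without scrolling through the whole log, a rough sketch like the one below could help. It rests on assumptions you should verify against your own log file: that audit transfers are marked with the text `GET_AUDIT`, that each line contains the satellite ID, and that outcomes show up as `downloaded` or `download failed`. The function name `count_audit_lines` is made up for this example.

```python
from collections import Counter
from typing import Iterable


def count_audit_lines(lines: Iterable[str], satellite_id: str) -> Counter:
    """Count audit-related log lines for one satellite, grouped by outcome.

    Assumes (verify against your own logs) that audit lines contain the
    marker "GET_AUDIT" and the satellite ID as plain text.
    """
    counts: Counter = Counter()
    for line in lines:
        if "GET_AUDIT" not in line or satellite_id not in line:
            continue  # not an audit line for this satellite
        if "download failed" in line:
            counts["failed"] += 1
        elif "downloaded" in line:
            counts["succeeded"] += 1
        else:
            counts["other"] += 1  # e.g. a transfer that started but has no result yet
    return counts


# Usage sketch; the path and the satellite ID are placeholders:
# with open("node.log", encoding="utf-8") as log_file:
#     print(count_audit_lines(log_file, "<satellite id>"))
```

If the counts for that satellite stay the same between two runs, no new audits have arrived and the score has had no chance to change.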


No. Pretty much all nodes I have do not get much traffic from this satellite. Same for other posters around here, obviously. Will leave it as is, so when another audit happens, it improves.

Update: node is recovering.


Is it a very new node?

Yes. It is. Already stated.

Update: node keeps on recovering.


Update: node is recovering even further. I believe the issue is no longer relevant.
It must have been that restart at a bad time (sat-wise).

It’s good to see the node recovering!

But I have to reiterate that the restart alone could not have caused this. Sure, it would interrupt ongoing transfers, which may include one or two audits or repairs, but it can’t possibly have interrupted the number of transfers required to drop the score that low.

If audits or repairs start when your node is offline, they won’t count against your suspension score but against your online score. So being offline can’t have been the cause either.

The restart only caused the score to be updated and the prior issues to become visible; it did not cause the underlying problem itself.

Since your node is recovering it may have been a temporary issue anyway, so this may not matter. But I suggest you still keep an eye on things regardless and keep the logs in case something similar happens again.
