Suspension score dropped with no errors, then reverted

No, I checked even the previous logs and, as I said, there are no more logs with errors than those I mentioned above. I ran the command on all the logs that were in use while I was offline, and I know that because the earlier logs don’t have any errors.

From a statistical point of view, though, watching all those guys getting suspended lately, yeah, it’s pretty obvious that there are some bugs going on.

Yup, it’s definitely more likely that there are some system bugs affecting less than 0.1% of all nodes than that those 0.1% of nodes have some kind of internet or node issue :+1:

This is weird. You should have some errors anyway. The suspension score should not be affected silently.
It may be database locks or something else, but there must be something.

That’s what I thought, there must be some log entries, but based on what I searched I could not find any either. Although mine did not drop nearly as drastically.

Well, I keep getting these errors:

ERROR    piecestore    failed to add bandwidth usage    {"error": "bandwidthdb: database is locked"

But now there are no drops in the suspension %.

You have a problem with the bandwidth database. It does not affect the online and audit scores, but it is better if you fix it; otherwise, I think it won’t show you your used traffic.

OK, how should I fix it? What do you mean by bandwidth database?

You should not generalize.

This error can affect the suspension score if it happens during GET_AUDIT or GET_REPAIR.
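To check whether that is happening, one option is to scan the node log for ERROR lines that mention either action. Below is a minimal Python sketch, not an official tool. The function names are made up for illustration, and the log layout is assumed from the single line quoted earlier in this thread; two of the sample lines are invented and labeled as such.

```python
# Sketch only: the log layout is assumed from the one line quoted in this thread.
AUDIT_ACTIONS = ("GET_AUDIT", "GET_REPAIR")


def audit_related_errors(lines):
    """Return ERROR lines that mention GET_AUDIT or GET_REPAIR."""
    return [
        line.rstrip("\n")
        for line in lines
        if "ERROR" in line and any(action in line for action in AUDIT_ACTIONS)
    ]


def locked_database_errors(lines):
    """Return ERROR lines that report a locked database."""
    return [
        line.rstrip("\n")
        for line in lines
        if "ERROR" in line and "database is locked" in line
    ]


# The first line is the one quoted in this thread (truncated as posted);
# the other two are made-up examples, not real node output.
sample = [
    'ERROR    piecestore    failed to add bandwidth usage    {"error": "bandwidthdb: database is locked"',
    'INFO    piecestore    downloaded    {"Action": "GET_AUDIT"}',
    'ERROR    piecestore    download failed    {"Action": "GET_REPAIR", "error": "made-up example"}',
]

print(len(audit_related_errors(sample)), "error(s) during GET_AUDIT/GET_REPAIR")
print(len(locked_database_errors(sample)), "database-locked error(s)")
```

To run it against a real log, read the file with `open(path).readlines()` and pass the result to both functions. If the first count stays at zero while the suspension score drops, the cause is probably elsewhere.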

OK, this is new information for me. Thanks, can you post a link on how to fix it?

There are not so many fixes for that:

  • do not place database files on the system disk, unless it’s an SSD;
  • do not place database files on an SMR disk;
  • do not place database files on a network-attached drive, even iSCSI (latency can be high, it depends).

I had the same thing happen to me on three nodes. All better now. I believe STORJ changed something and broke it – I took NO action (as I was not home when this happened) and it resolved itself within a day or two.

Suspension scores do tend to get back to 100% without user intervention, especially if it is a transient problem like an ISP routing or network equipment issue. So it is unlikely that Storj broke something, then silently fixed it, especially if there was no new release recently (last release was 9 days ago) and no mention of a bugfix in the next release.

That is unrelated to this issue.

Me too, I’m facing the exact same problem. All of a sudden, suspension for no reason.

Could you please show your dashboard?
There are two reasons for suspension:

  • online score falls below 60%
  • suspension score falls below 60%
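As a rough illustration of those two conditions, here is a small Python sketch. The function name is made up, the 60% threshold is taken from the list above, and the scores passed in at the end are example values only.

```python
SUSPENSION_THRESHOLD = 60.0  # percent, as stated in the list above


def suspension_reasons(online_score, suspension_score):
    """Return which of the two conditions applies (scores given in percent)."""
    reasons = []
    if online_score < SUSPENSION_THRESHOLD:
        reasons.append("online score below 60%")
    if suspension_score < SUSPENSION_THRESHOLD:
        reasons.append("suspension score below 60%")
    return reasons


# Example values only: a healthy online score and a low suspension score.
print(suspension_reasons(100.0, 54.08))
```

An empty list means neither condition applies, so a suspension email in that case does not match what the dashboard reports.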

Dear Alexey,

No, I cannot ^^. I just opened the dashboards of all my nodes and, all of a sudden, all suspensions are gone…

I faced a similar problem 4 days ago when I received emails saying that 2 of my nodes were suspended on saltlake. But the dashboards of those nodes showed nothing.

Now I received an email about one node being suspended on europe-west-1. In fact, three of my nodes had suspension notes on several satellites in the dashboard.

But right now all dashboards show a clean, nice OK on all satellites.

Looking at the general node status worldwide, it seems that a huge jump in suspensions happened this morning. I guess the system is having trouble somehow, not my nodes specifically.

Same again here on eu 1 this morning. Surely this is a wider issue. It doesn’t seem to be isolated. To add, the email was timed at 01.50 GMT and by 11.10 AM it’s back at 100% from 54.08%.

Please clarify, which score dropped?
This thread is for the online score.

I do apologise, it was suspension I was talking about.

Good to know on those names. I will submit an issue. I received one for the European sat also.
What’s odd is that my scores are all sitting at 100% and I am seeing normal amounts of get and put activity from all sats.

Lastly, the link in the email pointed to troubleshooting for offline errors; there doesn’t seem to be anything for audit error troubleshooting.