Your node has been suspended

Alexey · May 3, 2020, 11:19am

I think it’s a bug or flaw in the architecture and must be fixed.

BrightSilence · May 3, 2020, 11:52am

It might help to vacuum the db and defrag.

pietro · May 3, 2020, 3:13pm

Many thanks! I performed a vacuum on all dbs and restarted the node: I’m not seeing “database is locked” messages anymore.

Suspension has been revoked on 2 out of 3 satellites, so I’m currently suspended on only one satellite. Hoping it withdraws the suspension soon.

Anyway, I know my hardware is not up to the job anymore due to the increased traffic, looking forward to receiving the new SATA HDD next week.

BrightSilence · May 3, 2020, 3:15pm

Glad to see that helped. Perhaps vacuuming could become part of the database migration on first run for a new version.

Pac · May 3, 2020, 3:41pm

I think that’s a great idea.

striker43 · May 3, 2020, 4:36pm

My 3 nodes are still running on 1.1.1. Do you think it makes sense to stop watchtower now and wait with the update till 1.3.3 is stable and this issue is solved? How long will 1.1.1 be supported?

nerdatwork · May 3, 2020, 4:40pm

If your watchtower is set up as per the documentation, then you should let watchtower do its job and not force the update.

ImpatientMaker · May 3, 2020, 5:34pm

I hit this error today and ran:
for i in ./dbbak/*.db; do sqlite3 "$i" "PRAGMA integrity_check; VACUUM;"; done

Most were ok, but I did get this error.

row 610 missing from index sqlite_autoindex_storage_usage_1
row 898 missing from index sqlite_autoindex_storage_usage_1
row 899 missing from index sqlite_autoindex_storage_usage_1
row 900 missing from index sqlite_autoindex_storage_usage_1
row 901 missing from index sqlite_autoindex_storage_usage_1
row 902 missing from index sqlite_autoindex_storage_usage_1
non-unique entry in index sqlite_autoindex_storage_usage_1
row 1078 missing from index sqlite_autoindex_storage_usage_1
row 1079 missing from index sqlite_autoindex_storage_usage_1
wrong # of entries in index sqlite_autoindex_storage_usage_1

Do you have any suggestions for repairing it? I’m not familiar with sqlite3.

4ich · May 3, 2020, 6:04pm

@pietro and/or @alexey, can you explain how to do the vacuum?

pietro · May 3, 2020, 6:53pm
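
A minimal sketch of one way to run the vacuum, not an official procedure. It assumes the node is stopped first, that the `sqlite3` command-line tool is installed, and that you work on a directory holding the node's `.db` files (the function name `vacuum_dbs` and the `./dbbak` path in the usage line are only illustrations). Each database is checked with `PRAGMA integrity_check` and vacuumed only if the check returns `ok`, so a damaged file is reported by name instead of being rewritten.

```shell
#!/usr/bin/env bash
# Sketch: integrity-check and vacuum every SQLite database in a directory.
# Assumptions: the storage node is stopped and sqlite3 is installed.

vacuum_dbs() {
  local db_dir="$1" db result
  for db in "$db_dir"/*.db; do
    [ -e "$db" ] || continue   # the glob matched nothing
    # A healthy database answers the integrity check with the single word "ok".
    result=$(sqlite3 "$db" "PRAGMA integrity_check;" 2>&1)
    if [ "$result" = "ok" ]; then
      sqlite3 "$db" "VACUUM;" && echo "vacuumed: $db"
    else
      echo "integrity check failed, skipped: $db"
      echo "$result"
    fi
  done
}
```

Usage: `vacuum_dbs ./dbbak`. Taking a copy of the databases before running anything against them is a sensible precaution.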

CyborgCat · May 3, 2020, 7:41pm

Curious. The message “Your node has been suspended” no longer appears, I don’t know what to think. Is the problem from me or from the satellites?

BrightSilence · May 3, 2020, 7:44pm

No, it doesn’t. v1.3.3 only added logging for this error. People who are seeing it had the error before; they simply weren’t aware of it because it wasn’t displayed in the logs. Just let watchtower do its thing.

4ich · May 4, 2020, 5:49am

After vacuuming my node seems to have the same trouble with the serialdb…

Lukasz · May 4, 2020, 10:29am

Hi guys,

I got the same issue after watchtower automatically updated my node to v1.3.3.
I doubt that in my case the issue is related to a USB 2.0 connection.
I am currently running the storage node on an HP MicroServer, and the HDD is connected via SATA 3.0 (6 Gb/s).
On Sunday I was suspended on one satellite. Unfortunately, today I see suspensions on 4.
I took one, “europe-north-1.tardigrade.io:7777”, to check the timestamp on the main dashboard: “suspended on Mon, 04 May 2020 06:24:37 GMT”. According to the dashboard, uptime checks and audit checks are both 100%.
Please check the screenshots. I can also upload the log files if necessary.

nerdatwork · May 4, 2020, 10:47am

Can you check whether your HDD is SMR or not?

pietro · May 4, 2020, 11:02am

My node is running fine now, the suspension has been totally removed.

Still some “database is locked” errors in the log, but very very limited: only two errors in 16 hours. Hoping they will go away when I upgrade the hardware in a few days.
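
To see whether the errors keep tapering off, one option is to count them in the log. A minimal sketch, assuming the node's log is available as a file or can be piped in (the function name `count_locked` and the file name `node.log` are illustrations, not anything the node provides):

```shell
# Count log lines containing "database is locked".
# Reads the given file(s), or standard input when no file is given.
count_locked() {
  grep -c "database is locked" "$@"
}
```

Usage: `count_locked node.log`, or with a Docker setup something like `docker logs storagenode 2>&1 | count_locked`, where the container name `storagenode` is an assumption about your setup.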

CyborgCat · May 4, 2020, 11:03am

The problem reappeared on one of the satellites, “europe-north-1.tardigrade.io:7777”. Uptime Checks and Audit Checks are 100%.
The HDD is a Seagate Backup Plus HUB 8TB, USB 3.0 (possibly SMR technology).
I am writing to you in case this information is useful in order to identify the problem.

Frieseba · May 4, 2020, 11:29am

I had the same problem with my Raspberry. Now I have a new, bigger and faster SD card in it. Maybe the old one was broken. Anyway, the download errors have disappeared.

Lukasz · May 4, 2020, 2:45pm

Hi “nerdatwork”, thank you for pointing out the “SMR technology”.
I had heard the term but, to be honest, never looked into it properly.

The first link is a good explanatory video on YouTube about SMR and host-managed SMR. The 2nd and 3rd give some additional information about SMR being used in drives without having been advertised by the HDD companies.
1. Host-Managed Shingled Magnetic Recording (SMR) Drives - 880
2. Why We Need SMR Listed in Hard Drive Specs
3. Western Digital & Seagates Response to the Unlabeled SMR Drives & Stats From Backblaze

Unfortunately, it is not clear whether the HDD I have is SMR. Probably it is. I have a Seagate 6TB “ST6000DM003”. If anybody has the same model, please let me know whether you see a similar issue pattern.

Because that 6TB drive is almost full and I have a few 8TB drives, I was thinking of starting the node with 2x8TB in RAID 1.
Now I do not know if that is a good idea. Does anybody have more info on how that could impact the Storj node? Thank you.

nerdatwork · May 4, 2020, 3:24pm

This is an SMR drive.

Also, to tag anyone, use @ before their username, like @Lukasz. This sends a notification to the user.