Now it hit me: 'Your node has been suspended'

Alexey · April 30, 2022, 8:00am

No need to post logs without errors; they are way too long. Please put the logs between two lines of three backticks, like this:

```
here is excerpt from the logs
```

It’s the Stefan satellite, and it was shut down two years ago. Your node still has some stats associated with it, hence the warning. You can ignore it.

I don’t understand why you deleted trust-cache.json at all. Did you have some errors related to it?

karacurt · May 1, 2022, 8:44am

I supposed that the drop in the suspension stats and

"error": "console: trust: satellite \"118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW\" is untrusted"

was related to some trust.json file.
I just don’t know what leads to the suspension, so I’m trying the actions I took in previous error cases.

When I started my first two Storj nodes about 3 years ago, all the errors were caused by power outages or Windows updates, that is, by events on my side.
Now I don’t understand why things go wrong without an obvious reason. The bad events occur about every two months, and fixing them takes a lot of time. But then they appear again.

Alexey · May 1, 2022, 9:39am

We recently increased the repair threshold to 60, so there are a lot more repairs going on now. GET_REPAIR can affect the suspension and audit scores just as GET_AUDIT does, so if your node answers with an unknown error instead of the piece, the suspension score will be affected. However, once your node starts answering GET_AUDIT or GET_REPAIR requests normally, the suspension score will recover quickly.
So, please search for those errors not only for GET_AUDIT but also for GET_REPAIR.
Usually such errors indicate that your node has become too slow to answer, or that there are some disk or network issues.

For example, my nodes don’t have problems with GET_REPAIR, even though such requests are now more frequent than before.

karacurt · May 3, 2022, 6:29am

There are no GET_AUDIT or GET_REPAIR issues, only the WARN message about the Stefan satellite during normal operation, as I mentioned above.

It looks like a random drop in the suspension score that roams between nodes. I inspect the stats of all nodes once or twice a day, and there is no obvious reason for these events.

Could it be because of the high capacity of the disks and the maximum bandwidth of the SATA III architecture?
The maximum capacity of the oldest node is 9 TB out of 12 TB total.
The disks sit in a Supermicro JBOD shelf that is connected to the PC through a Dell H200E 6Gb/s SAS PCIe HBA controller.

Alexey · May 3, 2022, 7:07am

There must be something wrong when your node receives GET_AUDIT or GET_REPAIR, otherwise the suspension score would not be affected to the level where the node becomes suspended.
Maybe you run docker without log redirection; in that case all logs were deleted together with the container, and the failure records are now gone.
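
If the container has not been removed or re-created since the failures happened, its logs should still be readable with `docker logs`. A rough sketch, where `container_errors` is only a made-up helper name and the container name (for example storagenode5) is passed as an argument:

```shell
# Show the latest ERROR lines for audit and repair requests
# straight from a running or stopped (but not removed) container.
container_errors() {
    # 2>&1 merges stderr into stdout so that grep sees every log line.
    docker logs "$1" 2>&1 | grep -E "GET_AUDIT|GET_REPAIR" | grep ERROR | tail
}

# Usage: container_errors storagenode5
```

An empty result only means that no such errors are recorded in the logs the container still holds.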

If you have the logs redirected, then search for GET_AUDIT or GET_REPAIR lines at the ERROR level, e.g.

cat /mnt/storj/storagenode1/storagenode.log | grep -E "GET_AUDIT|GET_REPAIR" | grep ERROR | tail
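
If the output is long, a per-action count can be easier to read. A minimal sketch, assuming every log entry is a single line that contains both the log level and the action name (`count_errors` is just a name made up for illustration):

```shell
# Count ERROR-level log lines per action (GET_AUDIT, GET_REPAIR).
count_errors() {
    log_file="$1"
    for action in GET_AUDIT GET_REPAIR; do
        # grep -c prints the number of matching lines; "|| true" keeps the
        # script going when the count is 0 (grep exits non-zero on no match).
        count=$(grep -F "$action" "$log_file" | grep -c ERROR || true)
        printf '%s %s\n' "$action" "$count"
    done
}

# Usage: count_errors /mnt/storj/storagenode1/storagenode.log
```

It prints one line per action. Zero for both would mean that this log file holds no failed audits or repairs, so the cause would have to be looked for in older or deleted logs.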

karacurt · May 3, 2022, 9:05am

I’m using Win10 with Docker on the WSL2-based engine.
Docker Desktop 4.6.0 (75818); Docker Engine v20.10.13.
Nothing other than WSL2 has been rolled out.
I’m looking at the logs inside the Docker Storagenode"N" container
(where “N” is the number of the specific Storagenode).

Alexey · May 4, 2022, 7:55am

4 posts were split to a new topic: Can I forward the logs in some specific location not on a Node disk?

karacurt · May 4, 2022, 8:43am

Is this correct:

docker run -d --restart unless-stopped --stop-timeout 300 -p 28969:28967/tcp -p 28969:28967/udp -p 127.0.0.1:14004:14002 -e WALLET="XXX" -e EMAIL="YYY@YYY.YYY" -e ADDRESS="ZZZ:28969" -e STORAGE="11TB" --mount type=bind,source="M:\identity\storagenode5\",destination=/app/identity --mount type=bind,source="M:\data\",destination=/app/config --mount type=bind,source=D:/Storj/Logs/node5.log,destination=/app/logs/node5.log --name storagenode5 storjlabs/storagenode:latest --log.output=/app/logs/node5.log

Alexey · May 10, 2022, 7:08am

A post was split to a new topic: ERROR piecestore download failed *.sj1: no such file or directory