Storage Node hangs and does not respond

Hello,
My v3 storage node is built on a Raspberry Pi. Yesterday the Pi lost power, and since power was restored the system hangs and does not do anything. I do not know which process is hanging. I cannot even run the top command. What should I do at this point? Should I rebuild the node? If I need to rebuild, where can I find instructions?

Thank you.

Don’t do anything until you get a helpful reply here. You can also email support at support@storj.io to get an official response in a timely fashion. Good luck!

Are you able to log in to your pi3?

If not, then your SD card has likely become corrupted. In that case, power off the pi3, eject the card, and try to back up your identity from it on a PC.
If you already have a backup of your identity, just flash the OS to the card and boot again.
Do not format or repartition your HDD! Just mount it via /etc/fstab.
Then you can install docker and run the storagenode as usual.
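
For the /etc/fstab step, here is a minimal sketch of what the entry could look like. The helper name fstab_line, the ext4 filesystem type, and the /mnt/storagenode mount point are assumptions for illustration and are not taken from this thread; adjust them to match your drive. It only prints the line, so nothing on the disk is touched.

```shell
# Hypothetical helper: print an /etc/fstab line for an existing data drive.
# Assumptions (not from this thread): ext4 filesystem, default mount point
# /mnt/storagenode. "nofail" lets the Pi boot even if the drive is absent.
fstab_line() {
  uuid="$1"
  mount_point="${2:-/mnt/storagenode}"
  printf 'UUID=%s %s ext4 defaults,nofail 0 2\n' "$uuid" "$mount_point"
}

# Usage (run manually; replace <uuid> with the value blkid reports):
#   sudo blkid
#   fstab_line <uuid> | sudo tee -a /etc/fstab
#   sudo mount -a
```

Mounting by UUID rather than by a device name such as /dev/sda1 keeps the entry valid if device names change between boots.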

Thank you Alexey for your reply.

I can log in to the pi, but some process is taking 100% CPU and I cannot do anything after logging in. I backed up my identity when I created the node. I will flash the OS and go through the setup again. Thank you.

I think I know the reason -

The solution is

docker ps | grep watchtower | awk '{hs = hs" "$1} END{ print "docker rm --force", hs}' | bash

I did not see this message before flashing the drive. I am setting up fresh now. I will copy over the identities. Please let me know where I can find the v3 network setup details. I will also set up watchtower after that.

I found the link for setup. Working on it. I am facing a disk mounting issue. I will explore the issue.

Okay Alexey. I rebuilt the whole thing. How do I check that my node is healthy? You gave me a command to check the dashboard, but I do not remember it. Could you share it with me again? Is there any way to check the status of my node, including ranking?

Thank you.

This One?

docker exec -it storagenode /app/dashboard.sh

Here you can check your log

docker logs --tail 50 storagenode

If you added -p 127.0.0.1:14002:14002 you can browse to that and find some more info per satellite.

So far there is no way to see ranking.

The correct SNOboard address is 127.0.0.1:14002 and not 127.0.0.1:14002:14002 🙂


So true. I should have been clearer 🙂 I was referring to the option in the run command; the dashboard is then browsable at 127.0.0.1:14002.

Thank you all for the replies. The dashboard output says that the node is online. I see some ingress and egress bandwidth. However, I see a lot of audit-related errors in the docker log.

"Action": "GET_AUDIT", "error": "rpc error: code = NotFound desc = file does not exist"}
2019-10-06T16:10:11.567Z ERROR server gRPC stream error response {"error": "rpc error: code = NotFound desc = file does not exist"}

I am not sure if these are indicating any issues with my node. Please let me know.

They are. Either you are missing data or the node for some reason can’t read it. There were some issues with the satellite auditing pieces that were deleted or never delivered, but that should be sporadic. If you’re seeing a lot of those errors there definitely is an issue with your node.
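
To get a feel for whether the errors are sporadic or frequent, you could count them. This is a rough sketch only: count_audit_failures is a hypothetical helper name, and it simply matches the two strings shown in the log excerpt above (GET_AUDIT and "file does not exist"), so it assumes both appear on the same log line.

```shell
# Hypothetical helper: count log lines on stdin that mention GET_AUDIT
# together with "file does not exist".
count_audit_failures() {
  # grep -c prints 0 but exits non-zero when nothing matches;
  # "|| true" keeps the function from reporting failure in that case.
  grep 'GET_AUDIT' | grep -c 'file does not exist' || true
}

# Usage (docker may write log lines to stderr, hence 2>&1):
#   docker logs storagenode 2>&1 | count_audit_failures
```

A count that keeps growing between runs suggests missing or unreadable data rather than an occasional satellite-side hiccup.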
