Storagenode crashed, no traffic anymore

kbch · November 25, 2019, 8:07am

My storagenode crashed ~24h ago. There shouldn’t be any data loss because it was the container host, not the storage host, that crashed. It has been running again for ~24h, but I still only see piecestore upload started in the log and no traffic in the node dashboard. Is this normal? Does it just need more time to recover? Or is my node now disqualified?

https://pastebin.com/em4012rq

Odmin · November 25, 2019, 8:28am

Hi @kbch
Welcome to our forum!

The root cause of your problem:

Please try to restart your storagenode with docker restart storagenode and look into the log after restart.

kbch · November 25, 2019, 8:33am

Thanks for the fast answer. I restarted it via docker just now, but the errors came back. Is there anything I can do about them?

https://pastebin.com/TJ3qc2uX

Odmin · November 25, 2019, 8:36am

Let’s wait about 30-50 minutes.

You can also look into this post.
I think your disk subsystem is under heavy load at the moment.

kbch · November 25, 2019, 3:11pm

Ok, looks like it now runs without errors. How long does it normally take in such a state? It’s just uploading without downloading anything.

https://pastebin.com/wqbC4YS5

Alexey · November 26, 2019, 9:21pm

Hello @kbch,
Welcome to the forum!

How is your HDD connected to the host?

kbch · November 26, 2019, 10:49pm

Hi @Alexey

It was NFS but now it is SATA. I don’t see any errors anymore, but I also don’t see any traffic. I still just get tons of these piecestore upload messages. Not sure if this is normal.

Alexey · November 26, 2019, 10:58pm

This is why I asked. Such behavior is typical for network-connected drives, especially NFS. The file locking method differs between connection types and filesystems.
What filesystem is on this disk?

kbch · November 26, 2019, 10:59pm

It’s btrfs. I think I will switch to ext4 over iSCSI as soon as it is running properly again.

Does this endless piecestore upload indicate a major unrecoverable fault in my node/database? Or is it more like the network has no traffic for me currently?

Alexey · November 26, 2019, 11:06pm

This indicates that it can’t finish the upload for some reason.
Usually it’s related to trouble with locking the sqlite database.
btrfs could be a reason. It works fine on Synology, but I’m not sure all the bugs are fixed in the mainstream branch of btrfs.

Please check the permissions on the storage.
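
To check the locking point above, a rough probe like the following could help. It is only a sketch, not something the storagenode itself provides: the function name can_lock_sqlite and the probe file name lock_probe.db are hypothetical. It creates a throwaway SQLite database in the given directory, takes an exclusive lock, writes one row, and cleans up. Passing only shows that basic locking and write permissions work from one process. It does not prove the filesystem is safe for the node's databases.

```python
import os
import sqlite3
import tempfile


def can_lock_sqlite(directory: str, timeout_s: float = 5.0) -> bool:
    """Return True if a throwaway SQLite database in `directory` can be
    locked exclusively and written to within `timeout_s` seconds."""
    # Hypothetical probe file; it is not one of the storagenode's own databases.
    path = os.path.join(directory, "lock_probe.db")
    try:
        # isolation_level=None lets us issue BEGIN/COMMIT ourselves.
        conn = sqlite3.connect(path, timeout=timeout_s, isolation_level=None)
        try:
            conn.execute("BEGIN EXCLUSIVE")  # fails if the lock cannot be taken
            conn.execute("CREATE TABLE IF NOT EXISTS probe (id INTEGER)")
            conn.execute("INSERT INTO probe (id) VALUES (1)")
            conn.execute("COMMIT")
        finally:
            conn.close()
        return True
    except sqlite3.Error:
        # Covers "database is locked", "unable to open database file", etc.
        return False
    finally:
        # Remove the probe file if it was created.
        if os.path.exists(path):
            os.remove(path)


if __name__ == "__main__":
    # Demo against a temporary directory; point it at the storage path instead.
    with tempfile.TemporaryDirectory() as tmp:
        print(can_lock_sqlite(tmp))
```

Run it as the same user the container uses, against the directory that holds the node's data, so that permission problems show up as well.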

kbch · November 27, 2019, 10:13am

FS Permissions look ok. Is there a way to enable a sort of debugging mode or something like that to see why it can’t finish upload?

Odmin · November 27, 2019, 10:45am

I think you have two bad factors: NFS (which has issues with databases) and btrfs (which also has issues on Linux, though it works fine on Synology).
I strongly recommend avoiding this configuration. If possible, use only local storage that is directly connected to the host. Only if a local storage connection is not possible should you use iSCSI, a block-level protocol that does not have the issues of file-level protocols such as NFS and SMB.
In any case, I strongly recommend replacing btrfs with ext4 if you are on Linux.

nerdatwork · November 27, 2019, 11:34am

Change log.level: info to log.level: debug in your config.yaml file, then restart your node with docker restart -t 300 storagenode
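
If you would rather script the config change than edit the file by hand, here is a minimal sketch. The helper name set_log_level is hypothetical, and it assumes log.level sits on its own uncommented line in flat key: value form, as written above. It works on the file's text only, so reading and writing config.yaml is left to you.

```python
import re


def set_log_level(config_text: str, level: str = "debug") -> str:
    """Return `config_text` with the `log.level` key set to `level`."""
    # Match an uncommented `log.level:` line; [ \t]* avoids spilling across lines.
    pattern = re.compile(r"^([ \t]*log\.level:[ \t]*).*$", flags=re.MULTILINE)
    new_text, count = pattern.subn(lambda m: m.group(1) + level, config_text)
    if count == 0:
        # Assumption: if the key is absent (or only commented out), append it.
        sep = "" if (not new_text or new_text.endswith("\n")) else "\n"
        new_text = f"{new_text}{sep}log.level: {level}\n"
    return new_text


if __name__ == "__main__":
    sample = "log.level: info\n"
    print(set_log_level(sample), end="")
```

The node still needs the restart mentioned above before the new level takes effect.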

kbch · November 28, 2019, 8:50am

Ok, I was able to fix it. I moved the storagenode container from my NAS to a computer and put the data onto an ext4 iSCSI drive. As soon as I made this change it started to generate traffic again. Sorry for the trouble.

Odmin · November 28, 2019, 9:35am

You are welcome!
People on this forum are always glad to help you.