Storagenode crashed, no traffic anymore

My storagenode crashed ~24h ago. There shouldn’t be any data loss, because it was the container host that crashed, not the storage host. It has been running again for ~24h, but I still only see piecestore upload started in the log and no traffic in the node dashboard. Is this normal? Does it just need more time to recover? Or is my node now disqualified?

https://pastebin.com/em4012rq

Hi @kbch
Welcome to our forum!

The root cause of your problem:

Please try to restart your storagenode with docker restart storagenode and look into the log after restart.

Thanks for the fast answer. I restarted it now via docker, but the errors came back again. Is there something I can do about them?

https://pastebin.com/TJ3qc2uX

Let’s wait about 30-50 minutes.

You can also look into this post
I think your disk system is under heavy load at the moment.

Ok, looks like it now runs without errors. How long does it normally take in such a state? It’s just uploading without downloading anything.

https://pastebin.com/wqbC4YS5

Hello @kbch,
Welcome to the forum!

How is your HDD connected to the host?

Hi @Alexey

It was NFS, but now it is SATA. Currently I don’t see any errors anymore, but I also don’t see any traffic. I still just get tons of these piecestore upload started messages. Not sure if this is normal.

This is why I asked: such behavior is typical for network-connected drives, especially NFS. File locking works differently depending on the connection type and the filesystem.
What filesystem is on this disk?

It’s btrfs. I think I will switch to ext4 over iSCSI as soon as it is running properly again.

Does this endless piecestore upload indicate a major unrecoverable fault in my node/database? Or is it more like the network has no traffic for me currently?

This indicates that it can’t finish the upload for some reason.
Usually it’s related to trouble locking the SQLite database.
btrfs could be the reason. It works fine on Synology, but I’m not sure all bugs are fixed in the mainline branch of btrfs.
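A rough way to check whether uploads ever finish is to compare how many log lines report a started upload with how many report a finished one. This is only a sketch: count_lines is a hypothetical helper, node.log is an assumed file name, and the "uploaded" message for a finished upload is an assumption about the log wording that you should verify against your own log.

```shell
# Hypothetical helper: count the lines in a log file that contain a fixed string.
#   $1 = path to the log file, $2 = text to search for
count_lines() {
    # -c prints the number of matching lines, -F treats the pattern as plain text.
    # grep exits non-zero when nothing matches, so "|| true" keeps the count usable.
    grep -cF -- "$2" "$1" || true
}

# Example usage (not executed here; node.log is an assumed file name):
#   docker logs storagenode > node.log 2>&1
#   echo "started:  $(count_lines node.log 'upload started')"
#   echo "finished: $(count_lines node.log 'uploaded')"   # "uploaded" is an assumption
```

If the started count keeps growing while the finished count stays at zero, that matches the symptom described in this thread.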

Please check the permissions on the storage.

FS Permissions look ok. Is there a way to enable a sort of debugging mode or something like that to see why it can’t finish upload?

I think you have two bad factors: NFS (has issues with databases) + btrfs (also has issues on Linux; works fine on Synology).
I strongly recommend avoiding this configuration. If possible, use only local storage that is directly connected to the host. Only if a local storage connection is not possible should you use iSCSI, a block-level protocol that does not have the issues of file-level protocols like NFS and SMB.
Also, in any case, I strongly recommend replacing btrfs with ext4 if you are on Linux.

Change log.level: info to log.level: debug in your config.yaml file, then restart your node: docker restart -t 300 storagenode
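The config change can also be scripted. The following is a minimal sketch, not an official procedure: set_log_level is a hypothetical helper, the config path is a placeholder, and it assumes the log.level line sits uncommented at the start of a line in config.yaml.

```shell
# Hypothetical helper: set the log level in a storagenode config.yaml.
#   $1 = path to config.yaml (assumed location, adjust to your setup)
#   $2 = desired level, e.g. debug or info
set_log_level() {
    config="$1"
    level="$2"
    # Rewrite the whole "log.level: ..." line; sed keeps a backup as config.yaml.bak.
    sed -i.bak "s/^log\.level:.*/log.level: ${level}/" "$config"
}

# Example usage (not executed here; the path is a placeholder):
#   set_log_level /path/to/config.yaml debug
#   docker restart -t 300 storagenode
#   docker logs --tail 50 storagenode
```

Once the problem is found, the same helper with info switches the level back so the log does not grow unnecessarily.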

Ok, I was able to fix it. I moved the storagenode container from my NAS to a computer and put the data onto an ext4 iSCSI drive. As soon as I made this change it started to generate traffic again. Sorry for the trouble :frowning:

You are welcome!
People on this forum are always glad to help you :+1: