I do notice that the VM console indicates a number of OOM events, but this VM had 8GB of RAM. Don’t people run STORJ nodes on Raspberry Pis? How is it possible that 8GB wasn’t enough RAM?
It is/was a Debian 10 VM running STORJ in Docker.
Additionally, if my node is disqualified why is traffic still flowing? I’ve been at 2Mbps inbound almost all day, and it hasn’t let up.
I found that bad blocks on a disk can lock up memory in Linux. Did you check the output of dmesg for media errors, and use smartctl to check for pending sectors, etc.?
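For reference, here is one way to run those checks. This is only a sketch: the device name /dev/sda is an example, and the grep patterns are common kernel/SMART wordings rather than an exhaustive list.

```shell
# Scan the kernel ring buffer for disk I/O or media errors
dmesg | grep -iE "i/o error|media error|blk_update_request"

# SMART attributes: look for pending, reallocated, or uncorrectable sectors
# (requires smartmontools; /dev/sda is an example device)
smartctl -A /dev/sda | grep -iE "Current_Pending_Sector|Reallocated_Sector_Ct|Offline_Uncorrectable"
```

Non-zero raw values on Current_Pending_Sector or Reallocated_Sector_Ct are a strong hint that the disk itself is failing.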
That can happen if your storage has high latency or is damaged. All network-attached storage has higher latency than a local disk. NFS and SMB are not compatible with storagenode; the only working network protocol is iSCSI, but even that has higher latency and can drop connections or even lose files without proper infrastructure.
How is your disk connected?
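One quick way to answer that yourself is to ask the kernel what filesystem backs the storage directory. The mount point /mnt/storagenode below is just an example path; substitute your own.

```shell
# Show the source device and filesystem type for the directory
# backing the node's data (example path; use your own)
findmnt -T /mnt/storagenode -o TARGET,SOURCE,FSTYPE
# An FSTYPE of nfs/nfs4 or cifs means network-attached storage
```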
Please, check failed audits:
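On a Docker node you can do this by grepping the container logs for failed GET_AUDIT operations. The container name storagenode is an assumption; use whatever name you gave yours.

```shell
# List failed audit requests from the node's logs
# (container name "storagenode" is an assumption)
docker logs storagenode 2>&1 | grep GET_AUDIT | grep failed
```

If this prints recurring "file does not exist" or timeout errors, the node is losing audits, which is what leads to disqualification.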
This may help explain what might be the problem here:
It’s possible that running a storage node in a VM creates a caching problem due to the very high I/O requirements. The VM node then drops data pieces and becomes disqualified (DQ-ed).
The network storage is healthy, but yes, since this is a VM running on my hypervisor, I use NFS as the storage backend for large file storage. This is accomplished over an uncongested local 10Gbps network.
I’ve never had someone tell me that NFS is “high latency” before, but I guess it makes sense. I have never had a problem with an NFS backend, even for data-intensive uses like media transcoding/streaming and torrenting, but I can concede that it could be possible.
I’m kind of shocked that something so latency-sensitive works well on a Raspberry Pi with a USB hard drive. I don’t see how that could possibly be better performing than my NFS shares.
I can try moving it to iSCSI. I do have that and use it for some local storage purposes, but mostly on Windows systems, because NFS permissions on Windows are obnoxious and iSCSI is easier in that environment. It never occurred to me that iSCSI performed that significantly better.
If I’m able to get the audit situation under control, could I potentially be reinstated in the future on the satellites that disqualified me, since I’m still active on some satellites? Or should I consider gracefully exiting this node and starting another one?
The storagenode is incompatible with NFS/SMB, as I said earlier. It’s not related to latency, but to the Linux implementations of SMB and especially NFS. And it’s a proven path to disqualification and other problems: Topics tagged nfs
Please do not use any network-attached storage, especially NFS or SMB, and especially on Linux (for example, its SMB implementation is not fully compatible).
SMB could work in some circumstances (Windows server to Windows client, or a local connection via CIFS/SMB), but remotely connected storage should be avoided as well.
No. The disqualification is permanent. But you can start a new node on another disk (please do not use network-connected storage!).
Also, you can vote for the idea