Node Disqualified - Explanation?

My relatively new STORJ node was disqualified today, and of course no reason was given (hopefully this is something that can be improved in the future?).

I found that my audit success percentage is below 60%, which is probably why I was disqualified, but what could be the reason for this?

I do notice that the VM console shows a number of OOM events, but this VM had 8GB of RAM. Don’t people run STORJ nodes on Raspberry Pis? How is it possible that 8GB wasn’t enough RAM?

It is/was a Debian 10 VM running STORJ in Docker.

Additionally, if my node is disqualified why is traffic still flowing? I’ve been at 2Mbps inbound almost all day, and it hasn’t let up.

Hi @kirk137 - welcome to the Storj community.

It would be helpful to post your storagenode logs so we can see what they indicate.

https://documentation.storj.io/resources/faq/check-logs

I found that disk bad blocks can lock up memory in Linux. Did you check the output of dmesg for media errors, and use smartctl to check for pending sectors, etc.?
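In case it helps, these are the sort of checks I mean (a sketch; `/dev/sdX` is a placeholder for your actual disk device, and both commands typically need root):

```shell
# Kernel log: look for disk media / I/O errors (may need root to read).
dmesg 2>/dev/null | grep -iE 'media error|i/o error|blk_update_request' || true

# SMART health: pending/reallocated sectors and overall status.
# Replace /dev/sdX with the real device (e.g. /dev/sda).
command -v smartctl >/dev/null && smartctl -a /dev/sdX | grep -iE 'pending|reallocated|overall-health' || true
```

Any non-zero pending or reallocated sector count is worth investigating further.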

I would guess that you were not disqualified on all satellites.

This can happen if your storage has high latency or is damaged. All network-attached storage has high latency. NFS and SMB are not compatible with the storagenode; the only network protocol that works is iSCSI, but even it has high latency and can drop connections or lose files without proper infrastructure.

How is your disk connected?
Please check for failed audits:

docker logs storagenode 2>&1 | grep GET_AUDIT | grep failed

This may help explain the problem here:

https://www.linuxquestions.org/questions/linux-general-1/command-to-check-process-taking-high-cached-memory-927756/
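As a quick first look before digging into that thread, you can see how much RAM is sitting in the page cache versus actually free, and which processes hold the most resident memory (a sketch for Linux; run it inside the VM):

```shell
# Memory split: total, free, and page-cache usage (values in kB).
grep -E '^(MemTotal|MemFree|Buffers|Cached)' /proc/meminfo

# Top 10 processes by resident set size (RSS, in kB).
ps -eo pid,rss,comm --sort=-rss | head -n 10
```

A large "Cached" value is normal; the kernel reclaims it under pressure, so OOM kills usually point at real anonymous-memory growth instead.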


It’s possible that running a storage node in a VM creates a caching problem due to the very high I/O requirements. The VM node then drops data pieces and becomes DQ-ed.

Ahh yes, I am in fact still active on other satellites. I’m in US Central and thought that was “my” satellite, but I still see others active.

Here are some logs:

2020-06-14T09:38:00.580Z ERROR piecestore download failed {“Piece ID”: “WKCJ6A45RU2SPX7HXKCFOFRUYGUDHQ546AAWZT3OS6EVAGLHFANQ”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “GET_AUDIT”, “error”: “usedserialsdb error: context canceled”, “errorVerbose”: “usedserialsdb error: context canceled\n\tstorj.io/storj/storagenode/storagenodedb.(*usedSerialsDB).Add:35\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).verifyOrderLimit:76\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:459\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:1004\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:107\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:56\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51”}
2020-06-14T09:43:46.723Z ERROR piecestore download failed {“Piece ID”: “WJE3KE2FJ4IMWO2ZPRJBL6IKFVR6QBC3BGH2LNDYI4YPI6HA3GGQ”, “Satellite ID”: “118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW”, “Action”: “GET_AUDIT”, “error”: “usedserialsdb error: context canceled”, “errorVerbose”: “usedserialsdb error: context canceled\n\tstorj.io/storj/storagenode/storagenodedb.(*usedSerialsDB).Add:35\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).verifyOrderLimit:76\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:459\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:1004\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:107\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:56\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51”}
2020-06-14T10:44:20.773Z ERROR piecestore download failed {“Piece ID”: “XESD2VLJ2T26JJT4DZKZMYFR4T45ROWL3366LTQG7BB5E56UCL5Q”, “Satellite ID”: “121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6”, “Action”: “GET_AUDIT”, “error”: “usedserialsdb error: context canceled”, “errorVerbose”: “usedserialsdb error: context canceled\n\tstorj.io/storj/storagenode/storagenodedb.(*usedSerialsDB).Add:35\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).verifyOrderLimit:76\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:459\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:1004\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:107\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:56\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51”}
2020-06-14T14:54:41.501Z ERROR piecestore download failed {“Piece ID”: “XESD2VLJ2T26JJT4DZKZMYFR4T45ROWL3366LTQG7BB5E56UCL5Q”, “Satellite ID”: “121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6”, “Action”: “GET_AUDIT”, “error”: “usedserialsdb error: context canceled”, “errorVerbose”: “usedserialsdb error: context canceled\n\tstorj.io/storj/storagenode/storagenodedb.(*usedSerialsDB).Add:35\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).verifyOrderLimit:76\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:459\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:1004\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:107\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:56\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51”}
2020-06-14T14:55:10.088Z ERROR piecestore download failed {“Piece ID”: “BLPKVTIOF7LSGWTUSSN2DOVYMWGXYCXLSQMPYBXRYFMBCK2JJTGA”, “Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “Action”: “GET_AUDIT”, “error”: “usedserialsdb error: context canceled”, “errorVerbose”: “usedserialsdb error: context canceled\n\tstorj.io/storj/storagenode/storagenodedb.(*usedSerialsDB).Add:35\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).verifyOrderLimit:76\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:459\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:1004\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:107\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:56\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51”}
2020-06-14T16:28:12.522Z ERROR piecestore download failed {“Piece ID”: “XESD2VLJ2T26JJT4DZKZMYFR4T45ROWL3366LTQG7BB5E56UCL5Q”, “Satellite ID”: “121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6”, “Action”: “GET_AUDIT”, “error”: “usedserialsdb error: context canceled”, “errorVerbose”: “usedserialsdb error: context canceled\n\tstorj.io/storj/storagenode/storagenodedb.(*usedSerialsDB).Add:35\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).verifyOrderLimit:76\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:459\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:1004\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:107\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:56\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51”}
2020-06-14T16:57:37.565Z ERROR piecestore download failed {“Piece ID”: “IIW6M2YYYLCNYS3D6BNYPVCFPS2N72DQFDLTJP3DKOOAQSKJLYZA”, “Satellite ID”: “118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW”, “Action”: “GET_AUDIT”, “error”: “usedserialsdb error: context canceled”, “errorVerbose”: “usedserialsdb error: context canceled\n\tstorj.io/storj/storagenode/storagenodedb.(*usedSerialsDB).Add:35\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).verifyOrderLimit:76\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:459\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:1004\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:107\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:56\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51”}

The network storage is healthy, but yes, since this is a VM running on my hypervisor, I use NFS as the storage backend for the large file storage. This runs over an uncongested local 10Gbps network.
I’ve never had anyone tell me before that NFS is “high latency,” but I guess it makes sense. I’ve never had a problem with an NFS backend even for data-intensive uses like media transcoding/streaming and torrenting, but I can concede that it could be possible.

I’m kind of shocked that something so latency-sensitive works well on a Raspberry Pi with a USB hard drive. I don’t see how that could possibly perform better than my NFS shares.

I can try moving it to iSCSI. I do have that and use it for some local storage, mostly on Windows systems, because NFS permissions on Windows are obnoxious and iSCSI is easier in that environment. It never occurred to me that iSCSI performed that significantly better.

If I’m able to get the audit situation under control, could I potentially be requalified in the future on the satellites that bumped me, since I’m still active on some satellites? Or should I consider gracefully exiting this node and starting another one?

This error can only suspend your node, not disqualify it.
So please try this one:

docker logs storagenode 2>&1 | grep GET_AUDIT | grep failed | grep -v usedserialsdb
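If failures remain after filtering out the usedserialsdb lines, counting them per satellite shows which satellites are affected. A minimal sketch of the same idea, run here against abridged sample lines (the satellite IDs are taken from your logs; the error text is a placeholder). In practice, pipe `docker logs storagenode 2>&1` into the same filter:

```shell
# Count failed GET_AUDIT lines per satellite ID.
# The heredoc holds abridged sample log lines for illustration only;
# in practice feed `docker logs storagenode 2>&1` into this pipeline.
grep GET_AUDIT <<'EOF' | grep failed | grep -oE '"Satellite ID": "[^"]+"' | sort | uniq -c | sort -rn
2020-06-14T09:38:00Z ERROR piecestore download failed {"Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET_AUDIT", "error": "(example error)"}
2020-06-14T09:43:46Z ERROR piecestore download failed {"Satellite ID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "Action": "GET_AUDIT", "error": "(example error)"}
2020-06-14T10:44:20Z ERROR piecestore download failed {"Satellite ID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "Action": "GET_AUDIT", "error": "(example error)"}
EOF
```

With the sample lines above, this prints two failures for the 118UW… satellite and one for the 12Eay… satellite.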

The storagenode is incompatible with NFS/SMB, as I said earlier. It’s not related to latency, but to the Linux implementation of SMB and especially NFS. This has proven to be a path to disqualification and other problems: https://forum.storj.io/tag/nfs

The iSCSI performs MUCH better: https://forum.storj.io/search?q=iSCSI
However, it has issues as well: https://forum.storj.io/tag/iscsi

Please do not use any network-attached storage, especially NFS or SMB, and especially on Linux (for example, the Linux SMB implementation is not fully compatible).
SMB can work in some circumstances (Windows server to Windows client, or a local connection via CIFS/SMB), but remotely connected storage should be avoided as well.

No. The disqualification is permanent. But you can start a new node on another disk (please do not use network-connected storage!).
Also, you can vote for the idea