I googled a bit and found this script (GitHub - ReneSmeekes/storj_success_rate), and here is the result.
It seems my node failed 1 audit. After investigating my network and my Storj Docker container, I found that it was my fault for setting an auto-restart timer on my router.
So my question is: will my node's audit score recover, or will it just stay at 99.14% on the eu1 satellite?
Many Thanks
The audit score can be affected only when your node is online and responds to audit requests, but either does not provide the piece required for the audit or provides a corrupted one.
Since the script detected an audit failure, you can check what kind of error it was.
If the error is “file not found”, then your node has lost that piece, and this is unrecoverable. If it was a temporary issue (like a disconnected disk or wrong file permissions), then it may pass the audit of this piece next time.
The audit score can recover over time if you do not have any other corrupted or missing pieces; otherwise your node will be disqualified if it loses 5% of its data.
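To check what kind of error the failed audits produced, the node log can be filtered for failed GET_AUDIT downloads and grouped by error message. The following is only a minimal sketch, assuming the log lines look like the ones posted in this thread (timestamp, level, logger name, message, then a JSON payload); the function names `audit_failures` and `summarize` are made up for illustration and are not part of Storj or of the success rate script.

```python
import json
import re
from collections import Counter

# Assumed layout: timestamp, level, logger, message, JSON payload,
# separated by spaces or tabs (as in the log lines in this thread).
LINE_RE = re.compile(r'^(\S+)\s+(\w+)\s+(\S+)\s+(.*?)\s+(\{.*\})\s*$')


def audit_failures(lines):
    """Yield (timestamp, piece_id, error) for each failed GET_AUDIT download."""
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # not a structured log line, skip it
        timestamp, level, _logger, message, payload = match.groups()
        if level != "ERROR" or message != "download failed":
            continue
        try:
            fields = json.loads(payload)
        except json.JSONDecodeError:
            continue  # truncated or malformed payload
        if fields.get("Action") == "GET_AUDIT":
            yield timestamp, fields.get("Piece ID"), fields.get("error", "unknown")


def summarize(lines):
    """Count failed audits per error message."""
    return Counter(error for _ts, _piece, error in audit_failures(lines))
```

For example, if the log was saved to a file (the name `node.log` is hypothetical), `summarize(open("node.log"))` returns a counter such as how many audits failed with "file does not exist" versus any other error, which tells you whether pieces are really missing or the failures were temporary.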
An update on my situation: yesterday I received 2 more failed audits. Here are some of my logs:
2022-10-21T12:25:29.029+0700 ERROR piecestore download failed {"Process": "storagenode", "Piece ID": "3Q7P5Y6BQHR4C3ZCG327CCI7BUF2T4C24RNZUTAXTMCZQHWYGBSA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET_AUDIT", "error": "file does not exist", "errorVerbose": "file does not exist\n\tstorj.io/common/rpc/rpcstatus.Wrap:73\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:554\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:228\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:122\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:112\n\tstorj.io/drpc/drpcctx.(*Tracker).track:52"}
2022-10-22T13:06:01.859+0700 ERROR piecestore download failed {"Process": "storagenode", "Piece ID": "CXZPP3HJ7NH3UIH2UMT37CZBQPH5CWS75HJAXVHXFBFY76JIU4DA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET_AUDIT", "error": "file does not exist", "errorVerbose": "file does not exist\n\tstorj.io/common/rpc/rpcstatus.Wrap:73\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:554\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:228\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:122\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:112\n\tstorj.io/drpc/drpcctx.(*Tracker).track:52"}
2022-10-22T13:34:35.440+0700 ERROR piecestore download failed {"Process": "storagenode", "Piece ID": "74MWNAVUNGIM7P5MRYA5DBM37KVIX6AW46ZCEVPXURRJ32L627TA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET_AUDIT", "error": "file does not exist", "errorVerbose": "file does not exist\n\tstorj.io/common/rpc/rpcstatus.Wrap:73\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:554\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:228\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:122\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:112\n\tstorj.io/drpc/drpcctx.(*Tracker).track:52"}
I’m running the Storj node container on a TrueNAS SCALE system. I allocated 4.5 TB for Storj, and my system has 32 GB of RAM. The HDDs I’m using are Exos 7E8; they use conventional magnetic recording (CMR) technology, so in theory they should have optimal performance.
I checked the status of my zpool, and here is the result:
root@truenas[~]# zpool status -v
pool: Databoiz
state: ONLINE
scan: resilvered 9.19M in 00:00:03 with 0 errors on Sat Sep 24 03:10:22 2022
Databoiz ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
03b1e2f3-96ec-4a7a-ad93-0a7b828e30ae ONLINE 0 0 0
3cbb10a6-fcc2-41d5-a2f6-272c2752ce13 ONLINE 0 0 0
35ecdbce-1a3e-4716-837d-9949017a5429 ONLINE 0 0 0
279a64ae-f165-4ecd-8ced-cea9cb35c155 ONLINE 0 0 0
efb9e5f6-ff0b-4d73-942f-36fa5fa21d5b ONLINE 0 0 0
errors: No known data errors
I’m currently running a scrub task to check my pool's integrity; I will update later.
While googling, I also found these 2 commands that are supposed to cut down on overhead :v
zfs set xattr=off poolname
zfs set atime=off poolname
@Alexey, can you suggest what I should do next?
2022-10-21T12:25:29.024+0700 INFO piecestore download started {"Process": "storagenode", "Piece ID": "3Q7P5Y6BQHR4C3ZCG327CCI7BUF2T4C24RNZUTAXTMCZQHWYGBSA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET_AUDIT"}
2022-10-21T12:25:29.029+0700 ERROR piecestore download failed {"Process": "storagenode", "Piece ID": "3Q7P5Y6BQHR4C3ZCG327CCI7BUF2T4C24RNZUTAXTMCZQHWYGBSA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET_AUDIT", "error": "file does not exist", "errorVerbose": "file does not exist\n\tstorj.io/common/rpc/rpcstatus.Wrap:73\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:554\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:228\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:122\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:112\n\tstorj.io/drpc/drpcctx.(*Tracker).track:52"}
It seems like my node started downloading and auditing that piece at the same time :v
Also, today the disqualification % of my node is dropping; yesterday it was 2.02%.
One thing I’ve noticed in the past is that during periods of ISP connection instability, where the connectivity is going up and down (i.e. only about 50% of pings succeed in a ping test), it’s possible for a node to receive the audit request during an “up” period but then try to send the data during a “down” period, which can result in a failed audit. So generally, if I notice that my ISP connection has gotten into a state like that, I’ll shut down my nodes until the connection is fully re-established to avoid impacting my audit score.
Not sure what kind of networking gear you have, but if there’s a way to detect ISP connection instability, then that could be something to look at for investigating your current situation, but at the same time something to consider in the future.
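One rough way to detect this kind of instability is to probe an external host repeatedly and look at the success rate. The sketch below is only an illustration and swaps the ICMP ping test for TCP connection attempts, since those need no special privileges; the target host and port, the 0.9 threshold, and all function names are assumptions, not something Storj provides.

```python
import socket
import time


def probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def success_rate(results):
    """Fraction of successful probes; 0.0 when there are no results."""
    results = list(results)
    return sum(results) / len(results) if results else 0.0


def is_unstable(results, min_rate=0.9):
    """Treat the connection as unstable when too few probes succeed."""
    return success_rate(results) < min_rate


def sample_connection(host, port, count=10, interval=1.0, check=probe):
    """Run `count` probes, `interval` seconds apart, and return the results."""
    results = []
    for i in range(count):
        results.append(check(host, port))
        if i < count - 1:
            time.sleep(interval)
    return results
```

For example, `is_unstable(sample_connection("1.1.1.1", 443))` (any reliable public host would do) returns True when fewer than 90% of ten probes succeed. What to do with that signal, such as stopping the node until the connection settles, is left to the operator.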