Audit errors (1.65.1)

Hello.
There were audit errors for remote files on my node.
Node version: 1.65.1
The node has been running for more than 2 years.
What could be the problem?

Log excerpts below:

# docker logs storagenode 2>&1 | grep IILX2LSM4K3LANJQBEN4OB273LHRW6B66T44EAMO5KF24QBDBXFA
2022-10-29T13:40:17.926Z        INFO    piecedeleter    delete piece sent to trash      {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "IILX2LSM4K3LANJQBEN4OB273LHRW6B66T44EAMO5KF24QBDBXFA"}
2022-10-29T13:42:37.008Z        INFO    piecestore      download started        {"Process": "storagenode", "Piece ID": "IILX2LSM4K3LANJQBEN4OB273LHRW6B66T44EAMO5KF24QBDBXFA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET_AUDIT"}
2022-10-29T13:42:37.008Z        ERROR   piecestore      download failed {"Process": "storagenode", "Piece ID": "IILX2LSM4K3LANJQBEN4OB273LHRW6B66T44EAMO5KF24QBDBXFA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET_AUDIT", "error": "file does not exist", "errorVerbose": "file does not exist\n\tstorj.io/common/rpc/rpcstatus.Wrap:73\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:554\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:228\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:122\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:112\n\tstorj.io/drpc/drpcctx.(*Tracker).track:52"}

# docker logs storagenode 2>&1 | grep FWJP7P55BPDDAXQA4G65RFRT3JBEQ64XVUXANUX7E2VRSP66Z66A
2022-10-28T09:53:49.787Z        INFO    piecedeleter    delete piece sent to trash      {"Process": "storagenode", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Piece ID": "FWJP7P55BPDDAXQA4G65RFRT3JBEQ64XVUXANUX7E2VRSP66Z66A"}
2022-10-28T09:59:13.542Z        INFO    piecestore      download started        {"Process": "storagenode", "Piece ID": "FWJP7P55BPDDAXQA4G65RFRT3JBEQ64XVUXANUX7E2VRSP66Z66A", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET_AUDIT"}
2022-10-28T09:59:13.543Z        ERROR   piecestore      download failed {"Process": "storagenode", "Piece ID": "FWJP7P55BPDDAXQA4G65RFRT3JBEQ64XVUXANUX7E2VRSP66Z66A", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET_AUDIT", "error": "file does not exist", "errorVerbose": "file does not exist\n\tstorj.io/common/rpc/rpcstatus.Wrap:73\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:554\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:228\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:122\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:112\n\tstorj.io/drpc/drpcctx.(*Tracker).track:52"}

# docker logs storagenode 2>&1 | grep HRB2JAEWEQFH732GJWFCT2VGXWXUH7JSRP4AXM2KNZ4RMX74DHXQ
2022-10-27T03:04:10.510Z        INFO    piecedeleter    delete piece sent to trash      {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "HRB2JAEWEQFH732GJWFCT2VGXWXUH7JSRP4AXM2KNZ4RMX74DHXQ"}
2022-10-27T03:11:06.751Z        INFO    piecestore      download started        {"Process": "storagenode", "Piece ID": "HRB2JAEWEQFH732GJWFCT2VGXWXUH7JSRP4AXM2KNZ4RMX74DHXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET_AUDIT"}
2022-10-27T03:11:06.752Z        ERROR   piecestore      download failed {"Process": "storagenode", "Piece ID": "HRB2JAEWEQFH732GJWFCT2VGXWXUH7JSRP4AXM2KNZ4RMX74DHXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET_AUDIT", "error": "file does not exist", "errorVerbose": "file does not exist\n\tstorj.io/common/rpc/rpcstatus.Wrap:73\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:554\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:228\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:122\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:112\n\tstorj.io/drpc/drpcctx.(*Tracker).track:52"}

This means that you have lost the file they are trying to download.

The logs show that the file was moved to the trash (piecedeleter: "delete piece sent to trash"), and a few minutes later they try to download it.
I think there is some mistake here.

Some days ago they were doing some tests, see here (Several "file does not exist" on different nodes). But on that occasion all the errors came from a single satellite.

Is your audit score still 100%?
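
If you want to check from the shell instead of the web dashboard, something along these lines should work (just a sketch, assuming the default dashboard port 14002 and that the API field names match your node version; needs jq):

# print per-satellite audit / suspension / online scores from the local dashboard API
# (field names may differ between node versions)
curl -s http://localhost:14002/api/sno/satellites | jq '.audits[] | {satelliteName, auditScore, suspensionScore, onlineScore}'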

What is your nodeID?

1amGrvxdmGoUyG2r8pRCUWtNxuKzL3qRHzseTUrzTko3iWaQGb

This is most likely due to the testing referenced above (which is still ongoing, and happening on multiple satellites at once). If so, your audit score will not be affected.

But, good news, even if this was a 'real' audit (suggesting that the audit started before the deletion, but the deletion finished first), your audit score would also be unaffected. If a segment is deleted at any point before an audit completes, the result of the audit is thrown away.

As to why the deletions seem to have happened so soon before the GET requests: I think that this is likely a coincidence, especially if you have a large number of pieces, and these were the only instances found.
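
If you want to double-check how widespread this is, a rough search along the lines below should list every piece that failed a GET_AUDIT with "file does not exist" and show whether a matching "sent to trash" line exists for it (just a sketch based on the log lines you pasted; it re-reads the log once per piece ID, so it is only practical for a handful of hits):

# list piece IDs that failed a GET_AUDIT with "file does not exist",
# then print any "delete piece sent to trash" line for the same piece ID
docker logs storagenode 2>&1 \
  | grep GET_AUDIT | grep 'file does not exist' \
  | sed -n 's/.*"Piece ID": "\([A-Z0-9]*\)".*/\1/p' | sort -u \
  | while read -r id; do
      echo "== $id"
      docker logs storagenode 2>&1 | grep "$id" | grep 'sent to trash'
    done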


Thanks for the information; I will keep an eye on the audit score and these files.

Good to know, thanks.

Apropos of these tests, I suspect those audits are not being accounted for correctly, judging only by the sheer number of events.

From the logs, counting since the beginning of the month, I have the following (one way to reproduce the counts is sketched after the list):

297953 GET
314415 GET_AUDIT
162599 GET_REPAIR
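
(For anyone who wants to reproduce the counts: a sketch like the one below should give comparable numbers, assuming the JSON log format shown earlier in the thread; the exact figures depend on whether you count the "download started" or the "downloaded" lines and on how you filter by date.)

# tally GET / GET_AUDIT / GET_REPAIR requests since the start of the month,
# counting "download started" lines only
docker logs --since "2022-10-01T00:00:00" storagenode 2>&1 \
  | grep 'download started' \
  | grep -oE '"Action": "[A-Z_]+"' \
  | sort | uniq -c | sort -rn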

[Screenshot: node dashboard bandwidth summary]

So the bandwidth for Repair & Audit is roughly half that of Download, but the number of events is almost 70% higher. How can that be?

I can dig more if needed.

Audits don't download the entire piece, just a single stripe from it. The amount of data is tiny, so you never get any real income from audits.

My calculator shows them separately. Only 12 MB for this entire month, leading to 0 cents of earnings.
[Screenshot: earnings calculator showing the audit traffic line]

This calculator is based on the node's local stats, so it would include any audit, including these test ones.
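
To put that 12 MB in perspective, here is a quick back-of-envelope check (the ~$10 per TB payout for audit and repair egress is an assumption for illustration; check the current payout terms for the real rate):

# rough audit-egress earnings: 12 MB at an assumed $10/TB payout rate
awk 'BEGIN { mb = 12; usd_per_tb = 10; printf "$%.5f\n", mb / 1e6 * usd_per_tb }'

That works out to roughly $0.00012, i.e. effectively zero, which matches what the calculator shows.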


OK, I didn't know that, thank you.
Now everything makes sense.

Omg… You are still testing in production?! How dare you???

Please see Several "file does not exist" on different nodes

The GET_AUDIT calls in the test are only downloading a small portion of the requested piece, if they download anything at all. For the most part, we are looking to see whether the pieces from a certain class of segment are more or less likely to be lost than other pieces. We expected from the beginning that some of the pieces we are asking for will not exist, even on well-behaved nodes, so that is why this test does not affect reputation scores.

The point is well taken, and be assured we strongly avoid testing new code in production. The ongoing test I mentioned is more about determining the emergent dynamics of the existing system, not a test of new unproven code (which is usually what people mean by "don't test in production").
