Audit scores dropping on ap1; Storj is working on a fix, SNO's don't need to do anything. ERROR piecedeleter could not send delete piece to trash >> Pieces error: v0pieceinfodb: sql: no rows in result set

Spoke too soon… the problem persists.
During the last 4 hours I got another 2 failures; I'm trying to track them down…

2 posts were merged into an existing topic: Audit weirdness

The failed deletions seem to have stopped and I don't seem to be getting audited on them anymore; however, I'm still failing repair traffic for these deleted pieces (which I guess is sort of correct, as they are deleted). I guess that's the next part of the fixup.

2021-07-24T09:24:03.165+0100 ERROR piecestore download failed {"Piece ID": "ZY6QV3GFEUZBTNSFYF6MWWHNMUXGVV6VDRPU6R2I66DMMEK6XJWQ", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "GET_REPAIR", "error": "file does not exist", "errorVerbose": "file does not exist\n\tstorj.io/common/rpc/rpcstatus.Wrap:73\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:534\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:217\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:58\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:102\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:60\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:95\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51"}

It does seem to be running a lot better… but I'm still seeing issues, even if very rarely… the odd thing is that I cannot find them in my logs… I can only assume either my logs or my dashboard is wrong.

I've been looking everywhere and I just cannot find the failed audit for ap1, even though it just dropped in audit score on the dashboard…

Even checking my log consistency, I am recording every minute… it exports on a 10-minute schedule, and Docker will save the last 10MB worth of logs…
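For reference, this is roughly how that 10MB Docker log cap gets configured; the container name and exact values below are just an example, not necessarily my exact flags:

docker run -d --name storagenode \
  --log-driver json-file \
  --log-opt max-size=10m \
  --log-opt max-file=1 \
  ... (plus the usual storagenode options)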

But yeah, repairs for today sure don't look to be doing too well.
Usually that is 99.5% or better.

========== REPAIR DOWNLOAD ====
Failed:                4030
Fail Rate:             8.720%
Canceled:              0
Cancel Rate:           0.000%
Successful:            42188
Success Rate:          91.280%
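For anyone wanting to reproduce numbers like these, here is a rough Python sketch that tallies GET_REPAIR downloads from a node log; the log path is a placeholder, and the matched phrases follow the usual "downloaded" / "download canceled" / "download failed" messages, so adjust them if your log format differs:

# Rough sketch: tally GET_REPAIR download outcomes from a storagenode log.
# "/path/to/storagenode.log" is a placeholder; adjust it to your setup.
failed = canceled = succeeded = 0
with open("/path/to/storagenode.log", encoding="utf-8", errors="replace") as log:
    for line in log:
        if "GET_REPAIR" not in line:
            continue
        if "download failed" in line:
            failed += 1
        elif "download canceled" in line:
            canceled += 1
        elif "downloaded" in line:
            succeeded += 1

total = failed + canceled + succeeded
if total:
    print(f"Failed:     {failed} ({failed / total:.3%})")
    print(f"Canceled:   {canceled} ({canceled / total:.3%})")
    print(f"Successful: {succeeded} ({succeeded / total:.3%})")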

us2 is having a different issue, discussed here: Audit weirdness (us2 satellite). (Team aware. No additional information is needed)

But please verify by looking at the piece history. If it’s the same as mentioned there, you can post logs in the appropriate topic. If you’re seeing a different issue, please start a new topic. Let’s try and keep this one focused on the issue on ap1. (Also looking at you @SGC :wink: )

1 Like

Moved out to Audit weirdness

1 Like

No. See Warning and error logs since updating to latest release

I have the same for the ap1 satellite now

1 Like

Yep - AP1 hit for me too.

The AP1 failures are related to a different issue.

I seem to be failing audits on ap1.storj.io as it’s auditing pieces it’s already deleted.
This is a satellite issue not a storagenode issue.

1 Like

Seems so, yes… I'm not really too familiar with the Storj network mechanics… but BrightSilence suggested the same thing: it's a satellite issue.

I'm going to shut down my nodes if the satellite audit scores start hitting 70% or so…
to avoid DQ
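For anyone who'd rather script that threshold than watch the web dashboard: the node's dashboard has an API that can be queried directly. A rough example, assuming the default dashboard port 14002 and that the /api/sno/satellite/<satellite-id> path the dashboard itself uses still returns the audit score on your version (check your dashboard's network requests if the paths differ):

curl -s http://localhost:14002/api/sno/satellite/121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6

The per-satellite JSON includes the audit score, so a small cron job could compare it against a threshold like 70% and stop the node.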

1 Like

I'm not seeing anything hit my dashboard yet (it's still all 100%), but there are errors in my logs. My newest node seems to be hit the hardest.

1 of these:
2021-07-23T00:30:03.080Z ERROR piecestore download failed {"Piece ID": "M4JJX762SFXPUQGEMQWSIDRJHDN7KDCEW3X5ABXFGRY7ZEOJTDKQ", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "GET_AUDIT", "error": "file does not exist"

10 of these, same sat:
2021-07-23T00:30:03.080Z ERROR piecestore download failed {"Piece ID": "M4JJX762SFXPUQGEMQWSIDRJHDN7KDCEW3X5ABXFGRY7ZEOJTDKQ", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "GET_AUDIT", "error": "file does not exist"

80 of these, mostly same sat:
Piece ID": "WXMCZYCAYEWVSOVWUL3L4NK26WL47OSE6UZ6LJFK6OIJJ3R4N4KA", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "PUT", "error": "unexpected EOF"

Seems to be another satellite giving bad audits as well.
Never mind; I found I had checked the wrong log.

2021-07-22T23:42:22.240Z ERROR piecestore download failed {"Piece ID": "GJLVEDSUDLETGCTXQCG2LKLJUEWAN2LOKSRUSMOVZRMUA7PQSWLQ", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "GET_AUDIT", "error": "file does not exist", "errorVerbose": "file does not exist\n\tstorj.io/common/rpc/rpcstatus.Wrap:73\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:534\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:217\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:58\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:102\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:60\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:95\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51"}


@deathlessdd @KernelPanick @SGC @waistcoat @Ted @LinuxNet
Please search your logs for all records containing this piece ID.
The problem here is only with a "delete expired" followed by a "download failed" for the same piece. If your node lost pieces for other reasons, it's not related to the current problem with the us2 satellite.
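For example, assuming a Docker node with the container name storagenode (substitute the piece ID you are interested in, such as the one from the log excerpt above):

docker logs storagenode 2>&1 | grep GJLVEDSUDLETGCTXQCG2LKLJUEWAN2LOKSRUSMOVZRMUA7PQSWLQ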

Unfortunately I don't have logs going far back because the node recently updated, but if more and more people are seeing the same issues, it's not because my node lost a file.
But my node is still failing audits.

2021-07-22T23:53:46.965Z        ERROR   piecestore      download failed {"Piece ID": "ORX5MM6SCZOUJ5HPTU7QRDON34EKYLAJOUUUO4HC4WZ2HEKPEX6Q", "Satellite ID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo", "Action": "GET_AUDIT", "error": "file does not exist", "errorVerbose": "file does not exist\n\tstorj.io/common/rpc/rpcstatus.Wrap:73\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:534\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:217\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:58\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:102\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:60\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:95\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51"}
2021-07-23T10:28:23.241Z        ERROR   piecestore      download failed {"Piece ID": "RAYFOF5VE2LYJQBGXJ443VT3L3C7RLE2BDQCEJAXSL2VJJAO5OCA", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "GET_AUDIT", "error": "file does not exist", "errorVerbose": "file does not exist\n\tstorj.io/common/rpc/rpcstatus.Wrap:73\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:534\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:217\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:58\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:102\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:60\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:95\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51"}


This is widespread, though it doesn't seem to affect all nodes yet…
There are no issues with my setup; I could barely make my system lose a piece if I tried…

I only have logs from the 16th, sadly… I had some issues with my OS and mostly just dumped the older ones, because they were using an older logging method.

I've also seen another 3 failed audits in the last couple of hours.
I started a scrub on the 21st when I noticed this issue, and it's done now… there is no trace of any data corruption, and I haven't had downtime or other issues.

I'm trying to get some useful log info, but I'm not too used to actually retrieving data from my logs… I really need to get Loki installed… so mostly I've just been learning how inept I am at searching through all this data in an accurate way, lol.

Perhaps your situation with audit failures on the AP1 satellite is related to Audit scores dropping on ap1; Storj is working on a fix, SNO's don't need to do anything. ERROR piecedeleter could not send delete piece to trash >> Pieces error: v0pieceinfodb: sql: no rows in result set - #17 by BrightSilence

I didn't blame your setup; sorry if I somehow pushed you toward that thought.
I just needed excerpts from your logs to be sure that the situation is not related to the known issues: Audit weirdness (us2 satellite). (Team aware. No additional information is needed) and Audit scores dropping on ap1; Storj is working on a fix, SNO's don't need to do anything. ERROR piecedeleter could not send delete piece to trash >> Pieces error: v0pieceinfodb: sql: no rows in result set - #17 by BrightSilence

1 Like

Never lost a file. Strangely, this has been happening since yesterday, and since there are several SNOs with the problem, this time I do not accept that there is something wrong with the node. All nodes have been running continuously for several months, except for updates.

I’ll turn off my nodes then too. Node 3 has already dropped to 90%.

We don’t have any other choice if we don’t want to lose all these months of work.