Hi, over the last 4 weeks one of my nodes has failed 8 audit requests. The node is running on the default backend, piecestore. Am I missing data somehow?
Each of the failed audits looks like:
2025-05-12T18:06:07Z ERROR piecestore download failed {"Process": "storagenode", "Piece ID": "BOOPW75ZPRAFWXIMI5ABY4TSBTMHTHZLRTTHJLIRPT7VSUCOMNJA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET_AUDIT", "Offset": 1762048, "Size": 256, "Remote Address": "35.188.235.2:15964", "error": "hashstore: file does not exist", "errorVerbose": "hashstore: file does not exist\n\tstorj.io/storj/storagenode/hashstore.(*DB).Read:359\n\tstorj.io/storj/storagenode/piecestore.(*HashStoreBackend).Reader:298\n\tstorj.io/storj/storagenode/piecestore.(*MigratingBackend).Reader:180\n\tstorj.io/storj/storagenode/piecestore.(*TestingBackend).Reader:105\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:676\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:302\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:62\n\tstorj.io/common/experiment.(*Handler).HandleRPC:43\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:166\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:108\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:156\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35"}
Are you seeing the audit % drop in the UI, or is this only something you see in your logs? You could run chkdsk/fsck/scrub to check for filesystem errors… but if I didn’t see my audit number dropping I’d probably just ignore it.
Maybe audits are coming in and both the piecestore and hashstore backends are being queried… and since you’re not using hashstore, it’s just failing there without actually affecting anything?
Interesting. Perhaps this piece was moved to the trash and the restore command that was sent didn’t recover it (because it arrived too late).
I do not remember: is the PieceID logged at info level when it’s moved to the trash?
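One way to check is to grep the node’s logs for every entry mentioning the failed PieceID, to see whether it was ever uploaded, trashed, or collected before the audit. The snippet below demonstrates this against the error line quoted in this thread (truncated); in practice you would point grep at your real log, e.g. `journalctl -u storagenode | grep -F "$PIECE_ID"`. The log location and service name are assumptions that depend on your setup.

```shell
# Hedged sketch: find all log entries that mention the failed piece.
# Shown against the (truncated) error line from this thread, since log
# paths vary; replace the printf with your real log source.
PIECE_ID="BOOPW75ZPRAFWXIMI5ABY4TSBTMHTHZLRTTHJLIRPT7VSUCOMNJA"
printf '%s\n' \
  '2025-05-12T18:06:07Z ERROR piecestore download failed {"Piece ID": "BOOPW75ZPRAFWXIMI5ABY4TSBTMHTHZLRTTHJLIRPT7VSUCOMNJA", "Action": "GET_AUDIT"}' \
  | grep -F "$PIECE_ID"
```

If trash retention covers the window, matching lines would tell you whether the piece passed through the trash or a collector run before the failed audit.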
Or a piece was really lost, and hashstore was checked after piecestore.
I have one more idea: could you please search for one of these pieces on your disk in the piecestore?
See how to convert a PieceID to the file:
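As a rough sketch of that conversion (the layout below is the commonly described piecestore blob layout, so treat the exact path format as an assumption): the PieceID from the log is base32; lowercased, its first two characters name a subdirectory under `blobs/<satellite-dir>`, and the remaining characters plus a `.sj1` extension name the file.

```shell
# Hedged sketch: turn the logged PieceID into its likely on-disk path.
# Assumed layout: <storage>/blobs/<satellite-dir>/<first-2-chars>/<rest>.sj1,
# where the names are the lowercase form of the base32 ID from the log.
# <satellite-dir> is the satellite ID in its on-disk encoding; it is left
# as a placeholder here.
PIECE_ID="BOOPW75ZPRAFWXIMI5ABY4TSBTMHTHZLRTTHJLIRPT7VSUCOMNJA"
lower=$(printf '%s' "$PIECE_ID" | tr 'A-Z' 'a-z')
prefix=$(printf '%s' "$lower" | cut -c1-2)
rest=$(printf '%s' "$lower" | cut -c3-)
echo "blobs/<satellite-dir>/${prefix}/${rest}.sj1"
```

You could then look for the file directly, e.g. `find <storage>/blobs -name "${rest}.sj1"`, without needing to resolve the satellite directory by hand.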
Then these pieces are actually lost. However, it is a little concerning that you have no evidence these pieces were ever uploaded to your node.
It’s concerning that I have missing data. The node is 15 months old, and I only keep six weeks of logs, so it’s not surprising I don’t have any other record of the missing files.
What surprised me was that it’s looking in the hashstore for data, but that’s not enabled.
Hello. Yes, you’re right. I checked, and my old logs were deleted. I forgot to restart journald after increasing the log space.
Unfortunately, now we can only guess what happened to this file. Similar issues haven’t occurred since; the node is new, only a month old. SMART is OK, and the filesystem is ext4 (Linux host).
Everything is fine with it; it’s a server in a data center. The only thing I did was increase the available disk space in the configuration file and restart the storagenode process via systemctl.
The satellite may try to request it again; this would keep happening until the piece is deleted by the customer, or until the segment falls below the repair threshold and is repaired, at which point the pointer to this piece on your node would be removed.
Thanks for the information. There have been no more requests for this segment so far, and the audit score is again approaching 100% (from 99.91 to 99.95).