Error Codes: What they mean and Severity Level [READ FIRST]

6 posts were split to a new topic: Usedserialsdb error: disk I/O error: The device is not ready

Post your log between 3 backticks ```

2 posts were split to a new topic: Node offline, messed up with identity

A post was split to a new topic: Pieces error: marshaled piece header too big!

Hello,

I got this error today (it happened only once):

2021-03-11T17:02:11.121Z	ERROR	piecestore	download failed	{"Piece ID": "WY64G3BUUNWGA7QD2ZVY33HXSIQ3ZLBWFFD7R2RCOSLTE35FXRCQ", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "error": "file does not exist", "errorVerbose": "file does not exist\n\tstorj.io/common/rpc/rpcstatus.Wrap:74\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:506\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:1033\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:29\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:58\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51"}

The disk shows no errors. The node never hung; it only restarted for the regular Storj container upgrades.

Is there something I need to worry about?

All the best!

You lost the piece; that is why it says “file does not exist”. Did you have a sudden restart or a power outage?

Nope, never. The node is on a UPS.

How old is your node? You can search your log for when the piece was uploaded. Your hardware could have restarted a month back, and this only showed up now when the piece was being downloaded.
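For example, if the node runs in Docker under the usual container name, something along these lines would show every log entry for that piece, including the original upload (the container name is an assumption, adjust it to your setup):

```
# Show every log entry for a given piece ID (upload, download, delete)
# Assumes a Docker node named "storagenode" logging to the container output.
docker logs storagenode 2>&1 | grep "WY64G3BUUNWGA7QD2ZVY33HXSIQ3ZLBWFFD7R2RCOSLTE35FXRCQ"
```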

Hello,

Thanks for your reply.

The node has been up since early January, but I cleared the log after the upload.

I’ve searched further into the log.

  • A grep on the piece ID returns these interesting entries:
2021-03-09T23:43:34.521Z INFO piecestore download started {"Piece ID": "WY64G3BUUNWGA7QD2ZVY33HXSIQ3ZLBWFFD7R2RCOSLTE35FXRCQ", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET"}
2021-03-09T23:43:35.479Z INFO piecestore downloaded {"Piece ID": "WY64G3BUUNWGA7QD2ZVY33HXSIQ3ZLBWFFD7R2RCOSLTE35FXRCQ", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET"}
2021-03-10T20:23:47.160Z INFO collector delete expired {"Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Piece ID": "WY64G3BUUNWGA7QD2ZVY33HXSIQ3ZLBWFFD7R2RCOSLTE35FXRCQ"}
2021-03-10T21:42:54.825Z INFO piecestore download started {"Piece ID": "WY64G3BUUNWGA7QD2ZVY33HXSIQ3ZLBWFFD7R2RCOSLTE35FXRCQ", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET"}
2021-03-10T21:42:54.873Z ERROR piecestore download failed {"Piece ID": "WY64G3BUUNWGA7QD2ZVY33HXSIQ3ZLBWFFD7R2RCOSLTE35FXRCQ", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "error": "file does not exist", "errorVerbose": "file does not exist\n\tstorj.io/common/rpc/rpcstatus.Wrap:74\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:506\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:1033\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:29\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:58\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51"}

What’s interesting is that this happened right after the “delete expired” collector message yesterday.

  • Then I grepped on the “delete expired” string and got many files dated from 4 to 11 March 2021, all from the EU West satellite (example command below).
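For reference, that second search was roughly of this form (again assuming a Docker node named storagenode; adjust to your setup):

```
# List every "delete expired" collector entry in the node log
docker logs storagenode 2>&1 | grep "delete expired"
```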

So how can a file that is supposed to be deleted still be requested for download? There is something that I don’t understand.

Hi, this is expected behavior of the system.

There are a bunch of things that happen asynchronously, hence there are some eventual-consistency behaviors that might not be obvious at first sight.

Let’s take an example scenario that demonstrates it:

  1. At 11:15:00, Alice on computer A sends a request to the satellite to download an object; the satellite responds with the order limits needed for the specific storage nodes.
  2. At 11:15:01, Bob on computer B sends a delete request for that object, which ends up sending a delete request to the storage node. The storage node deletes the piece.
  3. At 11:15:03, Alice on computer A starts using the order limits (there was a latency spike on the network, which caused it to take a bit longer)… however, now none of the storage nodes have that data anymore.

This ends up in the logs as:

  1. object deleted
  2. download started
  3. download failed

With object expiration, the window in which this can happen is larger, because the satellite takes longer to delete expired objects/segments than the storage node does.
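If it helps to see the timing, here is a toy shell sketch of that race (it has nothing to do with the actual Storj code, it just mimics the sequence above): the download is queued while the piece still exists, the delete lands in between, and the delayed read then fails.

```
# Toy illustration of the race above, not Storj tooling.
mkdir -p /tmp/pieces && echo "piece data" > /tmp/pieces/WY64
( sleep 1; rm /tmp/pieces/WY64 ) &    # Bob's delete request reaches the node first
( sleep 3; cat /tmp/pieces/WY64 ) &   # Alice's delayed download: "No such file or directory"
wait
```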


Hello,

Thanks for the explanation. It’s clear now.

I was just worried that something was happening on my node that could potentially cause issues, especially for the GE (graceful exit) process.

All the best!

7 posts were split to a new topic: Error: certificates peer error: authorization already claimed:

./identity.exe authorize storejnode email:*****@gmail.com:1XtU1ovrtStgSavWMrZoRfmuZwGHdm1P7nyKDm3nxuerx9pPQcMb1YiaaBT6YUDPMnFWz5aX5622PFv9Szaxz4ZAywokex
Error: certificates peer error: authorization not found:

Did you have that text in the command?

It should be:

./identity.exe authorize storagenode abaig60156@gmail.com:1XtU1ovrtStgSavWMrZoRfmuZwGHdm1P7nyKDm3nxuerx9pPQcMb1YiaaBT6YUDPMnFWz5aX5622PFv9Szaxz4ZAywokex