Upload "Remote Address" use docker Client public IP and not "ADDRESS=" endpoint

Hello, I wanted to ask if this is ok.

When a file is uploaded (“upload file” in the logs), the connection comes in on a UPnP port and uses the client endpoint with its current IP.

In my case I have it like this:

sub1.dom1.net:28281 (Public IP1) → “redirect traffic” → xxx.xxx.xxx.xxxx:28281 (Public IP2) → Docker Storj node

But the endpoint does not use the external address configured in Docker; it shows “Public IP2” directly, with a random port, and that gives me a lot of failures in the logs.

Thanks

Docker:

environment:
      - ADDRESS=sub1.dom1.net:28281
"Remote Address": "xxx.xxx.xxx.xx:38344"
"Remote Address": "xxx.xxx.xxx.xx:38342"
"Remote Address": "xxx.xxx.xxx.xx:38286"
"Remote Address": "xxx.xxx.xxx.xx:38348"
node3132            | 2023-04-22T17:58:03.275Z  ERROR   piecestore      upload failed   {"Process": "storagenode", "Piece ID": "2SG7Q22HVZIO4G5VBURKXVJ56C45LQ7NKZOC6NO3CRHRDRGRUDMA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "PUT", "error": "context canceled", "errorVerbose": "context canceled\n\tstorj.io/common/rpc/rpcstatus.Wrap:75\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload.func5:498\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:504\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:243\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/common/experiment.(*Handler).HandleRPC:42\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:124\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:114\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35", "Size": 7936, "Remote Address": "xxx.xxx.xxx.xx:38344"}
node3132            | 2023-04-22T17:58:03.292Z  ERROR   piecestore      upload failed   {"Process": "storagenode", "Piece ID": "CKVBOWAMXZWRKP5ASVE2UZ2QEI3VCOZ4HHBRN2TRE7EGJJUIEM5Q", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "PUT", "error": "context canceled", "errorVerbose": "context canceled\n\tstorj.io/common/rpc/rpcstatus.Wrap:75\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload.func5:498\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:504\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:243\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/common/experiment.(*Handler).HandleRPC:42\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:124\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:114\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35", "Size": 7936, "Remote Address": "xxx.xxx.xxx.xx:38342"}
node3132            | 2023-04-22T17:58:03.315Z  INFO    piecestore      uploaded        {"Process": "storagenode", "Piece ID": "TPFWCRN4KSZVOCUVQSJLVNJ33W7WDMAQY3CPE5NLMOLVR5YDE5CA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "PUT", "Size": 36864, "Remote Address": "xxx.xxx.xxx.xx:38286"}
node3132            | 2023-04-22T17:58:03.321Z  ERROR   piecestore      upload failed   {"Process": "storagenode", "Piece ID": "ZUXDRGSV7UXHA3LABVD5GYGZHP447EZXVYVPUMMZWFS4WQSQTH2A", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "PUT", "error": "context canceled", "errorVerbose": "context canceled\n\tstorj.io/common/rpc/rpcstatus.Wrap:75\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload.func5:498\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:504\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:243\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/common/experiment.(*Handler).HandleRPC:42\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:124\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:114\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35", "Size": 7680, "Remote Address": "xxx.xxx.xxx.xx:38348"}

(Linked source excerpt: the storagenode’s getRemoteAddr function.)

If your node is online, it’s OK. It cannot win all races; some nodes are faster.
The remote addresses here are the customers’ addresses connecting to your node, not your own addresses.
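
To illustrate (a minimal Go sketch, not the actual storagenode code): a TCP server learns the peer’s address from the accepted connection itself, so the “Remote Address” field can only ever be the connecting client’s IP with an ephemeral source port. The ADDRESS= setting only advertises where your node can be reached; it never appears as a remote address in your own logs.

// Minimal sketch: any Go TCP server sees the peer via conn.RemoteAddr().
package main

import (
	"fmt"
	"net"
)

func main() {
	ln, err := net.Listen("tcp", ":28281")
	if err != nil {
		panic(err)
	}
	defer ln.Close()
	for {
		conn, err := ln.Accept()
		if err != nil {
			continue
		}
		// RemoteAddr() is the client's endpoint with an ephemeral port,
		// e.g. "xxx.xxx.xxx.xx:38344" -- exactly what the log shows.
		fmt.Println("Remote Address:", conn.RemoteAddr().String())
		conn.Close()
	}
}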

But I think we still have this issue:

I have just checked it myself on one node. A piece for which an upload failed context canceled error was thrown is sitting happily in its correct destination folder.
If the file has been successfully uploaded, which seems to be the case (maybe it is still not successful and gets moved to trash later, or something), then this error is rubbish and completely misleading. In that case it should not be thrown; the log should show a successful upload, because it is definitely not a failed upload.

These are different causes; please do not mix everything together.
In the current thread the author has a different stack trace, which points out that this transfer was canceled by the client, not failed because of the error mentioned in your excerpt.

But this is what I am trying to tell you. I had the same error as the OP, but the piece has been uploaded to my node. So it seems that the issue @kaloyan described, where the logs get polluted with wrong error messages, is still present.

This is interesting. Do you have logs showing that the piece from the canceled transfer was actually uploaded? For example, was it later successfully audited or downloaded?
Because here the stack trace is different (maybe due to different versions, as I have started to think).

No, it is a very fresh piece; in fact, it is only about an hour old.
I picked it completely at random, just because of the context canceled error message.

I have the upload started message, followed by upload failed with the same error as the OP’s. And when I check for the piece, I can see it in the destination folder.

Maybe the OP should check as well whether the pieces in question are there or not (if not deleted).

By the way, the mentioned PR is merged, so this change is already in the current version.

That’s why I am wondering why this behavior is still (or again) present.

I don’t know whether work on this issue report will fix it, as it uses a download failed message as its example reference.

Basically what would be needed is to show a successful upload as successful in the logs.

@Alexey, is this amount of failed uploads normal, or should I ignore them? Thanks.

---------- accepted -----------
Failed:                63798
Fail Rate:             0.000%
Canceled:              622
Cancel Rate:           0.000%
Successful:            75069
Success Rate:          0.000%

Full report:

sh ./successrate.sh nodex
========== AUDIT ==============
Critically failed:     0
Critical Fail Rate:    -nan%
Recoverable failed:    0
Recoverable Fail Rate: 0.000%
Successful:            7421
Success Rate:          0.000%
========== DOWNLOAD ===========
Failed:                127
Fail Rate:             0.000%
Canceled:              16134
Cancel Rate:           0.000%
Successful:            18466
Success Rate:          0.000%
========== UPLOAD =============
Rejected:              0
Acceptance Rate:       100.000%
---------- accepted -----------
Failed:                63798
Fail Rate:             0.000%
Canceled:              622
Cancel Rate:           0.000%
Successful:            75069
Success Rate:          0.000%
========== REPAIR DOWNLOAD ====
Failed:                0
Fail Rate:             0.000%
Canceled:              0
Cancel Rate:           0.000%
Successful:            10351
Success Rate:          0.000%
========== REPAIR UPLOAD ======
Failed:                5341
Fail Rate:             0.000%
Canceled:              6
Cancel Rate:           0.000%
Successful:            20704
Success Rate:          0.000%
========== DELETE =============
Failed:                0
Fail Rate:             -nan%
Successful:            15530
Success Rate:          0.000%

If your node doesn’t have failed audits and the storagenode reputation is OK, then the node is functioning normally.
Regarding failed transfers: well, customers cancel excess connections on upload or download. Whether this amount is normal, you need to check in

But it’s not normal to have a failed upload in the log when the piece has been successfully uploaded.

Timing. The client can cancel exactly when the piece is uploaded, but before the confirmation has been received.
Such a piece will then be taken care of by the garbage collector.
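
A hypothetical Go sketch of that race (assumed flow, not the real piecestore code): the piece is fully written to disk, but the client’s cancellation arrives before the node can send its confirmation, so the handler returns context canceled and the log records upload failed even though the blob exists.

package main

import (
	"context"
	"errors"
	"fmt"
	"os"
)

// upload persists the piece, then checks for cancellation before the
// confirmation step, mirroring the timing described above.
func upload(ctx context.Context, path string, data []byte) error {
	// 1. The piece is persisted successfully.
	if err := os.WriteFile(path, data, 0o644); err != nil {
		return err
	}
	// 2. The client cancels before the confirmation is sent.
	if err := ctx.Err(); err != nil {
		return err // logged as "upload failed ... context canceled"
	}
	// 3. The confirmation would be sent here.
	return nil
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // simulate the client aborting at the worst possible moment
	err := upload(ctx, "piece.sj1", []byte("piece data"))
	if errors.Is(err, context.Canceled) {
		fmt.Println("upload failed: context canceled (but piece.sj1 is on disk)")
	}
}

The piece then sits in the blobs folder until the garbage collector removes it, which matches what was observed above.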

This happens too often to be such a coincidence.

Just this second I have checked the logs of a randomly selected node. I looked at the latest log entry with an upload context canceled error, from just a couple of minutes ago.
Result: the log shows upload failed,
but the piece is there in the blob folder.

So how can this be a failed upload?

To me this looks more like the code has been “optimized” for faster closing of connections, resulting in incorrect log entries.

The client does not notify storage nodes when it cancels a request, for speed reasons (it just forcibly closes the connection instead). I know this is inconvenient for SNOs, because the success rate script becomes almost useless: it doesn’t show the true picture of how many transfers were explicitly canceled and how many requests failed for other reasons.
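
For illustration, a rough Go sketch of that long-tail cancellation (assumed behavior, not uplink’s actual code, and the numbers are made up): the client starts more transfers than it needs and cancels the rest the moment enough of them succeed; the losing nodes only see their connection drop.

package main

import (
	"context"
	"fmt"
	"math/rand"
	"sync"
	"time"
)

func main() {
	const total, needed = 10, 6 // hypothetical counts, for illustration only
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	var (
		mu   sync.Mutex
		done int
		wg   sync.WaitGroup
	)
	for i := 0; i < total; i++ {
		wg.Add(1)
		go func(node int) {
			defer wg.Done()
			select {
			case <-time.After(time.Duration(rand.Intn(100)) * time.Millisecond):
				mu.Lock()
				done++
				if done == needed {
					cancel() // enough pieces stored: abandon the slow nodes
				}
				mu.Unlock()
				fmt.Printf("node %d: uploaded\n", node)
			case <-ctx.Done():
				// The node is not told why; it just sees a closed
				// connection and logs "context canceled".
				fmt.Printf("node %d: upload failed: context canceled\n", node)
			}
		}(i)
	}
	wg.Wait()
}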

Why aren’t such pieces deleted right away if the upload is considered failed? The way you describe it, they are only wasting space.

We don’t necessarily know that the upload failed; for example, the computer running the uplink might have died, so it can’t kick off deletes. We could have the storage node not save data that it never sends back a signature for, but we think there are a lot of storage nodes that do successfully send back signatures even when they aren’t selected by the uplink. This is something we could improve for some edge cases, but it is a hard problem to solve in general.

Yet it gets displayed as failed in the logs.
For me as an SNO, I’d consider an upload successful when the piece has been dropped into the blobs folder. If that still does not count as successful, the piece should get deleted right away.
If the piece has not been dropped into the blobs folder, then it’s obvious that the upload did not succeed.
I have no other way to judge.

I think it’s related to what I posted here.

Yes, it seems that this is exactly the same issue.