Download failed: "use of closed network connection" - Something wrong, or I just lost the race?

When you see log entries like this:

ERROR piecestore download failed {"Piece ID": "B44BFBE2WL4WJRPXSRWEK2BHVQHTFDNUUMVQUNFAIYDTUVBEM6UQ", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "error": "write tcp 172.17.0.3:28967->176.9.121.114:54888: use of closed network connection", "errorVerbose": "write tcp 172.17.0.3:28967->176.9.121.114:54888: use of closed network connection\n\tstorj.io/drpc/drpcstream.(*Stream).pollWrite:228\n\tstorj.io/drpc/drpcwire.SplitN:29\n\tstorj.io/drpc/drpcstream.(*Stream).RawWrite:276\n\tstorj.io/drpc/drpcstream.(*Stream).MsgSend:322\n\tstorj.io/common/pb.(*drpcPiecestoreDownloadStream).Send:1118\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download.func5.1:580\n\tstorj.io/common/rpc/rpctimeout.Run.func1:22"}

Does that mean I have a hardware or config issue? Or is it just that a few nodes were asked for the same piece… some responded before mine… so the connection was already closed by the time my node answered?

I do see many more successful uploads and downloads, so I believe things are healthy overall. If it makes a difference, I recently brought that node back after it was offline for 2 weeks, so my Online scores are all in the mid-50% range and all 6 satellites have suspended me. Crossing my fingers I can get back on the horse…

You got it :slight_smile:

1 Like

Hey Everyone,

I got the same error a couple of days ago. My node looks healthy and everything is at ~100%. I’m wondering: why does this not show up as ‘WARN’ or ‘INFO’ if it just means ‘I lost the race’?

Once upon a time, the uplink sent a message to the storage node before closing the channel. That cost an additional round trip, so for faster downloads the uplink nowadays just terminates the connection without any hint. The storage node is forced to guess and will produce different messages depending on the situation in which this happens. Most of the time the storage node will assume the reason was long-tail cancellation, but in some rare cases it can also produce other error messages.
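
To make that guessing step concrete, here is a minimal standalone sketch in Go (not the actual storagenode code; the helper name classifyTransferError and the exact categories are assumptions for illustration) of how an abrupt connection close can be treated as a likely lost race rather than a real fault:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"strings"
)

// classifyTransferError is a hypothetical helper: given the error returned
// while streaming a piece, guess whether the transfer genuinely failed or
// the uplink simply closed the connection because faster nodes already
// delivered enough pieces (long-tail cancellation).
func classifyTransferError(err error) string {
	switch {
	case err == nil:
		return "success"
	case errors.Is(err, net.ErrClosed),
		strings.Contains(err.Error(), "use of closed network connection"):
		// The remote side went away without any explanation. Most of the
		// time this just means we lost the race, not that anything is broken.
		return "probably long-tail cancellation (lost the race)"
	case errors.Is(err, context.DeadlineExceeded):
		// We ran out of time before finishing the transfer.
		return "too slow: deadline exceeded"
	default:
		return "other error: " + err.Error()
	}
}

func main() {
	fmt.Println(classifyTransferError(fmt.Errorf("write tcp 172.17.0.3:28967->176.9.121.114:54888: %w", net.ErrClosed)))
	fmt.Println(classifyTransferError(context.DeadlineExceeded))
}
```

The key point is that a bare “use of closed network connection” carries no reason with it, so “lost the race” is only the most likely interpretation, not a certainty.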

4 Likes

The same error, but it appeared after moving a node to another location.
Just after the migration, there were error messages like
ERROR contact:service ping satellite failed {"Process": "storagenode", "Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "attempts": 1, "error": "ping satellite: check-in ratelimit: node rate limited by id"
Later, there was a pause without any logging.
Eventually, there are error messages like these:

2023-01-18T16:39:36.530Z ERROR piecestore download failed {Process: storagenode, Piece ID: 4LTFVFUWBJ6FYGSTXOTJANVU5IA6EONJDBBZH4UKDBGKTI54I3UQ, Satellite ID: 12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs, Action: GET, error: context deadline exceeded, errorVerbose: context deadline exceeded\n\tstorj.io/common/rpc/rpcstatus.Wrap:73\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download.func6:745\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:763\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:243\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/common/experiment.(*Handler).HandleRPC:42\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:122\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:112\n\tstorj.io/drpc/drpcctx.(*Tracker).track:52}
2023-01-18T16:41:52.066Z ERROR piecestore upload failed {Process: storagenode, Piece ID: 2OBSKECETIC3HL534D4WBLQF7AVNG442RMIVKX3IPGNP66YBR4ZA, Satellite ID: 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S, Action: PUT, error: unexpected EOF, errorVerbose: unexpected EOF\n\tstorj.io/common/rpc/rpcstatus.Error:82\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:418\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:235\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/common/experiment.(*Handler).HandleRPC:42\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:122\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:112\n\tstorj.io/drpc/drpcctx.(*Tracker).track:52, Size: 532480}

Both of these are where your internet or node is too slow.

This one (the GET request that ended with ‘context deadline exceeded’) is where your node couldn’t upload the file quickly enough, so it lost the race and the ‘customer’ disconnected.

This one (the PUT request that ended with ‘unexpected EOF’) is when your node couldn’t download or save the file quickly enough, so your node lost the race.

Edit - updated after @nerdatwork pointed out my error. GET = client download (download in log), PUT = client upload (upload in log).

2 Likes

It’s VERY strange.
I have another node on the same computer (another docker container) with exactly the same settings (except the port number), and there are no errors at all. It downloads and uploads very often, all successful.
With this one, there are absolutely no successful uploads or downloads. Node status - Online, QUIC - OK.

PS: The case (actually it’s NOT a problem) is resolved.
I forgot that on the first computer, from which I migrated the node, I had set the log level to error; that’s why I couldn’t see the successful downloads/uploads logged at INFO level.
Actually I saw INFO messages but only now realized that all of them were from storagenode-updater!
@Stob Thank you, I realized this only after your post!
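
For anyone else puzzled by a “silent” node: the successful “downloaded” / “uploaded” lines are written at INFO level, so with the logger pinned to error they are filtered out entirely. A minimal standalone sketch with zap (the logging library the node’s log format comes from; this is not storagenode code, just an illustration of the filtering effect) follows; the real node’s level is controlled by its log.level setting:

```go
package main

import (
	"go.uber.org/zap"
	"go.uber.org/zap/zapcore"
)

func main() {
	// Build a logger pinned to the "error" level, similar in effect to
	// running a node with log level = error.
	cfg := zap.NewDevelopmentConfig()
	cfg.Level = zap.NewAtomicLevelAt(zapcore.ErrorLevel)
	log, err := cfg.Build()
	if err != nil {
		panic(err)
	}
	defer log.Sync()

	// This line never appears: INFO is below the configured error level.
	log.Info("downloaded", zap.String("Action", "GET"))

	// Only failures like this one make it into the log.
	log.Error("download failed", zap.String("Action", "GET"), zap.String("error", "use of closed network connection"))
}
```

Setting the level back to info brings the successful transfer lines back.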

2 Likes

I hope you know that the upload and download terms are from the client’s perspective. What shows as an upload is actually a download for the SNO, and what shows as a download is an upload from the SNO. The client is uploading to your node and downloading from your node.
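
As a quick illustration of that mapping (the helper name directionForAction and the wording of the descriptions are made up for this example, not storagenode code), the Action field in the log translates to the node’s side of the transfer roughly like this:

```go
package main

import "fmt"

// directionForAction is a hypothetical helper: it describes a log "Action"
// value from the storage node operator's point of view.
func directionForAction(action string) string {
	switch action {
	case "GET":
		// The client downloads a piece, so the node reads it and sends it out (egress).
		return "customer download -> node sends data"
	case "PUT":
		// The client uploads a piece, so the node receives it and stores it (ingress).
		return "customer upload -> node receives and stores data"
	case "GET_AUDIT", "GET_REPAIR":
		// Satellite-initiated reads used for audits and repair traffic.
		return "satellite read -> node sends data"
	default:
		return "unknown action"
	}
}

func main() {
	for _, a := range []string{"GET", "PUT", "GET_AUDIT"} {
		fmt.Printf("%-9s => %s\n", a, directionForAction(a))
	}
}
```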

1 Like