BUG? Multiple calls to one piece

lsof /opt/storj/data | awk '{print $9}' | sort | uniq -c
      1 NAME
      2 /opt/storj/data/revocations.db
      3 /opt/storj/data/storage/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/7s/fqenepu2wj2mox6slrskkawt5iz6if6aahfgtcdtqyt2oi34qq.sj1
     13 /opt/storj/data/storage/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/mf/zjd6bz5dfmu6fdqxzmrwujj4yp6bx2vvzweg2pticmdkj6h6ua.sj1
      2 /opt/storj/data/storage/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/ua/xzguq5duxziownu4xoa4tusrrhe55z6wx7f2emkjuk34bhtb5a.sj1
      4 /opt/storj/data/storage/trash/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/2024-11-04/3s

From what I’ve observed on the latest storage node versions, the storj process has, for some reason, started reading the same file multiple times per request, anywhere from 1 to 20 times instead of just once. This is clearly visible in the ZFS statistics, where MFU hits within the ARC/L2ARC caches rise to almost 100%.
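
For reference, on ZFS-on-Linux those cache counters can be checked directly; a rough sketch (assumes the zfsutils tools are installed, adjust to your platform):

# one-shot snapshot of the ARC/L2ARC hit counters
grep -E '^(hits|misses|mfu_hits|mru_hits|l2_hits|l2_misses) ' /proc/spl/kstat/zfs/arcstats
# or a rolling view, one line every 5 seconds
arcstat 5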

1 Like

Reading a piece file 20 times for one request would indeed be a major problem. Do you have any additional corroborating info you can provide, ideally something like strace output (e.g. strace -p $STORAGENODE_PID -tt -f -o outputfile) paired with logs?

Note that the lsof output is not necessarily helpful, as having multiple handles open to a file (across all processes) is not the same as one process reading that file many times in succession.
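
If you do capture that strace output, a rough way to count how many times each piece file is actually opened (rather than how many handles happen to be open at one instant, which is all lsof shows) could look like this; just a sketch, assuming the output file is named outputfile as in the example above:

# count openat() calls per .sj1 piece file in the strace output
grep 'openat(' outputfile | grep -o '"[^"]*\.sj1"' | sort | uniq -c | sort -rn | head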

2 Likes

Are you sure you do not simply have multiple concurrent requests for the same file? I do see this kind of query “storm” from time to time, when a customer fetches the same file dozens or hundreds of times in a very short time span.
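
One way to check that from the node side is to count successful downloads per piece ID in the logs; a sketch for a docker setup with a container named storj (adjust to your own logging):

# top pieces by number of successful GET downloads
docker logs storj 2>&1 | grep downloaded | grep '"Action": "GET"' | grep -oE '"Piece ID": "[A-Z0-9]+"' | sort | uniq -c | sort -rn | head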

2 Likes

I am seeing the same thing: a storm of reads for the same piece, with multiple sats making the request at the same time. I also see a lot of errors saying the size mismatches, but looking at the message, the sizes are the same. This started with the latest version upgrade.

“Action”: “GET”, “Offset”: 0, “Size”: 12544, “Remote Address”: “79.127.226.105:60914”, “reason”: “downloaded size (0 bytes) does not match received message size (12544 bytes)”}

I have seen a significant rise in egress traffic in my router stats over the last few days, and the number of firewall states has skyrocketed. I was thinking this is because of a new customer with a high-egress use case, but who knows… :thinking:

1 Like

how can multiple satellites access the same piece?

3 Likes

You are correct. Another piece request slipped into the storm, and if I wasn’t running on three hours of sleep this morning I would have realized that is impossible.

Meaning that your node was unable to deliver even that small amount of data before cancellation. Your node lost the race to faster nodes.

I experienced the same behavior recently. For example, in the logs I can find successful downloads for piece FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ more than 100 times within a single minute. A couple of these log lines for reference:

2024-11-08T13:48:35Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:40948"}
2024-11-08T13:48:35Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:40902"}
2024-11-08T13:48:35Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "79.127.219.43:47940"}
2024-11-08T13:48:35Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "79.127.219.43:47516"}
2024-11-08T13:48:35Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "79.127.219.43:47968"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:41038"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:41090"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:41016"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:41068"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:41054"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:41106"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:41190"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "79.127.219.43:47988"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:41386"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:41268"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:41224"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:41238"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:41202"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "79.127.219.43:48014"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "79.127.219.43:48032"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:41330"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:41434"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:41372"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "109.61.92.86:41336"}
2024-11-08T13:48:36Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "79.127.219.43:48058"}
2024-11-08T13:48:37Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "79.127.219.43:48088"}
2024-11-08T13:48:37Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "79.127.219.43:48114"}
2024-11-08T13:48:37Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "79.127.219.43:48024"}
2024-11-08T13:48:37Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "79.127.219.43:48106"}
2024-11-08T13:48:37Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "79.127.219.43:44062"}
2024-11-08T13:48:37Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "79.127.219.43:48230"}
2024-11-08T13:48:37Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "79.127.219.43:47850"}
2024-11-08T13:48:38Z    INFO    piecestore      downloaded      {"Process": "storagenode", "Piece ID": "FEF6AZW2GBMVAORPXJC7H6VKTHNPXX5ARDYUOF2I6P743QXOSFXQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 14080, "Remote Address": "79.127.219.43:44112"}

The remote addresses vary, so these seem to be separate requests. Does this indeed mean the customer just downloaded the same file many, many times?

Edit: I found another example where a particular piece was successfully downloaded almost 60,000 times!

docker logs storj 2>&1 | grep "downloaded" | grep "6AIOWIHYKJXT5AL3ZFW5OXT6MO7XTDLC7KJ3FVQYYFLOMWUW7F6A" | wc -l
58387
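
To see how many distinct clients were behind those requests, the remote addresses can be pulled out of the same log lines; again just a sketch for a docker setup (the final cut assumes IPv4 addresses):

# unique client IPs that downloaded this piece, with request counts
docker logs storj 2>&1 | grep "6AIOWIHYKJXT5AL3ZFW5OXT6MO7XTDLC7KJ3FVQYYFLOMWUW7F6A" | grep -o '"Remote Address": "[^"]*"' | awk -F'"' '{print $4}' | cut -d: -f1 | sort | uniq -c | sort -rn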

That is possible. If a customer uses Storj to publicly distribute popular files, you will see many requests from many different locations.

3 Likes

My logs show that those parallel downloads are for the same piece, but with different parts/offsets.
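
For anyone who wants to check the same thing on their node, the distinct offsets requested for a given piece can be pulled out of the logs like this; a sketch for a docker setup, with <PIECE_ID> as a placeholder for the piece in question:

# distinct offsets requested for one piece, with how often each was hit
docker logs storj 2>&1 | grep "<PIECE_ID>" | grep -oE '"Offset": [0-9]+' | sort -n -k2 | uniq -c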

From what I gather from the engineers looking into some issues, there appears to be a large number of download requests that may be streaming requests. The cause isn’t clear to me right now. It could be a bunch of simultaneous requests coming in at once, or it could be a bunch of backed-up requests being released at once.

What I can tell you is that it is all hands on deck triaging the situation and working on solutions. At some point, when we have more information, we’ll inform the SNOs with further details. It is rather fluid at the moment.

2 Likes

Just to be clear, getting lots of requests for one piece in a short amount of time is not necessarily an error condition. Rather, it’s nearly the ideal situation for a node: your node is likely to have the piece contents cached, plus you will get paid for every request, if your node wins the race. And with many requests going at the same time, even a potato node is likely to win a fair number of races.

That said, @Knowledge is right that there is unusual activity going on on our end, which may be contributing to this effect you are seeing.

8 Likes

I think you have this the wrong way around…

“download” from the perspective of the node/logs is when the node downloads a piece and stores it, right?

…this is upload.

And I’m pretty sure it isn’t synthetic data: I GE’d from saltlake about a month and a half ago…

THAT makes sense! :slight_smile:

Well, on the upside… it’s reads, not writes, right?
…my SSD can do that until we’re both dead of old age :smiley:

No. All wording is from the customer’s point of view. So downloads are egress traffic from your node, and uploads are ingress traffic to your node.

Depending on the direction of the traffic: if it is egress traffic (downloads), then these requests are reads; if it is ingress traffic (uploads), then these requests are writes.
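
A quick way to see that split in your own logs is to tally the actions, since the GET* actions are egress (reads) and the PUT* actions are ingress (writes); a sketch for a docker setup:

# count requests by action type (GET* = egress/reads, PUT* = ingress/writes)
docker logs storj 2>&1 | grep -oE '"Action": "[A-Z_]+"' | sort | uniq -c | sort -rn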

1 Like