Nodes on the same machine with different ingress

I have 3 nodes running on a single Synology. 2 of them share the same disk; the other is on a different disk. Don’t ask why, it’s a legacy setup.
Ever since all the nodes were vetted, they have been getting the same ingress.
Since about a week ago, one of the nodes (the oldest one) has been getting almost 3 times the ingress of each of the other 2.
Also, the ingress of a single node in another location used to correlate with these 3 nodes: it used to be the same as the 3 nodes combined. It used to make sense. Now it doesn’t correlate in any way. And neither site has any Storj neighbours.
Can someone explain this?

hey @humbfig,
could you provide some details about your nodes, like:

  • IDs of the affected nodes
  • whether the ingress is different for all satellites or only for one specific satellite
  • any additional details you think are important

Node IDs and other sensitive data can be sent by DM or by creating a support ticket. Thanks in advance.

1 Like

Here is a fact table. This is about ingress only. Apparently, things stopped making sense on the 15th of July, and only for the US1 satellite.

Expected behaviour:
i) Node 4 = Node 1 + Node 2 + Node 3
ii) Node 1 = Node 2 = Node 3

The expected behaviour held ever since all 4 nodes were vetted.

Apparently, what doesn’t make sense is nodes 2 and 3 getting so little download, because if nodes 2 and 3 were getting the same download as node 1 (as they should), then expected behaviour (i) would hold.
PS- One could also notice that on EU1, the ingress on node 1 has been somewhat higher than on the other two “parallel” nodes (2 and 3) since the 15th. Node 1 being the only one running v1.81.3, it makes you wonder…

Other info: Node 1 is the only one running v1.81.3. All other nodes are running v1.82.1. I can’t say when each node was upgraded.

Node IDs will be DMed. I didn’t know that was sensitive information…

I also have this issue. Out of 22 nodes, only the ones on v1.81.3 have identical ingress; the others, on v1.82.1, show lower ingress.

1 Like

1.82.1 fixed an ingress reporting issue. storagenode/piecestore: fix ingress graph skewed by larger signed orders · storj/storj@b6026b9 · GitHub
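For intuition on that fix (this is just a hedged reading of the commit title, not the actual Storj code): clients sign orders whose limit can be larger than the data they actually send, and if the ingress graph sums the signed order amounts instead of the bytes actually transferred, it over-reports. A toy Python sketch with made-up numbers:

```python
# Hypothetical illustration of "ingress graph skewed by larger signed orders".
# All numbers are made up; this is not the storagenode code.
uploads = [
    {"order_limit": 2_319_872, "transferred": 181_504},
    {"order_limit": 2_319_872, "transferred": 2_319_872},
    {"order_limit": 2_319_872, "transferred": 362_752},
]

skewed_ingress = sum(u["order_limit"] for u in uploads)   # what a buggy graph would sum
actual_ingress = sum(u["transferred"] for u in uploads)   # what the node really received

print(f"skewed: {skewed_ingress:,} bytes")
print(f"actual: {actual_ingress:,} bytes")
```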

1 Like

That would settle it…
I just have to wait until node1 upgrades to v1.82.1 for the world to start making sense again…

PS- Actually, no. Node 4 is on v1.82.1, so that wouldn’t explain expected behaviour (i)… at least if node 4 was upgraded to v1.82.1 before the 18th, which I think it was…
If the problem is that v1.81.3 reports higher ingress than it should, then the expected behaviour should be:
i) node 2 = node 3 (check!)
ii) node 4 = 3 × node 2 (or node 3) (no check!)

I imagine nodes 2 and 3 are on the same disk. Is the disk OK? No bad sectors, no errors, etc.? Are the databases OK?

I also wonder if there might be any performance differences between your nodes.

You can try comparing just the number of upload attempts: satellites consider each piece equal for node selection, regardless of piece size. Looking only at the number of times your node was selected as an upload target reduces the variance of the comparison, but most importantly here it removes the impact of failed uploads.

You can either parse your node logs for the number of upload attempts, or, if you have debugging enabled, look at the upload_started_count counter (side note: it would be nice if this counter were per satellite, and even better if it were a rate meter!).
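If parsing logs, here is a minimal Python sketch of the idea, assuming the default log format where each attempt produces a line containing "upload started" with a "Satellite ID" field (adjust the patterns if your log lines differ):

```python
import re
import sys
from collections import Counter

# Count upload attempts per satellite in a storagenode log file.
# Assumes each attempt is logged as a line containing "upload started"
# with a payload that includes "Satellite ID": "<id>".
SAT_ID = re.compile(r'"Satellite ID":\s*"([^"]+)"')

counts = Counter()
with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
    for line in log:
        if "upload started" in line:
            match = SAT_ID.search(line)
            counts[match.group(1) if match else "unknown"] += 1

for satellite, attempts in counts.most_common():
    print(f"{satellite}\t{attempts}")
```

Export each node’s log for the same time window to a file and run the script on each one; comparable attempt counts would point the difference at failed/cancelled transfers or piece sizes rather than node selection.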

If these numbers turn out to be equal, then we’ll know the difference comes from failed/cancelled uploads.

That would explain a lot. But no. Nodes 1 and 2 are on the same disk. Node 3 is alone on a disk that serves no other purpose. Not a single bad sector on any of the disks, ever.

Well, you tell me. I see no performance difference. And anyway, before July 15th, the world made sense…

[screenshots for node1, node2, node3 and node4]

Also, I don’t know how to get some of the numbers you point out. Debug is not enabled.

1 Like

Indeed, this looks a bit weird. I can’t help more than that, sorry, but this should probably be useful to Storj engineers.

Do you mean uploads to the nodes (ingress) or downloads from the nodes (egress)?
We always use the customer’s point of view, so this needs clarification.

From the first post I gather that he refers to ingress.

I hope so. Because downloads = egress and egress can vary.

I’ll ask the obvious question: are any of them full?
Guessing that the answer is no, I’d stop all of them, rm the containers, restart the machines, update all the packages and DSM, manually update the storagenode image on all of them, then start the nodes, just to get them on the same version and the software up to date. The latest DSM and Docker are working perfectly with Storj, so no worries.

I mean ingress!
This POV thing is very confusing…

1 Like

None of the nodes are full. DSM is on the latest version. Packages, including Docker, are all updated. Node 1 updated to v1.82.1 last night, so all nodes are now on the same version.

PS- I’ll be offline doing IRL stuff until Monday or Tuesday. I’ll post my new ingress statistics when I get back.

Hi!
Just to say that the world makes sense again.
In the meantime all the nodes were on the same version (not anymore, now) and the numbers started to add up.
Thank you all for your help.

3 Likes