I have several nodes but one sucks

Hello there,

I have been operating nodes for a long time, then added more of them on different public /24 subnets. I use Zabbix to monitor their growth and noticed that there are three groups.

The big performers, growing from 15 to 20 GB per day (average over the last 30 days)
The average performers, 6 to 9 GB per day
And the bad one, 2 GB per day

It has very good stats, like the others. The online scores on the different satellites range from 99.78% to 99.97%.

The neighbors check tells me that it is alone.
Pingdom tells me that it pings correctly with TCP and QUIC. Its latency is a little bit higher than the others':

  • started
  • via: St.Petersburg, Russia
  • QUIC: dialed node in 238ms
  • QUIC: pinged node in 131ms
  • QUIC: total: 369ms
  • TCP: dialed node in 392ms
  • TCP: pinged node in 130ms
  • TCP: total: 521ms
  • via: France (proxy, no QUIC)
  • TCP: dialed node in 400ms
  • TCP: pinged node in 122ms
  • TCP: total: 522ms
  • done.

But that's the only difference :man_shrugging:
Do you have any idea what's wrong with it?

Welcome to our club :slightly_smiling_face:
https://forum.storj.io/t/4-nodes-on-one-server-but-one-node-is-getting-50-of-the-data-the-other-nodes-are-getting/28230

How are your traffic differences distributed among the satellites? Some of the nodes I mentioned in the topic above have the same traffic from the US satellite, but very different traffic (7.5 times) from EU.

Oh, I hadn't seen this topic!

On the bad node, I don’t see the ingress, but storage last month is
total 1.15TBm
saltlake 156.29GBm (13.5%)
ap1 16GBm (1.4%)
us1 0.77TBm (66.9%)
eu1 202.64GBm (17.5%)

On a good node
total 5.83TBm
saltlake 543.7GBm (9.3%)
ap1 88.53GBm (1.5%)
us1 3.78TBm (64.8%)
eu1 1.42TBm (24.3%)

I made this script a long time ago: How to check vetting state from API (PowerShell), which gives me:

Satellite                   totalCount onlineCount Good    
---------                   ---------- ----------- ----    
saltlake.tardigrade.io:7777       7378        7366 Not good
ap1.storj.io:7777                 8957        8943 Not good
us1.storj.io:7777                 7915        7902 Not good
eu1.storj.io:7777                 8781        8777 Not good


totalCount : 33031
onlineCount : 32988

"Not good", but the stats still seem nice: the success rate is 99.86%.
Another good node actually has a lower success rate, 99.69%.
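The "success rate" discussed here is just onlineCount / totalCount. A minimal sketch of that calculation, reusing the counts from the script output above (the original PowerShell script's "Good" verdict and its threshold are not reproduced, since they are not shown here):

```python
# Recompute the per-satellite online rate from the audit counts quoted above.
# The counts come from the vetting script's output; only the percentage
# calculation is shown, not the script's "Good"/"Not good" logic.

def online_rate(total_count: int, online_count: int) -> float:
    """Online rate as a percentage of audits the node answered."""
    return 100.0 * online_count / total_count

satellites = {
    "saltlake.tardigrade.io:7777": (7378, 7366),
    "ap1.storj.io:7777":           (8957, 8943),
    "us1.storj.io:7777":           (7915, 7902),
    "eu1.storj.io:7777":           (8781, 8777),
}

for name, (total, online) in satellites.items():
    print(f"{name:30s} {online_rate(total, online):6.2f}%")
```

All four satellites come out well above 99%, which is why the scores alone don't explain the ingress gap.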

PS: I just saw how to look at ingress per satellite.

On the bad node
total 0.86TB
saltlake 1.68GB
ap1 31.57GB
us1 463.74GB
eu1 362.11GB

On a good node
total 2.28TB
saltlake 1.65GB
ap1 60.08GB
us1 1.23TB
eu1 0.99TB
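The bad/good ingress ratio per satellite can be computed quickly from the numbers above (assuming decimal units, 1 TB = 1000 GB):

```python
# Quick ratio check of the per-satellite ingress numbers above
# (bad node vs. good node). Assumes decimal units: 1 TB = 1000 GB.

ingress_gb = {
    # satellite: (bad node, good node), both in GB
    "saltlake": (1.68, 1.65),
    "ap1":      (31.57, 60.08),
    "us1":      (463.74, 1230.0),
    "eu1":      (362.11, 990.0),
}

ratios = {sat: bad / good for sat, (bad, good) in ingress_gb.items()}
for sat, r in ratios.items():
    print(f"{sat:9s} bad/good = {r:.2f}")
```

Saltlake comes out roughly 1:1, while us1 and eu1 are both around 1:3, so the gap is concentrated on the customer-facing satellites.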

If the slow one is in Russia, the location is likely the issue. Some customers may exclude Russia and other countries. Also, Russian IPs may be under sanctions and you won’t get paid for those nodes if they are.

Most of my nodes are in the UK, but this one is in Spain. It's the only difference; I don't think it would change anything.

Interesting, do many customers really not want to store data in Russia?

In my case, two absolutely identical nodes running on the same physical server and sharing the same IP address (in the EU) have had a 7.5:1 ingress ratio from the EU satellite during the two months since the SLC tests: https://forum.storj.io/t/4-nodes-on-one-server-but-one-node-is-getting-50-of-the-data-the-other-nodes-are-getting/28230/5?u=disasoft
The author of that topic has the same situation in GB.

I would guess that the slower nodes have more long-tail cancellations than the fast ones. You could parse the log and look for trends. I think that over a long enough time period, if all else is equal, they will eventually balance out. If not, there is likely something amiss causing more cancellations on one node than on the other(s).
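A rough sketch of that log parsing, assuming the default storagenode text log format in which completed and cancelled transfers are logged as "uploaded" / "upload canceled" and their download equivalents (the sample lines below are synthetic):

```python
# Count completed vs. cancelled transfers in a storagenode log to spot
# long-tail cancellation trends. The log format is an assumption based
# on the default text logger; the sample lines are made up.

from collections import Counter

SAMPLE_LOG = """\
2024-05-01T10:00:01Z INFO piecestore uploaded {"Piece ID": "AAA"}
2024-05-01T10:00:02Z INFO piecestore upload canceled {"Piece ID": "BBB"}
2024-05-01T10:00:03Z INFO piecestore uploaded {"Piece ID": "CCC"}
2024-05-01T10:00:04Z INFO piecestore downloaded {"Piece ID": "DDD"}
2024-05-01T10:00:05Z INFO piecestore download canceled {"Piece ID": "EEE"}
"""

def count_transfers(lines):
    c = Counter()
    for line in lines:
        if "upload canceled" in line:
            c["upload_canceled"] += 1
        elif "uploaded" in line:
            c["uploaded"] += 1
        elif "download canceled" in line:
            c["download_canceled"] += 1
        elif "downloaded" in line:
            c["downloaded"] += 1
    return c

counts = count_transfers(SAMPLE_LOG.splitlines())
total_up = counts["uploaded"] + counts["upload_canceled"]
print(f"upload cancel rate: {counts['upload_canceled'] / total_up:.0%}")
```

Running the same counts over each node's log for the same time window would show whether the slow node really loses more races.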

You can look at cancellations here.

The bad node receives only about a third of the ingress from the EU satellite, while it receives essentially the same amount from Saltlake. For some reason, the EU satellite hates this node.

It’s on the same hardware, IP, and everything else. The physical disk is even the same model (but a different actual disk). It has never sorted itself out.

The traffic comes from the customers directly, not from the satellite; the satellite doesn't proxy their traffic.

I do not think so, unless the customers reported a low success rate for it.
