Bandwidth utilization comparison thread

ah right… maybe it’s overloaded again…

last time it was like months behind with some of the data… i suppose that might be related…

interesting that more people see the spike tho…

@TheMightyGreek
it's not ingress, it's data stored…
the graph claims i had an additional 5tb of disk space used yesterday…

yes my phrasing wasn’t very clear

My graphs look very much like yours, saltlake all over the place and europe north goes up and down but nothing extreme.

well haven't really looked at it too much… was keeping an eye on it in the beginning… but then i noticed how all over the place it was and i just sort of ignored it until now…

but i suppose it might just be the stress on some of the satellites that is doing it…

anyways just wanted to prove my point about why it's almost a useless graph currently, now that there was the chance… and why it shouldn't be used for anything… at least until it works correctly…

and i got curious to know, if it was a local or a network thing…

This is correct.
But there is one caveat with new nodes. For every upload the satellite selects a number of vetted nodes as well as a much smaller number of unvetted nodes. Both of those processes use the same method except for the vetted/unvetted differentiation. Afterwards, if there is any overlap of subnets between those two selections, one of the nodes gets dropped.

If your subnet has vetted as well as unvetted nodes, you have a chance to get selected in either of those two processes, leading to a very slight increase of total traffic until the node is vetted. Luckily this increase is so small that it's really not worth trying to cheat this system, especially since that would only work until the other node is vetted.
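A rough sketch of how I read that selection logic, with hypothetical types and names (this is not the actual satellite code, just an illustration of the subnet-overlap rule described above):

```go
package main

import "fmt"

// Node is a simplified stand-in for a storage node record (hypothetical type,
// not the real satellite schema).
type Node struct {
	ID     string
	Subnet string // the /24 subnet the node lives in
	Vetted bool
}

// selectForUpload sketches the described behaviour: the vetted and unvetted
// picks happen separately, and whenever both picks share a subnet one of the
// two nodes gets dropped. Which one "wins" is an assumption here.
func selectForUpload(vettedPick, unvettedPick []Node) []Node {
	seen := make(map[string]bool)
	var result []Node
	for _, n := range append(vettedPick, unvettedPick...) {
		if seen[n.Subnet] {
			continue // subnet overlap: one of the two nodes gets dropped
		}
		seen[n.Subnet] = true
		result = append(result, n)
	}
	return result
}

func main() {
	vetted := []Node{{ID: "A", Subnet: "203.0.113.0/24", Vetted: true}}
	unvetted := []Node{{ID: "B", Subnet: "203.0.113.0/24"}, {ID: "C", Subnet: "198.51.100.0/24"}}
	fmt.Println(selectForUpload(vetted, unvetted)) // B is dropped, A and C remain
}
```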

19Jul2020:
Node 1: [screenshot]

Node 2: [screenshot]

here are the numbers for the 19th

170GB of ingress isn't too bad! haha

@TheMightyGreek still ahead of the pack… and it’s 180gb ingress for you and 158-160 for the rest of us…

testing new text formatting - stolen from Alexey :upside_down_face:

| Date | IngressT [GB] | EgressT [GB] | StoredT [TB] | EgressT ‰ | EgressT [kB/s] | EgressT [kB/s/TB] | Ingress1 [GB] | Egress1 [GB] | Stored1 [GB] | Egress1 ‰ | Egress1 [kB/s] | Egress1 [kB/s/TB] | Ingress2 [GB] | Egress2 [GB] | Stored2 [GB] | Egress2 ‰ | Egress2 [kB/s] | Egress2 [kB/s/TB] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 15.07.2020 | 126.68 | 10.87 | 1.45 | 7.51 | 125.81 | 86.92 | 104.59 | 9.03 | 1230 | 7.34 | 104.51 | 84.97 | 22.09 | 1.84 | 217 | 8.46 | 21.30 | 97.93 |
| 16.07.2020 | 125.09 | 10.96 | 1.57 | 6.98 | 126.89 | 80.84 | 102.43 | 9.18 | 1330 | 6.90 | 106.28 | 79.91 | 22.66 | 1.78 | 240 | 7.43 | 20.60 | 85.96 |
| 17.07.2020 | 132.50 | 11.67 | 1.69 | 6.92 | 135.07 | 80.04 | 107.99 | 9.73 | 1420 | 6.85 | 112.62 | 79.31 | 24.51 | 1.94 | 268 | 7.25 | 22.45 | 83.92 |
| 18.07.2020 | 141.28 | 12.30 | 1.84 | 6.69 | 142.38 | 77.38 | 73.88 | 8.19 | 1500 | 5.46 | 94.81 | 63.21 | 67.40 | 4.11 | 340 | 12.09 | 47.57 | 139.91 |
| 19.07.2020 | 158.45 | 14.14 | 2.00 | 7.06 | 163.70 | 81.77 | 80.63 | 8.79 | 1580 | 5.57 | 101.78 | 64.42 | 77.82 | 5.35 | 422 | 12.68 | 61.92 | 146.73 |

I have also noticed the increase in ingress…
"T" is for total, "1" and "2" stand for node 1 and node 2
Wanted to point out that my 10 days older node has much smaller egress than the new one (you can compare it when you scroll to the right)
Theories:
1. More recent data is accessed more frequently?
2. That egress is "in fact" just verification/confirmation info for the data received, and it is no indication of how much egress will be generated when the node fills up
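In case anyone wants to reproduce the derived columns, this is how I read them from the numbers above (my own reconstruction, not necessarily the exact spreadsheet formulas): egress ‰ is the day's egress divided by the stored amount, the kB/s column spreads the day's egress over the 86400 seconds of the day, and kB/s/TB divides that rate by the stored TB.

```go
package main

import "fmt"

func main() {
	// Totals from the 15.07.2020 row above (GB and TB as in the table).
	ingressGB := 126.68 // not used in the derived columns, shown for context
	egressGB := 10.87
	storedTB := 1.45

	permille := egressGB / storedTB        // GB egress per TB stored = ‰ of stored data
	kbps := egressGB * 1e6 / 86400         // GB/day -> kB/s (1 GB = 1e6 kB, 86400 s/day)
	kbpsPerTB := kbps / storedTB           // rate normalised per TB stored

	fmt.Printf("ingress %.2f GB, egress %.2f ‰, %.2f kB/s, %.2f kB/s/TB\n",
		ingressGB, permille, kbps, kbpsPerTB)
	// Roughly matches the table row: ~7.5 ‰, ~125.8 kB/s, ~86.8 kB/s/TB.
}
```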

i don't like the kb/s
i mean what's the point… one will not use it to calculate space used, so what would one use it for… only thing i can come up with is how much internet bandwidth is used, because really the daily ingress can easily be divided down into the avg kb/s

so if we are only using it for comparing to internet bandwidth usage, then it would make more sense to simply write it in mbit/s so that people can more easily see if it exceeds their internet bandwidth.

then it at least has a purpose.

i think the model striker43 uses is much easier to read… even if he also uses the useless kb/s

i know people find a speed gauge very fascinating… i'm the same way, but the daily ingress is essentially the exact same number… it's a fixed avg over a preset time…
so if we are trying to streamline the information kept in the list, as there is plenty of useful information to write into such a list…

I personally think that it's better to put emphasis on the total numbers like in @shoofar's model, because at the end of the day that's what we use to figure out the egress permille.
As for the kB/s, it might be useful for people with very limited bandwidth to see how much headroom they have, but otherwise it's pretty much an aesthetic feature.
Also, putting it in mbps might be better as it's pretty much the standard for measuring internet speed.
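If the column were switched to Mbit/s, the conversion is trivial; here is a quick sketch using the 19.07 IngressT value from the table above (my own arithmetic, assuming a flat 24-hour average):

```go
package main

import "fmt"

func main() {
	ingressGBPerDay := 158.45 // IngressT for 19.07.2020 from the table above

	// GB/day -> average Mbit/s: 1 GB = 8000 Mbit, one day = 86400 s.
	avgMbit := ingressGBPerDay * 8000 / 86400
	fmt.Printf("%.1f GB/day is only about %.1f Mbit/s on average\n", ingressGBPerDay, avgMbit)
	// ~14.7 Mbit/s - easy to compare against an internet plan's downstream speed.
}
```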

I use it to compare my daily average from previous days to the current day, to see if it changes, and it is also an intermediate number used in the calculation of egress kB/s / TB.
I can also say that per each 100GB of ingress, our egress is likely to rise by X amount (which is very small, for now I hope).

Unfortunately this number is highly unreliable in such a comparison.
When I watch the ingress minute by minute, it has spikes of up to 5-9MB in one second and 0 kB/s for another batch of seconds.
Here is an example of quite "stable ingress" in one minute - each column is one second.

There are a lot of spikes up to ~5000kB and then, like a zebra, lots of zeros.
And this traffic is only from monitoring the port used by the storagenode.
[image: per-second ingress over one minute]
or as you can see here:
Max is 8200kB but most of the traffic is lower, giving an average of 1800kB for that minute.
[image: another minute of per-second ingress]

So it is, in my opinion, important to have at least 64+ Mbit downstream (to catch the spikes), unless we can do a test limiting the max downstream to, let's say, 30Mbit and see if this traffic gets evened out (something along the lines of the satellites keeping on sending data until our node gets all of the pieces), but I don't think it works that way.
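For what it's worth, the "64+ Mbit" figure lines up with the spikes shown above; a quick back-of-the-envelope check (my own arithmetic, assuming the ~8200kB really arrives within a single second):

```go
package main

import "fmt"

func main() {
	peakKBPerSecond := 8200.0 // the largest one-second spike mentioned above

	// kB/s -> Mbit/s: 1 kB = 8 kbit, 1000 kbit = 1 Mbit.
	peakMbit := peakKBPerSecond * 8 / 1000
	fmt.Printf("an %.0f kB spike in one second needs about %.1f Mbit/s downstream\n",
		peakKBPerSecond, peakMbit)
	// ~65.6 Mbit/s, which is why a slower line would clip the spikes
	// even though the daily average is far lower.
}
```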

The issue here is the frequency at which it reports the values. The time it takes to get those numbers increases with the satellite's size. As you might know, Saltlake and europe-north-1 are by far the two biggest satellites in the network. For the latter you can also see the graph fluctuate.
In terms of this graph specifically, I tend to agree that it's not as useful as it could be. In my mind, it made more sense to take the local disk space for this calculation and just "cross check" with the satellite.
To give this "bad graph" more weight to be fixed, I can recommend opening GitHub issues (https://github.com/storj/storj/issues)

yeah i usually take the total disk space used number, but that was where i think i caught the inconsistency between ingress and used…
it didn't directly correlate, however i will have to go check that… and i will do that…
hodl my beer

and on the graph note… well just smooth it out… make it display the avg instead, problem solved… ofc that doesn't solve the underlying issue, which may be a sign of something else being wrong… like say satellite workload, latency or simply too much cpu time required to keep up… i dunno… but just thinking out loud… if one averages the graph as a quick fix… then it wouldn't serve as an indicator for whatever is causing the problem…

so there is that…

going to post it on the trash comparison thread, because it’s more relevant to that…

It does not directly correlate, as explained here:

Do not focus on the math between ingress and data stored increase. It will always be off by the amount of data that was transferred but then canceled (storing partial pieces does not make sense).

20Jul2020:
Node 1: [screenshot]

Node 2: [screenshot]

Ingress went down a bit but still way above average.
Egress went up by 10%, which is pretty amazing!

@stefanbenten
it's kinda interesting that themightygeek's "disk space used this month" graph doesn't have the same inconsistency that my node and dragonhogans have…

it's like the more data a node has, the more the effect of whatever is causing it is amplified.

I mean that totally makes sense, due to normal deletes :upside_down_face:
Do not forget, it's not data that will live there forever. This is user data that can be deleted at any time. If you hold more data, the chance of piece deletion from your node is obviously higher!

heheh… well then somebody deleted 7TB on my 12TB node and uploaded it again before the day was over… yeah i don’t think that was how it went down…

oh yeah and we checked, it happens across multiple nodes… it seems that the more data the node has, the larger the swings created in the graph.

the highest-workload satellites also seem to be the most affected, not sure what that means aside from it being the satellites causing the deviations in the graph

also i can check my deletions for each day pretty easily, i put that into my log system as a feature :smiley:

just look at that… it goes from just below 250 to just below 400 TB*h

It seems like you are mixing a couple of things up here.
The TB*hour value does not come from your node directly.

We were talking about actual disk usage and ingress amounts before, which is what my information applies to.

In terms of the graph variance, this simply comes from the time it takes the bigger satellites to sum up the TB*hours. The more data a satellite handles, the longer the iteration over it takes.
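To make that concrete, here is a minimal sketch of the kind of sum a satellite has to run to produce the TB*hour value (hypothetical types, a heavy simplification of the real tally); the only point is that the work grows with the amount of data the satellite tracks:

```go
package main

import "fmt"

// usageInterval is a hypothetical record: how many TB a node held for how many hours.
type usageInterval struct {
	TB    float64
	Hours float64
}

// tbHours sums TB*hour over all tracked intervals; the more records a
// satellite has to iterate over, the longer this (and its real, database-backed
// equivalent) takes, which is why the big satellites report these values late.
func tbHours(intervals []usageInterval) float64 {
	var total float64
	for _, iv := range intervals {
		total += iv.TB * iv.Hours
	}
	return total
}

func main() {
	day := []usageInterval{{TB: 12.0, Hours: 12}, {TB: 12.3, Hours: 12}}
	fmt.Printf("%.1f TB*h for the day\n", tbHours(day)) // 291.6 TB*h
}
```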