Bandwidth utilization comparison thread

This is just the current implementation to ensure distribution. I’m sure it will be changed or improved at some point. There aren’t really any free options to get more than one IP, and most of the paid ones introduce extra hops, which isn’t exactly good for the node, as it will most likely be slower to respond. The exception is if you actually have multiple ISPs, which is probably also the most expensive solution.

As for the traffic, I wouldn’t worry about it; it’ll be back. You can still use these slower moments to move some nodes around to a more graceful setup, especially since downtime isn’t currently used to disqualify.

I don’t think the test data will stop; they will keep cycling it around as long as the network is alive, just to make sure everything works and no data is lost…

Yes, ingress is unstable… my node is nearly 5 months old and I just passed 13 TB, but it got a good kick when I spun it up, since that was just after the reset and during some sort of stress test…

The average has been about 2 TB per month of ingress lately… but there have been some issues, and they are most likely testing the network: uploading and then stopping, to collect data on how the network behaves under customer-like utilization…

Or they are simply having issues with their own bandwidth; it’s a lot of data they need to move… generate, analyze, verify for data errors, and collect network statistics on…

Of course, another issue with ingress is that if just 5% of operators ran 10 subnets each, they would be able to soak up about 1/3 of the total data allocated for everybody…
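A quick sanity check of that 1/3 figure, assuming ingress is split evenly per /24 subnet and using a made-up population of 100 operators:

```python
# Sketch of the claim above: ingress is assumed to be split evenly
# per /24 subnet, so operators with many subnets get many shares.
operators = 100                 # hypothetical population size
multi_ops = 0.05 * operators    # the 5% running 10 subnets each
single_ops = 0.95 * operators   # everyone else on a single subnet

multi_subnets = multi_ops * 10
total_subnets = multi_subnets + single_ops

share = multi_subnets / total_subnets
print(f"ingress share of the 5%: {share:.1%}")  # ~34.5%, i.e. about 1/3
```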

So really, the per-subnet allocation is going to bite us all eventually; it just becomes a subnet arms race that is pointless and wasteful. Sure, a redundant internet connection might be useful… but aside from that,

it’s bad for Storj, it’s bad for the regular SNO, it’s bad for data reliability… and it gives diminishing returns even to the SNOs that run 10 × /24s.
Here, at least, it would be maybe 4 times as expensive for me if I wanted a setup with 10 different subnets / internet connections,
and right now that cost is about the same as my power usage, which I just barely break even on if I count the held amount…

So at 2 or 3 × my current ingress and egress, with only the internet as added cost, I would be well into the positive, easily… but why would I do that when it’s not allowed according to the Terms of Service, and I doubt it ever will be, given the detrimental effect on the network…

The only issue is how to prevent it…

Whoooo Hooo!!! Man!!!
CONGRATZ!!!
And your numbers are nice too!!
38.5 kb/s per TB stored…
WOW!!!
I am getting just 25… but my node is only like a month old now :open_mouth: time flies when you learn new things about Linux, haha

My numbers:

| Date | Ingress [GB] | Egress [GB] | Stored [GB] | Egress [‰ of stored] | Egress [kB/s] | Egress [kB/s per TB] |
| --- | --- | --- | --- | --- | --- | --- |
| 25.07.2020 | 9.40 | 3.67 | 1 842 | 1.99 | 42.49 | 23.07 |
| 26.07.2020 | 14.43 | 4.00 | 1 850 | 2.16 | 46.35 | 25.05 |
| 27.07.2020 | 13.41 | 4.03 | 1 870 | 2.15 | 46.60 | 24.92 |
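For anyone wanting to reproduce the derived columns, a minimal sketch of the arithmetic (the column handling is my own; results land within rounding of the table):

```python
# Recompute the derived columns from the daily totals.
# One day's egress in GB, spread over 86400 s, gives the average rate.
rows = [
    ("25.07.2020", 9.40, 3.67, 1842),
    ("26.07.2020", 14.43, 4.00, 1850),
    ("27.07.2020", 13.41, 4.03, 1870),
]
for date, ingress_gb, egress_gb, stored_gb in rows:
    egress_permille = egress_gb / stored_gb * 1000         # ‰ of stored data
    egress_kbps = egress_gb * 1e6 / 86400                  # kB/s averaged over the day
    egress_kbps_per_tb = egress_kbps / (stored_gb / 1000)  # normalized per TB stored
    print(f"{date}: {egress_permille:.2f} permille, {egress_kbps:.2f} kB/s, "
          f"{egress_kbps_per_tb:.2f} kB/s per TB")
```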

The first month’s numbers can be a bit weird… and as you saw with kevink’s node a while back, egress can be very chaotic.

Also, it seems there was a brief data spike for my node… so it’s not easy to compare with that… unless you compare the spikes. Usually I prefer to take 10-100 data points, drop the extremes, and calculate the average to get a reliable figure. Since we only have one data point per day, it would have to be tracked, calculated, and compared on a weekly if not monthly time scale to get an accurate idea of what the real-world numbers are for each SNO in a simple way.
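For what it’s worth, a minimal sketch of the kind of averaging I mean (sample values are invented):

```python
# Trimmed mean: sort the samples, drop the extremes at both ends,
# average what remains. Smooths out one-off spikes like the one above.
def trimmed_mean(samples, trim_fraction=0.2):
    s = sorted(samples)
    k = int(len(s) * trim_fraction)       # samples to drop at each end
    trimmed = s[k:len(s) - k] if len(s) > 2 * k else s
    return sum(trimmed) / len(trimmed)

daily_egress_gb = [3.7, 4.0, 4.0, 3.9, 12.5, 3.8, 4.1]  # invented week with one spike
print(f"{trimmed_mean(daily_egress_gb):.2f} GB/day")    # the 12.5 spike is discarded
```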

Maybe that’s an idea for my next summary… which I really should get working on… xD
I think I’ll do that for Aug 11th; then I can trash the first week of data, take a full month’s worth, and do a semi-useful egress analysis…

Yeah, it could be useful to have even the cheapest, slowest connection to switch to when the primary connection fails and one doesn’t want to get DQ’ed (I know that this feature is not enabled now).

Yeah, it is not useful, and it is potentially harmful and misleading during tests…
On the other hand, I can see how this idea could come into being: “the calculator showed someone sooooo optimistic numbers, now one tries to achieve them by any means that can be thought of” - another potentially harmful side effect of a too-optimistic calculator page :sleepy:

BTW, does anyone know where I can find a roadmap for Storj? I accidentally stumbled upon it in one of the posts but can’t find it again - there was info about an upgrade timetable - I just wanted to see if I will be around at that time so it won’t fail on me while I’m out of town…

Even a 25/10 Mbit connection would basically double profits. And what does internet at that speed cost? Looks like the lowest I can get is $30 for 300/60 Mbit, but my node already earns $40 while barely at my designated halfway mark. So I would double my earnings while only adding about 1/3 in monthly expenses, hardware excluded since it’s a more static cost.

Besides, power is usually the main cost when looking at years of usage… so why bother looking into the rest too much… of course HDDs alone don’t take a lot of power, and they are kind of expensive, so their power usage vs. initial purchase price works out a bit differently than for the full server/controller running the node.

But if we are comparing two new nodes on different subnets on the same server
vs. one internet connection, assuming all hardware is equal in the math…

So for $30 I could earn $10 more a month net… of course, now that I’m over half full with my current drives, there’s less overhead… but it still would have been nice if the node were full already.
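The back-of-the-envelope version of that trade-off, assuming a second subnet really would double traffic-driven earnings (a big assumption):

```python
# Rough sketch of the trade-off above (numbers from the posts, not exact).
current_earnings = 40.0   # $/month on one connection
extra_connection = 30.0   # $/month for a second ISP line (separate /24)

# Assuming a second subnet roughly doubles traffic-driven earnings:
new_earnings = 2 * current_earnings
net_gain = new_earnings - current_earnings - extra_connection
print(f"net gain: ${net_gain:.0f}/month")  # ~$10
```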

Of course my next upgrade will most likely be 4×14 TB drives or similar, because I need to expand the pool in a good way… but then I run into a lack of bays. It seems my old onboard SATA controller comes to the rescue and supports max-density HDDs even though it’s 10 years old. With that I might just barely make it work, especially if I buy an I/O accelerator card, moving the SSD L2ARC onto the PCIe bus.

Then make a new pool on the 4×14 TB… almost looks like a viable plan… I was sure I would need to get a disk shelf. It’s a good deal of money to put into HDDs in one go, but it will give me an additional 42 TB of actual storage capacity, plus I will be able to restructure my “old” pool, which I just completed like a month ago, into the new setup and thus gain more free space for little added risk.

I had done the math the wrong way… figured 3 drives would be fine, but then I went over the numbers again. Though 3 drives is a nice, very safe mark, the 4th drive is really the optimal solution, because of the 50% gain in usable capacity going from 3 to 4 drives with raidz1. Of course, when expanding the pool, one then ends up adding 4 drives at a time. I had originally looked at it the other way around, at how much space I lost to parity, which is only 33% on a 3-drive raidz1… but that’s irrelevant, since it’s really the usable pool size compared to the price of the setup that matters…

And thus adding 50% more space for 1/3 extra cost (3 drives + 1) is a good bargain…
Of course it costs 1/3 more each time one upgrades, and it makes everything 1/3 more difficult to work with…
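The capacity math, as a quick sketch (drive size taken from the 4×14 TB plan above):

```python
def usable_tb(n_drives, drive_tb):
    # raidz1 keeps one drive's worth of parity, whatever the vdev width
    return (n_drives - 1) * drive_tb

drive_tb = 14  # from the 4x14 TB plan above
for n in (3, 4):
    print(f"{n} drives: {usable_tb(n, drive_tb)} TB usable, "
          f"{(n - 1) / n:.0%} space efficiency")
# going 3 -> 4 drives: 28 -> 42 TB usable (+50%) for one extra drive (+33% cost)
```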

But since there are diminishing returns, I think I’m going to base my entire setup on 4-drive raidz1s until I get wiser, lol.

And doing 4-drive vdevs would give me 9 TB of extra free capacity out of my current setup, taking it to 33+42, so after my next upgrade I will be at 75 TB… almost seems too much of a jump… but I suppose I could look into renting space out to other services.
And that 75 TB is usable space, with redundancy already subtracted… the raw capacity would be 100 TB dedicated to projects such as Storj.

Hmm… that’s interesting, what you are saying about needing to grow in increments of 4 drives. Unless you are planning to grow write speed? (which is mostly unused with ingress capped at 150 GB/day, as that only translates to ~1.5-2 MB/s, which is a “sneeze” for a drive nowadays)

I was considering just the option that the first creation of a RAID needs 3 or 4 drives (RAID5 or RAID6).
Then you can add one disk at a time as needed, diluting the “lost space”.
I was planning (for example) to create one RAID5 starting with 3 disks and growing it to 6-8 disks in total as new space is needed.
But with a 120-150 GB/day cap it will take ages to fill up… (7×14 TB = 98 TB + 1 spare) would take ~653 days to fill at a constant 150 GB/day of ingress, which doesn’t happen, at least for now.
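Checking those numbers (a minimal sketch; the cap value is the one quoted above):

```python
# Checking the fill-time numbers above.
cap_gb_per_day = 150
print(cap_gb_per_day * 1e9 / 86400 / 1e6, "MB/s")       # ~1.74 MB/s average write

pool_tb = 7 * 14   # 7 data drives of 14 TB (+1 spare)
print(pool_tb * 1000 / cap_gb_per_day, "days to fill")  # ~653 days
```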

I have an idea about the big discrepancies between node sizes. Some “big” nodes (10+ TB) can also have an impact on the Storj network, especially given the idea that people “from the street” can join the network (which is the main idea of Storj).
The assumption that everyone gets the same amount of ingress is the culprit (imho) - meaning one small node can be filled in just a couple of days, while another (40+ TB) node will take months to fill.

Essentially it comes to this (for me): if ingress remains constant for all nodes, it can cause massive failures in the availability of free nodes, as some nodes (especially the smaller ones) will get filled pretty quickly, in ~10 days of such “massive” (in comparison to their size) ingress.

And I am aware that what I am saying is overly simplified (not taking into account the number of nodes).

I’m running raidz1, which is basically RAID5 with some extras. RAID5 is something one should avoid if at all possible for long-term storage… for a short time, like a year or two, it might be an okay solution, but it cannot be recommended long term, because it has only 1 redundant data point and no checksums, so it cannot distinguish which copy of the data is damaged. Also, RAID5 has the IOPS of roughly 1 drive, because all the drives in the array are in sync. When I run 3 raidz1 vdevs of 3 drives each (I want to move to 4), I get the IOPS of roughly 3 HDDs,
and IOPS are very important for random reads and writes. Sure, I also get faster sequential reads and writes; already now I’m at the point where I could nearly max out a 10 Gbit network connection…
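The rule of thumb behind those IOPS numbers, as a minimal sketch (the per-drive figure is a generic assumption, not a measurement):

```python
# Rule of thumb used above: each raidz vdev delivers roughly the random
# IOPS of a single drive, so pool IOPS scales with the number of vdevs.
hdd_iops = 150   # rough figure for a 7200 rpm drive (assumption)
vdevs = 3        # 3x raidz1 vdevs in the pool
print("approx pool random IOPS:", vdevs * hdd_iops)  # ~450 vs ~150 for one wide raid5
```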

I also run other stuff on my pool, which is why I want more IOPS. I don’t just want a pool I can use for Storj only; I want data storage I can run whatever I want on, with access to basically unlimited space… all my VMs’ virtual HDDs run off the pool, and it’s SSD write-cached.

14 TB is about where you get the most storage for the least price, and the point is to reduce my power consumption per TB stored. If I want the advantages of running multiple 4-drive raidz1s, then I have to add drives 4 at a time… sadly… but that’s a survivable trade-off for the advantages of ZFS.

Well, we have also seen weeks of 5 MB/s ingress… when those times come around, space quickly dwindles… but yeah, with 75 TB for Storj I might be past the point where the monthly deletions exceed ingress, and thus the node would stop growing… but who knows… there aren’t many single nodes with more space around…

One could run some type of filtering based on traceroute results from the satellite. While it could be pretty easy to buy IPs outside of a /24 subnet, maybe you can’t fool a full route as easily.
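Purely as a hypothetical sketch of how such grouping could work (nothing here is an actual satellite feature; the function and hop lists are made up):

```python
# Hypothetical sketch: two nodes whose routes share almost every hop
# probably sit behind the same physical uplink, whatever their /24s say.
def shared_route_fraction(hops_a, hops_b):
    shared = 0
    for a, b in zip(hops_a, hops_b):
        if a != b:
            break
        shared += 1
    return shared / max(len(hops_a), len(hops_b))

# Made-up hop lists, as a satellite-side traceroute might return them:
node1 = ["10.0.0.1", "80.1.2.3", "80.1.9.9", "203.0.113.5"]
node2 = ["10.0.0.1", "80.1.2.3", "80.1.9.9", "198.51.100.7"]
if shared_route_fraction(node1, node2) > 0.7:
    print("likely same uplink -> group together for ingress selection")
```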

Uh, and maybe use that Google internet algorithm that maps networks for route caching or whatever… pretty sure I have that running on my server to optimize my network traffic, though I’m not sure if it actually works or is just eating all my RAM… lol

8 GB today. I wonder if this is zero test data or just reduced test data. It’s nice to see what actual Storj network activity might look like.

Looks like europe-north is still pulling data from my node, but ingress is close to zero. Stefan-benten is inactive, and I’m only getting repair ingress from saltlake. I think those are the only test satellites; the other 3 are for customer data, so that would be normal traffic.
Let’s just say the numbers without test data are pretty pitiful for the moment…
I hope Storj manages to get some decent customers in the future; otherwise they’ll have to keep generating test traffic to keep SNOs happy.

Nice, Mark!
You are the winner in today’s numbers lottery :slight_smile:
Your egress is awesome compared to mine, even though you have 2 times less data than me, haha.
Your egress average per stored TB is a whopping 58 kB/s!!
I have a 2× bigger node and 2× less egress, haha.

Can I ask you a question? Why is your bandwidth utilization chart so spiky? Mine would be almost evenly distributed if I didn’t play with fire on those valley days…

Yeah, me too… but I think it’s not test data; I mean, this seemed to start after the production launch… dunno about before that, I wasn’t around for more than 1½ or so… but after production it started to cycle…

And how better to keep test data running, for monitoring the nodes and data reliability, while still keeping an eye on the natural behavior of the network traffic?

But I could be wrong twelve ways to Sunday. Right, time for the screen limbo.

@shoofar Mark’s node might not be fully vetted yet or something, since his bandwidth usage looks kind of weird… but unvetted nodes look vastly different, if memory serves…

Oh, or his node is full - only 30 GB free… so it’s most likely that, which caps off the bandwidth usage.

Yeah, maybe Mark was stretching the available space 3 times - that’s why he has 3 spikes, each up to 120 GB - meaning he got a full day of transfer 3 times.
Or he got that many deletes :open_mouth:

Anyway, awesome numbers for you too - your egress grew from 38 to 39.7, cool!

Well, if something got deleted, there would be room for ingress again… for a little while… which would be my guess…

I stopped rebooting my server again… running a few experiments to see if uptime has an effect on anything… people say no, but it’s a complex system; such things aren’t easy to estimate. Also, my network algorithm, L2ARC, and such improve system and network performance over time, and once the node finishes all its boot work my server gets into an optimal period… but what about long term? Well, my L2ARC takes like a month or two to really finish training, if it ever finishes… but after that, improvements will be few and far between…

And I assume the same holds true for the network algorithm I’m running, but I really should get it upgraded… since it’s apparently only the second best, but it was the only one I could get to work when tinkering with network settings.

Of course, all that may not mean anything at all… but then again, it might still give a significant edge… difficult to say. I’ll do an egress analysis on the 11th-12th or so; then we’ll have a month of data in the thread, if I trash the first week, which is kind of terrible data.

I change my storage settings sometimes. Some days my node was “full”, and some days I increase storage and allow more ingress. My hard drive is only 2 TB and I want to save some room for later. When I posted screenshots, it had free space available for the entire 24 hours, so ingress should be accurate for that day. Since the test data has slowed down, I’ll probably give the node some more storage so it can continue to fill without interruption. Edit: I can only give it another 500 GB and still have 10% free space. I think that might last about a month with the current amount of usage and repair coming in. It’s difficult to factor in the effect of deletes since they aren’t listed on the dashboard. I guess I can just track the disk “used” or “free” space each day, see how they change, and extrapolate.
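If it helps, a minimal sketch of that extrapolation (the daily snapshots are made-up numbers):

```python
# Extrapolate when the node runs out of space from daily free-space readings.
free_gb_by_day = [520, 505, 492, 480]   # made-up daily snapshots

daily_deltas = [a - b for a, b in zip(free_gb_by_day, free_gb_by_day[1:])]
net_fill_rate = sum(daily_deltas) / len(daily_deltas)  # GB/day, ingress minus deletes
days_left = free_gb_by_day[-1] / net_fill_rate
print(f"net fill rate: {net_fill_rate:.1f} GB/day, full in ~{days_left:.0f} days")
```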

A pretty large drop-off in ingress over the past week or so. Did someone stop uploading test data, or did alpha.transfer.sh crash?

3mo Fiber node

2mo Coax node

Storj stopped uploading test data