Bandwidth utilization comparison thread

LOL. You must be kidding. Let's stop this off-topic discussion here. :slight_smile:

I would argue that it's almost indirectly on topic… :smiley: since an SNO spread across multiple /24 subnets
is essentially eating everybody else's bandwidth, and that is essentially a form of gaming the system… even if in some cases it's fair…

I mean, I can see why people would want two separate connections to a big setup… which is sort of fair… it might be a bit unfair that they get double the data… but if someone wants to be redundant and do a datacenter-like setup with redundant everything… then the internet connection is no different… of course that opens the question of how often a consumer dual internet connection would actually be protected against a local fiber being dug up… or the like…

So from that perspective it's almost pointless… of course ISPs can be unreliable or use different technologies for supplying internet, which would make it work…

In addition to that, some SNOs might have multiple locations where they have set up storagenodes, which I also think is okay, and again that leads to more internet connections and possibly multiple /24 subnets.

So many SNOs might actually have the potential for multiple /24 subnets, which raises the question… how does one protect against abuse, with all the data just being routed to one big server to save on expenses and get an insane ROI?


Has anyone else noticed that repair egress has really shot up in the past week or so?

Fiber node (4mo):

Coax node (3mo):

My repair egress has more than tripled since August 1st, while node size has only increased by about 20% due to the slow ingress.

I really should get around to making that summary for the month…

Is there not currently a Python or PowerShell script that does this?

I mean, how do ReneSmeekes' storj_earnings and Storj3Monitor do it? There's got to be some way to query each day's details and then use something like Vue.js, or even just pandas in Python, to create some kind of export of it.

I pull the data from the sqlite db’s for the earnings calculator. You can also use the dashboard API.
For the db’s you’d mostly use bandwidth.db and storage_usage.db. Though calculating how much space was used historically might be a bit tricky due to how that information is stored.

Yeah, I think when I do this summary I will be inspired to fix the issue and make something that will grab maybe a month's worth of data or so…
so people can post it very infrequently and still get the same granularity on the data.

Been working on moving the docker storagenode into a container with access to the host pool where the storagenode data is…
and now everything works, or I think it does… but I haven't dared to try it yet, because there were some permission issues with it at first and I'm a bit worried about compromising my nearly 14 TB node…

Might make a new 500 GB node just to test it out… or rather, I'm quite sure I will… still stumbling around Linux and constantly finding new things I need to learn to set stuff up the way I want…

So yeah… getting to it… eventually.


Figured that one out: it's the privileged vs. unprivileged LXC container thing; only the privileged one supports non-777 permissions when passing a ZFS dataset through as a mount point.

Actually you can remap the UID and GID, I think it's called…
Doing that you can fix the permissions… of course that does introduce some additional security holes, but it can't really get much worse than what I'm already doing by running it directly on the host…

Or that's how I understood it from reading a bit about it… I'm sure it's wrong in more ways than I want to count… but it works… I think… I haven't exactly done a ton of testing on it yet.
I just copied a script from it… but I have my ZFS permissions set up so that the owner gets full rights on the stuff they create and others get read-only access… and of course the lower-tier users cannot read root-owned files… I think… something like that… I was setting up some network shares and had trouble, so I know I did a good deal of tinkering to get that running the way I wanted…

Seems to work well for most stuff… anyway, the link below has a guide on how to make LXC containers access host storage… not sure if it's applicable outside of Proxmox…
I made these changes…

First the file `/etc/subuid` (we allow 1 UID starting from 1005):
root:1005:1
then `/etc/subgid`:
root:1005:1

https://pve.proxmox.com/wiki/Unprivileged_LXC_containers

I tried to add this to the .conf:

# uid map: from uid 0 map 1005 uids (in the ct) to the range starting 100000 (on the host), so 0..1004 (ct) → 100000..101004 (host)
lxc.idmap = u 0 100000 1005
lxc.idmap = g 0 100000 1005
# we map 1 uid starting from uid 1005 onto 1005, so 1005 → 1005
lxc.idmap = u 1005 1005 1
lxc.idmap = g 1005 1005 1
# we map the rest of 65535 from 1006 upto 101006, so 1006..65535 → 101006..165535
lxc.idmap = u 1006 101006 64530
lxc.idmap = g 1006 101006 64530

But that made the container not boot, so I removed it, and the other changes seemed to be enough to make it work… I dunno if they actually do anything… I also had an extra space in the mount configuration in the .conf file.

I used this one:

mp0: /mnt/bindmounts/shared,mp=/shared

Another thing I did, though, was in the Proxmox web GUI: going to the container tab,
selecting Options, then Features, and clicking the Edit button at the top…
then adding nesting and keyctl.

You can, however, add those (at least for Proxmox LXC) in the .conf;
it looks like this when I check it:

features: keyctl=1,nesting=1

In case you want to try to get past the 777 issue.

My full LXC .conf file ended up like this:

arch: amd64
cores: 4
features: keyctl=1,nesting=1
hostname: CT303
memory: 4096
mp0: /nexus/publicshare,mp=/mnt/publicshare
net0: name=eth0,bridge=vmbr0,firewall=1,hwaddr=DE:41:4a:14:4B:EE,ip=dhcp,type=veth
ostype: ubuntu
rootfs: tank_vms:subvol-303-disk-0,size=4T
swap: 512
unprivileged: 1

Oh yeah, I forgot to mention that some of these settings were to get docker running inside the container as well… but I'm sure you know that :smiley:
And I dunno why it works… magic, I guess, lol.
I just copy stuff…

Another good day of egress.
14Aug2020:
Node 1:

Node 2:

Node 1 for you saw a behemoth amount of repair egress; I hope that doesn't mean we're seeing churn of older nodes that have been holding onto some of that data and are now weakening the network.

Fiber node (4mo):

Coax node (3mo):

That's a possibility. I've been focusing on the Filecoin "Space Race" recently; I wonder if that's a draw for folks. It is good for those with large data repos. Even the 100 GB/day upload illustrated above is a tenth of what I was hoping to achieve after a few months of maintaining these boxes. I'm relatively bullish on cloud storage (I mean, my minor in grad school is big data), but we seem to be handling mostly test data, and that is too bad. Customers, where are you?!


Well, based on some anecdotal notes from a few SNOs who have also submitted their own testing data to Tardigrade, there's a small bevy of limitations that need to be solved before Storj can start to approach even Sia (600-860 TiB of 2.2 PiB) or Backblaze (multi-pebibyte) utilization levels.

The one that strikes me, top of mind, is the idea that 1 TB is "a lot" for a single customer, which is just not true. At the office, I could dump about 20-40 TiB of data to offsite storage and just write that off as DR cost, but if I'm kneecapped at ~1 TB, with a bandwidth limit to boot, then I wouldn't even begin to trial it. For example, a single endpoint's three months of file backups, with dedup and compression, is still ~1.2 TiB, before I even get to the 1.7 TiB of image backups (development/graphics design endpoint).

You can request a limit increase. It's only there so Storj Labs gets a heads-up if a customer suddenly wants to upload PBs of data.

I have no idea what makes you say that. This is a quote from April.

Storj is several times larger than Sia already. Probably an order of magnitude larger.


Node churn is actually a big problem, especially if disqualifications are frequent during GE.

Alexey did a breakdown of what it costs them to repair data when a node gets disqualified, and it's astronomical!!
I'm actually worried that it might cause BIG problems if many nodes exit the network (not gracefully) in a short period of time. I really hope they are actively working on a solution (it should be one of the top priorities), because at the moment the held amount only covers a fraction of the repair cost.
There is a mention of repair being distributed across the nodes in the whitepaper, but I guess they haven't gotten around to it yet. That would greatly increase the decentralization of the network as well as reduce the cost of rented servers.

Anyway, that's just my 2 cents on what Storj should work on next. I think @SGC infected me with the "long post with lots of digressions" virus, haha.


They are. The biggest reasons for disqualification that can be prevented are:

Furthermore, the repair threshold is currently set very high, partially to test that repair works as intended. But this also leads to much more repair than needed.


I had this happen today… caught it because the color awk log script was running in a window… I really need to set up some alerts, lol.
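Something like this would probably do as a first pass. A minimal sketch, assuming a plain log file to follow; the path and the strings it matches are placeholders (my logs actually come out of docker), so adjust to wherever yours end up:

```python
# Rough sketch of the alerting I keep meaning to set up: follow the node log and
# flag suspicious lines. LOG_PATH and WATCH_FOR are placeholders, not real values
# from my setup -- point them at your own log and the patterns you care about.
import time

LOG_PATH = "/var/log/storagenode.log"  # hypothetical path
WATCH_FOR = ("ERROR", "FATAL")

with open(LOG_PATH, "r") as log:
    log.seek(0, 2)  # jump to the end of the file, like `tail -f`
    while True:
        line = log.readline()
        if not line:
            time.sleep(1)
            continue
        if any(token in line for token in WATCH_FOR):
            print("ALERT:", line.rstrip())  # swap the print for mail/webhook/etc.
```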


Well… that's kind of the idea behind "every Joe" having a node, isn't it?
The horrible idea comes from Tardigrade, right? With their quote "make a node for fun and profit"?

And as for me and my experience with my node's fragility: I am not too keen to shell out spare hundreds of dollars just to make the Tardigrade business happy… with ridiculously low egress? (In my country, hundreds of dollars don't come easy…)
Unless one is a philanthropist and wants to fund their business… or has their own agenda for the spare hardware if this project doesn't work out.
So I would say: from a home-made business, expect home-made results :upside_down_face: and unfortunately home-made flakiness, and through that, fragility of the network. (One month they are all for it, the next they are going on vacation and have to turn off the computer… or even better, they go on vacation and everything crashes (my life's experience :wink: )
Or like my friend did in her new flat: she put a power tester in a wall socket just to check if there was power and blew up the whole vertical riser power line; they had to go to the building basement and reset the breakers :scream: That took a few hours… imagine living with such "talented" neighbours…
It might have resulted in malformed DBs if there were no UPS… :boom: :rofl: :arrows_counterclockwise: :yum:

I would shell out cash… better yet, I would gladly do it through "self-funding", but I won't risk major cash of my own… at least not at the beginning…
This is a conundrum… I won't make cash without more space, and I won't have space without cash :broken_heart:


I have made my fair share of bad choices for this… first off I bought used hard drives… because… well, they seemed cheap… but long term it's certainly not going to have been cheap… because the failure rate is… cough… an interesting data point that makes me want to throw stuff…

One day I bumped the table my server sits on and basically killed an HDD right there… at least it was only a 3 TB one… then of course the server room isn't heated yet… so last winter the disks endured -17°C.
Aside from that, I wasn't aware of the whole dew point and temperature thing… or at least I didn't take it into consideration before running the server in an unheated room…

So yeah, it corroded, so much that the fan wires literally shook loose and the fans stopped working… then I bought the wrong RAID controller and components for $350, only to replace them with $100 of better gear within 2 months…

My system is set up to deal well with power outages… because I worry about them, even though they're most likely not even a yearly thing…

So that's not really my concern… however, I do have a leaky roof far too close to where the server is located…

There will always be problems… the only question is whether you are bothered by problems or enjoy the challenge of finding solutions.

2 Likes