Bandwidth utilization comparison thread

Pac · December 12, 2020, 9:20am

That’s weird, I have almost nothing in the trash on my side

I do have more repair ingress than regular ingress too though indeed. Dunno if it’s good news for the network… ^^’

But as long as my disks keep filling up, I’m not complaining

SGC · December 12, 2020, 9:32am

the more data that get on the network the more repair will be done…
and the older the network gets the more data become worn enough for repair.

so i don’t think there is anything wrong with repair…
trash is from what i can see is most often when one have large deletes… it’s basically a 7 day or so hold before stuff is deleted to make sure it can be undeleted or something…
not sure if all deletes goes in to trash either… been told how it works a few times, but i doesn’t seem to have stuck… or maybe it wasn’t a clear easy to understand answer

i get like 12 GB ingress /w rep a day, but thats for 3 nodes with 4 on the subnet…
sadly haven’t managed to get my internet issues fixed… ordered a few new ipaddresses and to manage that they wanted 200$ a month lol

BrightSilence · December 12, 2020, 10:44am

They don’t. Normal deletes go to your node and show up as a delete. Those will be removed from your HDD right away. Trash usually accumulates when your node has been offline or has timed out on the deletes when they were sent to them. In the past we’ve had some zombie segment cleanups on the satellite that lead to trash being used as well. But as far as I can tell that hasn’t happened in a long time. So my trash has been basically empty for months now.

SGC · December 12, 2020, 10:57am

been switch isp and restructuring my entire local network, and going to switch isp again within a month… yeah my trash did show up after my isp change related downtime.
duno why i can’t seem to remember that… maybe its the name…
i’m sure it will stick eventually lol…

so the idea with the trash is that if the repair is initiated again, it can revive the trash from nodes that initially held the data…thus saving on repair costs maybe…
that would sort of make sense, and also help distribute the data across nodes in the network in different locations…

hopefully my downtime will not to be extensive any more…
thanks for setting it straight… ill try to remember this time lol

BrightSilence · December 12, 2020, 11:02am

I missed one common scenario. If data is repaired and moved to other nodes while your node is offline it will end up in trash on your node as well.

The idea behind it has to do with how garbage collection works. Garbage collection uses a bloom filter, which you can see as a set of instructions to select pieces. Bloom filters match all pieces that should be on your node (and some that shouldn’t be, it’s not a perfect system). So your node moves all pieces that don’t match the bloom filter to trash. The reason they aren’t removed entirely right away is just to catch situations where by accident an empty bloom filter is sent or a faulty bloom filter. That would instantly end your node if it happened. But the trash allows the node to recover from such a mistake if needed.

With normal operation, trash is never restored. So no, it’s not about repair costs.

SGC · December 12, 2020, 11:18am

i kinda understood the whole concept of the bloom filter in another way…

but both ways can be true, i just think it’s a conceptually easy way to think about it.
the method is a bit like those news paper images… find 5 differences

it can be difficult to see changes in a complex system and difficult to monitor it…
but if you put two images over top of one another then the difference is obvious.

to my understanding the whole concept of a bloom filter works by that principle, ofc just on a digital level and ofc nothing like that simple… it’s just a nice way of explaining it… interesting how close it is to… i was in reading about bloom filters on wiki.

and because of the inherent nature of how the algorithm filters only “certainly not” part of a set, then since that part is just a matter of mirroring the instruction and one would get the mirror part of the certainly not set… but then it would essentially leave an outline of undetermined inside of it.

maybe thats where the mathematican i heard the overlay image explanation of bloom’s got the idea from, or who ever got the idea.

i should really get better at the whole algorithm game…
it’s all in the algorithm as they say lol

reading about it tho it is very far from the image example… atleast so far as i can understand lol…

Pac · December 12, 2020, 3:13pm

Makes sense to me. I was wondering actually if someone knows what the average TTL of a piece is on the tardigrade network before needing repair? 6 months? 2 years?

Did anyone try to track this?

SGC · December 12, 2020, 7:02pm

that’s a good question, not sure anyone aside from storj labs can answer that tho…

and since downtime can get pieces trashed if repair is started, then i would think there is a certain flow rate to it… even if we try to have good up time.

i think that repair thing is really a critical part of keeping the network alive… obviously… but also because i think it helps distribute data more evenly, even if some SNO’s try big setups to get more data… in theory if they have even the slightest downtime it might cause repairs and data will go else where…

been pondering where that mechanism was for a long time, pretty sure the way repair works causes good distribution… and then it might also not matter how much a piece lives, since it’s only important that the mathematical string that en erasure coding is survives.

kevink · December 12, 2020, 7:04pm

Satellites consider nodes offline after they have been offline for 4 hours to prevent short downtimes from causing repairs. (maybe there’s more to it but that certainly is one reason). I don’t think satellites are checking the pieces all the time when a node goes offline, that would be an immense workload.

SGC · December 12, 2020, 7:32pm

most likely only if they work with them… and maybe some sort of superficial check or data base which then will trigger alerts when repairs needs to start…

maybe it keeps less track of stuff with ample ample pieces… i duno… i’m sure there is some smart solution which was talked directly from the basics of erasure coding fundamentals.

as this kind of stuff deals with the same kind of basic conceptual problems that would affect any method using the technology.

my 9 and 24 hours sure did hurt… lost like close to 300gb maybe a bit more…

does kinda make me ponder if uptime helps increase egress long term, because one will end up having much more critical pieces or high use pieces… ofc that would depend on my assumption that it checks the pieces or EC when it’s being utilized …

what i don’t understand in that tho is why delete / trash the pieces when they come back to the network… sure to punish nodes that are offline and for data distribution over the network… but is that really it… that would mean the whole thing about stuff going to trash after having downtime is like one of the most critical parts of the survival of the data… ofc aside from repair, and erasure coding…

But still critical feature that creates distribution of data, basically…

when i get around to running multiple nodes i will have to run a test on that… see if just a “bit” of downtime from time to time can have a big effect on long term egress

would ofc be more than 4 hours… tho that would mean i would have to put my nodes at a disadvantage to test it… maybe just on one then heh
doubt it will matter… but… have to do something to keep it interesting,

kevink · December 12, 2020, 7:43pm

when a piece gets repaired, the satellite downloads all shards, reconstructs the piece and uploads it to the network again. So if your node held any shard of that piece while you were offline, it’ll go to trash afterwards.

well there is lots of deletes and repair so likely you lose a bit of storage. And your node is rather big so more likely to lose some pieces.

SGC · December 12, 2020, 8:30pm

yeah it’s on the whole nothing big… but in a future where there are lots of customers, downtime might weed out high activity pieces, or thats the theory i kinda want to test…

will i be able to create a measurable difference between a node with regular 4+ downtime as compare to one with out it… over lets say a 6 - 12 month period

hopefully it won’t matter… but if it does then it’s useful knowledge, ofc i don’t really expect to have 4 hour downtimes… so kinda pointless really…
but if it does matter, i would work a bit more towards avoiding it…
ofc one could also argue that making the experiment now with lots of test data floating around will not give any reliable results anyways…

the reupload explains a lot… ofc that would also mean that it’s not an individual node that gets punished for the downtime… but the network in general…

i mean there will have to be a node that make it “break” into repair, so because of that node, everybody else that holds a piece of that “EC” will essentially be punished…
i suppose that would make my experiment even more pointless… because even without downtime we would essentially be affected by the same effect every day…

re uploading everything is a highly inefficient way of doing that tho… weird that there isn’t a better way

andrew2.hart · December 13, 2020, 1:53am

Come on my friend,

Vetting node ingest is extra ingest
Defend your subnet; If you have many nodes on “your” ip then you will not be affected by an upstart “stealing” your ingress
If you have a disaster on your main. You want a vetted, even 15 month if possible, node ready to go.
Expansion. When storj takes off you need to be ready to scale out.
There are limits to a node. The file walk means that a node above 100TB is probably prohibited.

Besides, I know you’ve got more than one already.
Anyway, my storj friend, I’ve typed too much and I don’t want to challenge you as master of the tldr!
And.

dragonhogan · December 13, 2020, 2:37am

totally agree with this post. I currently have 3 nodes at the moment. Started each one in a staggered fashion. First one in Aug 2019, second in April 2020, and the third back in late Sep/early Oct 2020. The first two have 12TB HDDs with 10.8TB available for Storj. First one has about 8.8TB used, second one has 4.4TB used. Then for the third node, I’ve just been using a 1TB portable, external HDD (2.5" running on only power from USB port of Raspberry pi 4) as a temporary drive to “seed” the new node, with 500GB allocated to Storj. That has about 250GB, currently. And actually I’m currently in the process of migrating that data to a new Raspberry pi with a 4TB HDD attached to it. I’m thinking I’ll probably leave that allocation at 500GB for the time being, and then repurpose the 1TB HDD for a new 4th node, so I can get that seeded. I luckily have another old 4TB WD Red HDD laying around, along with another 10TB WD Elements HDD that I can continue to use for expanding offered space.

I have high confidence in the future of Storj, and can certainly see using both spare drives at some point, once the others start filling up. Although, for the time being, I’m just planning on keeping them shelved as spare drives if I need to replace one of the active nodes. I also anticipate having to replace two of my four drives in my plex server at some point over the next year or two, so that will give me two WD Red 6TB HDDs to use for Storj if need be.

SGC · December 13, 2020, 6:15am

i got 2 tiny ones that barely counts and they are all on the same subnet, so basically on node anyways.
so cannot really compare how it behaves when i don’t have two nodes created on the same time, on the same subnet, thats what i mean… and when i start running multiple nodes… more subnet because right now i’m basically just one.

100TB is unreachable unless if deletion pr month would be like sub 1% and ingress is larger than 1TB
if we can hit 30TB we might be lucky at present, ofc this will be a moving goal line.
depending on customer behavior and storj labs ofc.

vetting nodes still takes from the total ingress of a subnet

yeah ingress have been rather dead lately, so starting new nodes right now is a rather inopportune time, as you might also have seen during the townhall, we had gone from 600 or whatever nodes up to like 4000…

so essentially test data is dead, the network is growing at a pace where storj most likely doesn’t have the ability or capital to pump into it anymore to artificially inflate nodes.

been kinda stuck on one subnet because i refuse to setup anything less than perfect
with a foreign storagenode…

one weird thing about my tiny new nodes, is that the 3rd one or 2nd one of the new ones, doesn’t seem to have completed vetting now some 2 months later… for some reason it is getting less ingress than the others even tho, it’s like 2 months now, but i guess with 4 nodes on a subnet that kind of stuff goes slow… even if people seem to say it doesn’t…

should have looked into that, but haven’t bothered because i’ve been trying to get my former new ISP to cooperate, but that was a bust… so going to get a 3rd one on monday, or atleast initiate talks to figure out if they will actually do what the others claimed they would before i actually signed up and they just basically didn’t care until after nearly 14 days of trying to get them to do what they promised, they finally let me up the support level so somebody was not drooling into their keyboards enough that they could define if they wanted to sell the solution to me or not.

which then would cost me hundreds of dollars just for the connection… the same connection…
so now the goal is to bother them enough to make them give me my money back for the connection i never used, doubt that will happen… but i can try i do have my mastery to rely on…
tldr kungfu

and the funny part of the story is that my other choice of ISP will actually do the setup i wanted, cheaper than the first one… and only reason i saw the others as a possible solution was because i doubted a big company would bother doing that setup for a consumer, and when i called the others they said it wasn’t a problem.

@dragonhogan
but yeah it sure does look like we might be in for a long haul while we compete with the semi enterprise setups… tardigrade got popular, now many people are running nodes, some won’t persist.

atleast it’s most likely good for the customers so there is that

coldsc · December 13, 2020, 6:18am

My first week stats.

7 audits total which seems really low with 98% up and 92% down success rate.

SGC · December 13, 2020, 6:27am

does seem kinda low… but it also requires data for you to get audits…
so it will be slowest at first… and pickup exponentially…
duno if this is to low, it’s certainly not high.

this node is 2 months… but it’s one of 4 on this particular subnet.

you can check this if you are alone on the subnet
not 100% accurate but like 95-98%
and should be 100% for older larger nodes, something to do with how it works.
http://storjnet.info/neighbors

your node looks fine… it’s just slow at first… very very slow at first.
and the network have just had huge growth in the last quarter so many more people to take data.

if there are others on your IP/24 subnet then the data will be shared between all node equally,

kevink · December 13, 2020, 7:59am

no. Vetted nodes get 95% of global traffic, unvetted nodes 5%.

(You even answered directly below my post linking to that answer in a different thread… Ingress traffic splitting between vetted and non-vetted node on same subnet - #8 by SGC)

SGC · December 13, 2020, 8:54am

that might actually explain why my 2 month old node isn’t vetted yet and why coldsc
has only 7 audits over a week.

if global traffic is fixed at 5% for vetted nodes, and there have been an increase of 3000 nodes over the last quarter… that could essentially stall the vetting of new nodes due to the global traffic remaining the same, but the number of nodes going up by 5 fold…

that would mean that ingress is about to drop like a stone when all the nodes vetting comes online…
only reason why vetting would take months is because there is an insane amount of nodes being vetted.

not a good prediction that…
wonder if they made it that way… the 5% of global instead of subnets to ensure a slow growth of the network, because it would create a sort of static amount of new nodes that can vet in a certain time equal to the total ingress / repair ingress on the network.

thus if the network grows faster it will vet more nodes faster and if its not growing node vetting will slow down…

that would actually be pretty smart for the stability of the network and profits of SNO’s
ofc it also means that if somebody in theory now created 10000 nodes they could get something like 4% of the total network bandwidth routed to them…

which is kinda crazy… ofc they would have to maintain all those nodes, because most will go into held amount, which then becomes a critical stop gap measure… but still seems a bit crazy still.

ofc when the vetting nodes comes out of vetting then the whole 10k node thing goes down hill … so it would be kinda pointless i suppose…

SGC · December 13, 2020, 8:58am

lots of things to keep track of, sorry my bad… pretty sure it’s stuck now… should be, because now it makes sense why it is like it is