How does Tardigrade avoid clusters of nodes being correlated to physical location of SNOs?

BrightSilence · August 26, 2020, 6:12pm

Thanks, that helps already!

I have a slight concern that I’m sure is shared among some other node operators that running a node in an area where this is popular is going to at some point work against me. By definition most of your SNOs will be in areas… where most of your SNOs are.
I’m also a little worried that using other measures to acquire multiple IPs, meaning when you’re not getting them from a local ISP, could create a false perception of distribution. The routes will be completely different for those IPs, even though the nodes may still be on the same hardware. It’s a tough problem to tackle.

Toyoo · August 26, 2020, 8:11pm

While I also recognize this as a problem, I can see at least one (minor) advantage: redundant links should result in less downtime (assuming some fail-over mechanism is configured). Around here ISPs are less than stellar, and it feels as if more problems can be attributed to consumer-grade ISP link problems than to consumer-grade hardware failure.

BrightSilence · August 26, 2020, 8:27pm

If you’re using a VPN to get those IPs, there is nothing redundant about it since they still rely on the same ISP connection. It just won’t look like that from the other end.

Toyoo · August 26, 2020, 8:31pm

Ok. I was considering the scenario of two actually separate ISP links. I had this kind of setup myself for the duration of the local COVID lockdown.

SGC · August 26, 2020, 8:40pm

been thinking a bit about this… the only answer i have arrived at is that one basically has to register an address for the node, tracking it by IP is borderline impossible… one could send traffic between nodes to try and estimate their distances, but again this sort of stuff is just an arms race of who has the best ideas or spends the most time on it…

so eventually i ended up at this: the only way i can see to actually do it to a reasonable degree of certainty is some sort of node owner / address certification … certification might not be the right word… and then data is distributed based on this type of certified location / node owner rather than IP subnets, each SNO would then configure which ip’s are theirs…

this solves some future issues and even would allow multiple ip addresses being used as backup, but without giving anyone / any location an unfair advantage…

best idea i could come up with anyways…
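To make the difference between the two grouping rules concrete, here is a minimal sketch. It assumes that "IP subnets" means grouping nodes by their /24 network and that a registered `owner` field exists; the node records, field names and addresses are all hypothetical, not anything taken from the Storj code.

```python
import ipaddress
from collections import defaultdict


def subnet_key(ip: str, prefix: int = 24) -> str:
    """Return the enclosing subnet of an IPv4 address, e.g. '203.0.113.0/24'."""
    # strict=False lets us pass a host address and get its network back.
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))


def group_nodes(nodes, key):
    """Group node ids by whatever key function is supplied."""
    groups = defaultdict(list)
    for node in nodes:
        groups[key(node)].append(node["id"])
    return dict(groups)


# Hypothetical node records (documentation-range IP addresses).
nodes = [
    {"id": "node-a", "ip": "203.0.113.10", "owner": "alice"},
    {"id": "node-b", "ip": "203.0.113.77", "owner": "alice"},
    {"id": "node-c", "ip": "198.51.100.5", "owner": "alice"},  # same hardware, VPN exit IP
    {"id": "node-d", "ip": "192.0.2.44", "owner": "bob"},
]

by_subnet = group_nodes(nodes, lambda n: subnet_key(n["ip"]))
by_owner = group_nodes(nodes, lambda n: n["owner"])

print(len(by_subnet), "groups by subnet")  # the VPN address counts as a separate group
print(len(by_owner), "groups by owner")    # the registered owner collapses it again
```

Grouping by subnet treats the VPN address as an independent location, while grouping by registered owner does not. The sketch says nothing about how an owner registration would be verified, which is the hard part raised above.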

ofc some locations would be less likely to go down… like a big data center, but does that mean it should be favored over somebody else if they keep the same uptime… just because they have near infinite internet bandwidth and basically infinite ip addresses…

and when you aren’t in a major crowded city it’s pretty easy to just designate legal land claims as node locations… which would also fall pretty well in line with the whole distribution thing… because isn’t that what it’s all about… it’s about making sure the network / data / nodes are distributed over the correct areas… and land areas are all marked up by legal land claims of ownership and such… would make sense to utilize that public resource.

ofc in highly crowded cities it becomes a bit more problematic… because when is the network properly distributed… if you cut a powerline and an entire city block loses connection… is that workable… ofc the same might be true for much larger geographical areas which are routed through one network nexus.

so as usual solving one problem creates new problems…

SGC · August 26, 2020, 9:02pm

another approach i came up with was to have the network basically have a higher aware controller… that would shut parts of the network off and see if it could still access its data…

like if we imagine it being a distributed system, no real fixed location, but its critically aware of all the isp’s and such… then it will simply disregard all traffic from a single isp or from a subnet and such… ofc this doesn’t really solve the IP subnet issue… but i’m sure it would make the network more resilient long term…

i kinda like this idea tho, but it just doesn’t really solve much… because of stuff like vpn’s

it would work if it could be this disembodied entity pulling the cables from either isp and seeing how the network behaved… but without the ability to do this the idea kinda falls apart…

works in theory… just not in reality… but maybe somebody can improve upon it…
i kinda liked it… the concept has its own merits and advantages… such as it would be able to identify connections getting redirected via vpns… because if one pulled the plug of the isp it was coming from… it wouldn’t really matter… one just can’t really do that… not even virtually… that i’m aware of anyways…
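The "pull the plug on one ISP and see what happens" idea can at least be simulated on paper. The sketch below assumes a file can be rebuilt as long as some minimum number of its pieces (`required`, a hypothetical parameter) is still reachable, and that each piece carries a label for the ISP it appears to sit behind. As noted above, a VPN makes that label unreliable, so this only shows the bookkeeping, not a way to detect VPNs.

```python
def survives_outage(piece_isps, required, failed_isp):
    """True if enough pieces remain reachable when one ISP goes dark."""
    remaining = sum(1 for isp in piece_isps if isp != failed_isp)
    return remaining >= required


def fragile_isps(piece_isps, required):
    """List every ISP whose outage alone would make the file unrecoverable."""
    return sorted(
        isp for isp in set(piece_isps)
        if not survives_outage(piece_isps, required, isp)
    )


# Hypothetical placement: 10 pieces, any 5 of them are enough to rebuild the file.
piece_isps = ["isp-a"] * 6 + ["isp-b"] * 3 + ["isp-c"] * 1
required = 5

print(fragile_isps(piece_isps, required))  # ['isp-a']
```

Here too many pieces sit behind `isp-a`, so losing that one provider would take the file offline, while losing either of the others would not.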

mike · August 27, 2020, 6:51pm

Quick thought…

What if nodes on known VPN’s had to register with a location somehow?

If netflix and others can identify them… I’m sure such a resource list could be put together…

BrightSilence · August 28, 2020, 2:55am

I wish it were that easy, but there are plenty of VPNs that Netflix doesn’t detect too. That’s kind of an endless cat and mouse game. And there are also legitimate reasons for SNOs to use one. Like when their carrier has put them behind a CGN.

deathlessdd · August 28, 2020, 3:08am

The simple fix to this issue is to ban VPNs altogether. We know most of the community is using VPNs to take advantage of having more nodes in one location, and hiding the true location of their actual nodes to make the most profit they can. If your ISP doesn’t allow public ports, it’s simple: just do not allow them to use the service, because that would stop the people who are using VPNs to their own advantage. It really defeats the whole purpose of Storj, to be honest, if they allow anyone to use VPNs…

Toyoo · August 28, 2020, 6:59am

Is it really that widespread that it is justified to use the word “most”?

BrightSilence · August 28, 2020, 7:08am

I don’t think we can conclude that.

It’s kind of beside the point though. Even if the decision is to not allow VPN despite that causing some people to not be able to run a node at all (and booting a number of nodes off the network all at once, which is pretty risky), it’s still near impossible to detect the use of VPNs.

However, if the focus starts to become more on filtering based on routes, then hopefully routes to popular VPNs with port forwarding options will quickly become congested and running a node on one won’t be as profitable. Especially as more people start to “cheat” using those VPNs it’ll quickly become a race to the bottom. But this probably will only work when Storj gets larger. I will feel for the people who are just using it to get around a CGN when that time comes though.

SGC · August 28, 2020, 8:17am

@Toyoo
well it only takes 5% of SNOs having 10 nodes each (if everyone else runs a single node) for them to be 1/3 of all the nodes on the network… so depending on how the problem looks in reality, it could indeed be close to something like that…

but i don’t think it’s that bad… but the math tells that it’s possible…
however i don’t believe it’s most…
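The "5% of SNOs with 10 nodes each" figure can be checked with a few lines of arithmetic. The sketch assumes the remaining 95% of operators run exactly one node each, which the post does not state; the function name is made up for illustration.

```python
from fractions import Fraction


def multi_node_share(operator_fraction, nodes_each):
    """Share of all nodes run by the multi-node operators.

    Assumes every other operator runs exactly one node.
    """
    multi = operator_fraction * nodes_each
    single = (1 - operator_fraction) * 1
    return multi / (multi + single)


share = multi_node_share(Fraction(5, 100), 10)
print(share, float(share))  # 10/29, roughly 0.345
```

So under that assumption the 5% would hold a little over a third of the nodes, which matches the estimate above.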

maybe the easiest solution is simply to put bounties on people exploiting the network, make everybody rat on each other… i mean if @deathlessdd is right, then thats already kinda the mentality… lol

@BrightSilence i think there are more vpn providers than there are nodes on the network…

Alexey · August 28, 2020, 8:27am

With port forwarding feature? Not so much to be honest.

SGC · August 28, 2020, 8:31am

so there are fewer vpn’s than there are millions of people on the planet…
seems totally reasonable… i think you might fail to grasp just how big the world is…

Vadim · August 28, 2020, 10:52am

It is not a problem at all.
You have to pay for every VPN channel.
If you set up 10 channels it will cost a lot of money and will not be profitable, as you will also need more expensive hardware to handle it all.

deathlessdd · August 28, 2020, 12:22pm

I’ve been here a long time and I’ve seen people talk; there are many who only care about how much they can make. Also, why run 20+ nodes in your house if you’re running them on the same internet connection? I can tell you it’s much cheaper to pay for a VPN, or other ways of tricking your location, than to pay for a second internet connection to your house…

Vadim · August 28, 2020, 1:13pm

I tried both ways. When I had 10 IPs my router was very hot all the time. You also need a very good connection and speed. The connection count is very high; a normal user can’t afford it. You also need very good management skills: one error and everything crashes. Now I have 2 ISP fiber connections, and it works much better.

hoarder · August 28, 2020, 7:45pm

In what kind of situation will running a node through a VPN make economic sense?

It will not speed up initial vetting. It could give you a bit of an edge when filling a node, but half of the additional payment you get from this will be held by storj. Cheap VPN services will be slow, and if you rent a VPS, you will need to pay for twice the amount of traffic because it will be counted both ways. You will lose more races than you normally would because a vpn connection will add extra latency. On top of it, it’s an extra point of failure.

BrightSilence · August 28, 2020, 8:22pm

There are economic hurdles, sure. And it would involve some upfront cost. But you don’t need to use an additional IP until the node is vetted. When it is, using a different IP will make it fill up significantly faster. If you have enough space available, by about month 4 you will make more on that node than most good VPNs would cost. So, the first month is vetting and no use of VPN. Months 2 and 3 you run at a loss; from that point on it’s in theory profitable. This is only really interesting if you have massive amounts of space to share and I definitely don’t recommend doing this either way. But it is probably economically viable as long as you have the disk space, hardware and a really fast internet connection.

This is why I said it would only work when Storj grows into a larger network and these VPNs will start sharing more nodes. But as Alexey said, there aren’t actually all that many of them that both perform well AND support port forwarding AND are affordable enough to consider.

So yeah, in short, I think this is still currently an issue that needs to be taken into account. But there may be ways to resolve it, especially as Storj grows.
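The break-even reasoning above can be written down as a small check. All numbers below are placeholders chosen only to mirror the shape of the argument (one month of vetting, two months at a loss, roughly break-even in month 4); they are not measurements of real node income or VPN prices.

```python
def first_profitable_month(monthly_income, vpn_cost, vetting_months=1):
    """Return the 1-based month in which node income first covers the VPN fee.

    Months up to and including `vetting_months` are skipped, since no VPN
    is used while the node is being vetted. Returns None if it never happens.
    """
    for month, income in enumerate(monthly_income, start=1):
        if month > vetting_months and income >= vpn_cost:
            return month
    return None


# Placeholder figures in arbitrary currency units.
monthly_income = [0.5, 2.0, 4.0, 6.5, 9.0]
vpn_cost = 5.0

print(first_profitable_month(monthly_income, vpn_cost))  # 4
```

The sketch ignores held amounts, extra traffic charges and lost races, all of which were raised earlier in the thread and would push the break-even point later.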

SGC · August 29, 2020, 7:34pm

@BrightSilence
bouncing around large bandwidth sure ain’t cheap…
and really the only real benefit is that one can go above other people in max ingress and capacity because of the whole deletion % compared to daily or monthly ingress.

maybe the solution is the reverse… make it not worthwhile to have multiple ip’s to game the system… if one could do that then there would be essentially no point…

ofc that comes with its own problems… i did like what @jtolio talked about with the latency being a deciding factor… ofc that would most likely just compound the whole german issue…

i suppose the whole idea behind the latency is pretty good… because you cannot shield that…
your latency will be better depending on your location… sure it might mean better performing nodes would get a bit more of an advantage… but is that really so unfair

then what one would need is like servers in the nexuses of the internet and then use those to sort of triangulate rough geographical locations for nodes… that’s actually a pretty viable idea…
but that was how i understood it from what jtolio explained from the townhall, taking a few creative liberties to imagine it in a frame of mind i can understand…
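One way to picture the latency idea is as a plausibility check and not exact triangulation. Light in optical fiber covers roughly 200 km per millisecond, so a round-trip time puts a hard upper bound on how far a node can be from a probe server. A VPN can add latency, which makes a node look farther away, but it cannot make it look closer. The probe locations, round-trip times and function names below are illustrative assumptions, not anything described in the townhall.

```python
import math

FIBER_KM_PER_MS = 200.0   # approximate speed of light in optical fiber
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def max_distance_km(rtt_ms):
    """Upper bound on one-way distance implied by a round-trip time."""
    return (rtt_ms / 2.0) * FIBER_KM_PER_MS


def claim_is_plausible(claimed, probes):
    """Check a claimed (lat, lon) against probes given as (lat, lon, rtt_ms)."""
    lat, lon = claimed
    return all(
        haversine_km(lat, lon, plat, plon) <= max_distance_km(rtt)
        for plat, plon, rtt in probes
    )


# Illustrative probes near Frankfurt and Amsterdam with made-up round-trip times.
probes = [(50.11, 8.68, 12.0), (52.37, 4.90, 15.0)]

print(claim_is_plausible((52.52, 13.40), probes))    # claimed location near Berlin
print(claim_is_plausible((-33.87, 151.21), probes))  # claimed location near Sydney
```

Because the bound only says how far away a node can be at most, it can rule out an impossible claim but cannot pin a node down to a city block, and real routes are rarely straight lines.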