Discussion on scaling the network - a multi-node SNO tier proposal

No need to feel sorry. :-)

Well, there has to be a subnet rule, the /24 rule.

Because what is the point of the subnet filter rule? To make sure one physical machine doesn't get more pieces of a file, so that if it goes down, the other machines in the network carry the availability and the decentralization: "not all eggs in one basket".

I believe the draw now just looks at unique IP subnets and sends only one piece of a file to each network.
(And which node within the subnet catches it depends on which responds faster to the request? Is that right, @Alexey? Or does the satellite know the end node's ID up front?)
I think the /24 rule now draws an IP subnet and that's it:
if some node inside responds for a file, it gets it.
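
For the sake of discussion, a minimal sketch of what such one-piece-per-/24 selection could look like (the function names are illustrative, not Storj's actual satellite code):

```go
package main

import (
	"fmt"
	"net"
)

// subnetKey reduces an IPv4 address to its /24 network,
// e.g. "203.0.113.57" -> "203.0.113.0/24".
func subnetKey(ipStr string) (string, error) {
	_, network, err := net.ParseCIDR(ipStr + "/24")
	if err != nil {
		return "", err
	}
	return network.String(), nil
}

// onePerSubnet keeps the first node seen in each /24 and drops the
// rest, mirroring "one piece per subnet" as described above.
func onePerSubnet(nodeIPs []string) []string {
	seen := make(map[string]bool)
	var picked []string
	for _, ip := range nodeIPs {
		key, err := subnetKey(ip)
		if err != nil {
			continue // skip unparsable addresses
		}
		if !seen[key] {
			seen[key] = true
			picked = append(picked, ip)
		}
	}
	return picked
}

func main() {
	nodes := []string{"203.0.113.5", "203.0.113.200", "198.51.100.7"}
	// Only one node per /24 survives the draw:
	fmt.Println(onePerSubnet(nodes)) // [203.0.113.5 198.51.100.7]
}
```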


But that's bad from the nodes' point of view, because the same subnet can hold as many as 255 nodes.
So those potential 255 are rivals for that single piece, always, only because they are in the same subnet.

If one home SNO has, say, one machine and 10 nodes, with 10 IPs facing the external world, it gets 100% of the data it possibly can from the network, to fill the HDDs and earn the maximum payout.

The same setup, but with no IP hiding:
each node gets 10% of the ingress it otherwise could (if it does not hide its real IP).

Not to mention one SNO with 255 nodes and no IP hiding:
that would be 0.39% of normal ingress, currently about 8.72 GB of ingress per month, per node.
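(The arithmetic, using the post's own figures: 1/255 ≈ 0.39%, so if a single-node subnet currently pulls roughly 2.2 TB of ingress per month, each of 255 nodes sharing it would see about 2223 GB / 255 ≈ 8.72 GB.)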

That's why many currently bypass the rule via VPNs.
How many such nodes hold more than one piece of the same file? Does anyone know…

On the other hand, a cheater often points his nodes at different countries, so is he really getting pieces of the same file across his nodes? The further away he places his external IP, the bigger the latency.
So such nodes may land in different pools of draws.

Decentralization guaranteed by code: we all want it, but you need to know which nodes are cheating and are in fact behind one machine and one location. So the code should be able to determine, in various ways, what a node's real IP is. After that, it's just data analysis and matchmaking.

One way to think about it is to not waste effort chasing bypassers and just keep moving forward, because the shadow part is always there; just accept it.

But at the same time, I saw issues on GitHub asking why so many nodes are slow.
That's one reason: so many use VPNs to circumvent the /24 rule,
and VPNs often restrict upload drastically, to 3-5 Mbps (egress in Storj terms).

It's a chemotherapy-like solution: it kills the bad guys, but it also kills the good guys.
What's needed is a cure, so the bad guys fall off and the good guys stay untouched.

For 1.000.000 1000 STORJ every month, for as long as STORJ exists,
I will tell you how to identify /24 rule bypassers.
Deal? Because it will also eliminate me, as I've been doing this for 4 years.
Therefore I know how to eliminate those like me.

First, obviously, all my nodes are under different emails.
All my nodes use different ETH addresses.
All my nodes run on virtual machines: one location, one internet connection, one router.
All my nodes are on VPNs, to make it look like each is from a different country.
But the first thing to trace is (1550 more words…)

If you identify a bypasser's nodes and slash their ingress to what it should be, they will be in a position where keeping the VPN no longer brings benefits, only costs, so they will have no choice but to drop it.


Without VPNs, their nodes' upload speeds will rise severalfold.
Or they will quit. (Good: eliminating the bad guys.)
Either way, you'll cure the network.

So with my hints, you will be able to know which nodes are held by one owner.
But that doesn't yet mean they are in one location.
If a node is using a VPN, then from such a node you can obtain certain critical information,
combine it with other nodes, see whether they are from the same subnet, and flag them as such.

But wait!
There's more!

I would like to be of even more help than that.
From the beginning I have eagerly reported every error I found and submitted ways to improve things,
so I offer to test all of that, and whatever more we come up with.
My workstation, dedicated to 14 Storj nodes, is all yours from now on; we can run tests on it.
Let's see if, with my guidance, you can correlate all my VPN'd nodes as being in one /24 network.
With my assistance, you can perform a proof of concept on real circumventing nodes.

Effects guaranteed!
Or your money back!



Call NOW!
@john

P.S. Jokes aside.
There have been ideas here on the forum,
but no one has said the crucial things that only a bypasser can know. I read the whole thread; it missed the point, so I wrote a draft revealing what had not been grasped, but didn't publish it. I was afraid that if you proceeded accordingly, it would end my nodes.
But now that I see the project is in need, I want to come and help, say it all,
just put me on the payroll.

No, it's random. The long-tail cancellation will take care of slow nodes: the uplink still requests 110 nodes (unique subnets) and cancels the slowest after finishing uploads of the configured number of pieces (currently 80).
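
As a rough illustration of that long-tail mechanism (a simplified sketch, not the real uplink code; only the 110/80 numbers come from the description above):

```go
package main

import (
	"context"
	"fmt"
	"math/rand"
	"sync"
	"time"
)

const (
	requested = 110 // nodes contacted, one per /24 subnet
	needed    = 80  // pieces that must land; the rest get canceled
)

// uploadPiece simulates one piece upload; it aborts when ctx is canceled.
func uploadPiece(ctx context.Context) error {
	select {
	case <-time.After(time.Duration(rand.Intn(500)) * time.Millisecond):
		return nil // this node was fast enough
	case <-ctx.Done():
		return ctx.Err() // long-tail cancellation: node was too slow
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		done int
	)
	for i := 0; i < requested; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if uploadPiece(ctx) != nil {
				return
			}
			mu.Lock()
			if done < needed {
				done++
				if done == needed {
					cancel() // enough pieces stored; cut the slow tail
				}
			}
			mu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Printf("stored %d pieces, canceled the slowest %d of %d nodes\n",
		done, requested-done, requested)
}
```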

There are ways to detect VPN nodes, and VPS nodes as well, however they are not 100% reliable.
I believe a solution exists. It would be better to implement it in the protocol than to try to catch every bypass method, because that's an endless cat-and-mouse game.


I believe the subnet limitation should be removed. It's clear that the network is full of VPSes, used both to bypass that limitation and to avoid CG-NAT.

VPN and VPS port forwarding only add latency to the network and limit bandwidth, since many VPSes come with capped bandwidth.

There are many nodes with IP addresses from other countries, and multiple IPs to attract more traffic, and that in itself is also a problem for clients looking to host their data only in the USA or only in the EU. What's the point of wanting data in the EU if in the end it goes EU satellite → EU VPS → USA node?

Another point of view: with the price reductions, running a VPS is often no longer profitable, earning a couple of dollars per TB and then spending 3-10 dollars for a decent VPS. I believe many node operators would be happy to no longer need to pay for a VPS. :wink:

There are many node operators with large bandwidth and good hardware who can run several nodes and provide good quality of service.

Perhaps a change should be considered, or a vote among the node operators.

That's my small opinion as a node operator since the beginning of Storj :slight_smile:


I don't think this would work in most node operators' favor. There are data centers out there with high-speed, low-latency connections that would absorb most of the data. The subnet rule lets data flow more evenly across the network. Removing it would mean less data for average node operators, not more.

The argument that VPS/VPN circumvention is happening anyway, so we should drop the filter, doesn't necessarily mean those users would stop using VPNs/VPSes. They might not believe we have removed the filter. They may also want data that is geo-fenced. So we'd still have the same issue.

There are lists of VPN and VPS IPs out there that we could blanket-ban from the Storj network if it is perceived as a problem. That would be a decision for the executives to make if they feel the network is in jeopardy due to VPN/VPS saturation. Of course, even then it wouldn't be perfect, and I don't think the process was ever meant to be 100%. It was just a way to help spread the data around more equitably without risking it becoming centralized.
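
For what it's worth, the mechanical side of such a ban would be simple; a sketch assuming some published list of VPN/VPS ranges in CIDR form (the ranges below are placeholders, not a real list):

```go
package main

import (
	"fmt"
	"net"
)

// knownVPNRanges stands in for a published VPN/VPS IP list;
// these documentation CIDRs are placeholders, not real entries.
var knownVPNRanges = []string{"198.51.100.0/24", "203.0.113.0/24"}

// isListedVPN reports whether ip falls inside any known VPN/VPS range.
func isListedVPN(ipStr string) bool {
	ip := net.ParseIP(ipStr)
	if ip == nil {
		return false
	}
	for _, cidr := range knownVPNRanges {
		_, network, err := net.ParseCIDR(cidr)
		if err != nil {
			continue
		}
		if network.Contains(ip) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isListedVPN("198.51.100.42")) // true: inside a listed range
	fmt.Println(isListedVPN("192.0.2.1"))     // false: not on the list
}
```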


Agreed. We can debate the effectiveness of the /24 logic… but it is a barrier against a small group of large SNOs taking all the uploads. Having to work to control multiple IPs may be easier or harder for different SNOs… but it's something. Don't remove it.

Like… never mind companies with data centers… this dude is doubling from 20 PB to 40 PB (and beyond)… in his garage. So he could soon host the entire paid Storj space, as a hobby, on the used gear stacked up beside his lawnmower :wink:

(He's doing Chia, not Storj… but the amount of space he controls isn't unique - so you get what I'm saying)


I do not agree: using a VPS/VPN increases costs for the abuser, and the easiest countermeasure would be to reduce SNO prices even more. I would suggest not abusing this loophole; it has several negative impacts:

  1. If we lose a segment (and using a VPN/VPS increases that chance, because the current statistical model considers node failures independent, so if the rule is followed we carry no extra risk, but if it is abused the risk grows exponentially), we lose the customer and our reputation, and it may destroy the network: no customers, no payouts to anyone. It's the same as cutting off the branch you're sitting on.
  2. We will be forced to cut SNO prices even more to make VPN/VPS nodes highly unprofitable. Again, that affects everyone.
  3. This one:

So it would be better to implement the protection in the protocol instead. But until that is implemented, the best protection we have is to enforce the /24 subnet filtering.

They may consider participating in the Storj Select program, if they are also eligible for SOC 2.


One way to mitigate the risk of nodes failing because they share a common dependency, even when they sit in separate /24s, is to make sure not too many pieces land in the same AS.

It could go something like this:

1. The satellite decides on 110 nodes to host the segment.
2. A check is performed against the global BGP table for every node, and a counter is incremented for each encountered originator AS number.
3. If one of the counters exceeds, say, 10 (this will be unlikely), the satellite chooses another 110 nodes. Repeat.

The satellite will need a copy of the BGP routing table (one can be downloaded periodically) and a fast lookup algorithm.

This would spread pieces evenly over VPN providers as well, so for someone's garage setup to become a problem, they would need to buy from many separate VPN providers. It will not be perfect, of course, but it could work as an additional line of defence in the very, very rare case that node selection has a bit of bad luck.
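
A rough sketch of what that per-AS cap could look like on the satellite side (illustrative only: the linear longest-prefix match stands in for the fast lookup mentioned above, and the prefixes and AS numbers are made up):

```go
package main

import (
	"fmt"
	"net"
)

// asnEntry maps a BGP prefix to its originator AS number. A real
// satellite would load millions of these from a periodic BGP dump;
// the entries used here are invented for illustration.
type asnEntry struct {
	prefix *net.IPNet
	asn    uint32
}

// lookupASN does a longest-prefix match over the table. A production
// version would use a radix trie for speed, per the post above.
func lookupASN(table []asnEntry, ip net.IP) (uint32, bool) {
	bestLen := -1
	var bestASN uint32
	for _, e := range table {
		if e.prefix.Contains(ip) {
			if l, _ := e.prefix.Mask.Size(); l > bestLen {
				bestLen, bestASN = l, e.asn
			}
		}
	}
	return bestASN, bestLen >= 0
}

// tooConcentrated reports whether any single originator ASN would hold
// more than maxPerASN of the selected nodes, per the proposal above.
func tooConcentrated(table []asnEntry, nodeIPs []net.IP, maxPerASN int) bool {
	counts := make(map[uint32]int)
	for _, ip := range nodeIPs {
		if asn, ok := lookupASN(table, ip); ok {
			counts[asn]++
			if counts[asn] > maxPerASN {
				return true // reject this selection; draw another 110 nodes
			}
		}
	}
	return false
}

func mustCIDR(s string) *net.IPNet {
	_, n, err := net.ParseCIDR(s)
	if err != nil {
		panic(err)
	}
	return n
}

func main() {
	table := []asnEntry{
		{mustCIDR("203.0.113.0/24"), 64500},
		{mustCIDR("198.51.100.0/22"), 64501},
	}
	nodes := []net.IP{net.ParseIP("203.0.113.7"), net.ParseIP("198.51.100.9")}
	// No ASN exceeds the cap of 10, so this selection would be accepted:
	fmt.Println(tooConcentrated(table, nodes, 10)) // false
}
```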