Ingress with multiple nodes on the same IP

As we all know, ingress is split among all nodes behind a single public IP, or more precisely within the same /24 subnet.
So I wanted to find out if there are more nodes in my subnet by shutting down one of my 2 nodes.
However, the ingress of the remaining node stayed completely unchanged at about half of the combined ingress of both nodes. When I started the 2nd node again, the total ingress came back.

This is a graph over 6 hours:

So now I’m puzzled… Does it take longer than 2 hours for the satellite to realize my node is offline? (So it kept trying to upload to my offline node and therefore ingress on the 1st node did not increase?)


Does it take longer than 2 hours for the satellite to realize my node is offline? (So it kept trying to upload to my offline node and therefore ingress on the 1st node did not increase?)

You can probably test this hypothesis by logging connection attempts at the firewall level?
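Something as simple as a throwaway listener on the node's forwarded port would do it too. Here is a rough sketch in Go, assuming the default port 28967 (adjust to your setup), that logs every incoming connection attempt while the real node is stopped:

```go
// connlog.go - minimal sketch: listen on the node's forwarded port while the
// storagenode is stopped and log every incoming connection attempt.
// Assumes the default port 28967; change it to match your configuration.
package main

import (
	"log"
	"net"
)

func main() {
	ln, err := net.Listen("tcp", ":28967")
	if err != nil {
		log.Fatalf("listen: %v", err)
	}
	defer ln.Close()
	log.Println("logging connection attempts on :28967")

	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Printf("accept error: %v", err)
			continue
		}
		log.Printf("connection attempt from %s", conn.RemoteAddr())
		conn.Close() // we only want to see who tries to connect
	}
}
```

If uploads are still being attempted against the offline node, the remote addresses should keep showing up in that log.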

Probably some long lead time in the node selection process of the uploading clients…

The satellites might have leases much like a DHCP server… they know there are nodes registered on your IP, and since neither of your nodes has been disqualified or gracefully exited, nor shown up on a different IP, I would guess they just keep assigning data to your offline node and upload it once the node comes back…

I've seen something like that when I've had extended downtime… one stretch of 8-12 hours and another slightly shorter one, for some upgrade, OS reinstall and whatever else went wrong…

After extended downtime, when your node comes back it most often ends up with increased ingress… most likely due to this over-assignment of data.
Of course the live data cannot go onto your node while it's not there… but to keep data balanced and node profits evenly distributed, it seems the satellites have a target data amount for each node / IP…

I believe this also plays a role in how some SMR drives run into trouble… even though the node cannot keep up, the assigned data keeps increasing and keeps getting pushed onto the node. However, this has also been one of the areas they have been tinkering with, to put much less workload on the nodes… of course that has mainly been through fine-tuning of DB writes / DB options, delayed deletions, and such things…

But yeah, it doesn't really surprise me much that you couldn't just shut down a node briefly and see the network react…

It will most likely take quite a while of downtime… one question that does become quite apparent from this test of yours is: how long does it take, and does it even happen?

If you made a second node and then gracefully exited it, hopefully the data assignment isn't halved for all time :smiley: which would obviously be a bug… but something that could happen if somebody didn't think that far ahead…

From what I've seen, most nodes of about the same age that are not full will get a nearly identical ingress graph… you can use this to identify whether you are sharing your IP / subnet with other nodes…

At least in theory… it's also possible that the numbers come out uneven when comparing multiple nodes vs a single node…

Because if we think about data allotment: if you have 3 nodes and 4 data pieces are assigned, then to balance the load between the nodes they will get one each… so what happens to the last one? Does it simply count towards the next round, does the satellite just aim for a set amount of data per IP, or is it more node-based?


This link won’t answer your specific question but could provide some related information
http://storjnet.info/
It has a map of storage nodes; you can pan to your location and hold Ctrl while scrolling the middle mouse wheel to zoom in on the map. It seems to have placed my node within half a mile of my actual location. There are about 7 other nodes within a one-hour drive for me, but none of them are super close. I have no idea what ISP they have or how geographically clustered the IP addresses are in this area. Perhaps a port scan of port 28967 on the /24 would reveal more information, assuming other nodes use the default port.
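For what it's worth, such a scan is only a few lines of Go. This is just a sketch: the prefix is a placeholder, the half-second delay is an arbitrary choice to keep the pace slow, and you should only probe address space you are actually allowed to scan.

```go
// scan24.go - rough sketch: probe port 28967 on every host of a /24 and
// report the addresses that accept a TCP connection. Prefix, port and delay
// are assumptions; replace them with your own values.
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	const prefix = "203.0.113" // placeholder /24, replace with your own
	const port = "28967"       // default storagenode port

	for host := 1; host <= 254; host++ {
		addr := net.JoinHostPort(fmt.Sprintf("%s.%d", prefix, host), port)
		conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
		if err == nil {
			fmt.Println("open:", addr)
			conn.Close()
		}
		time.Sleep(500 * time.Millisecond) // slow pace, easier on any IDS
	}
}
```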

I made this one… to try and finally get some real data on these kinds of questions…

A port scan of all ports within a /24 subnet shouldn't take much time. The problem is, it might trigger the ISP's IDS and might be against the terms of use.

Maybe the Storj team could add a field to the dashboard that shows how many nodes are in the /24 network. It shouldn't be too difficult, as the number should already be present somewhere since it's needed for data distribution…


Yeah, I wonder if a very slow scan would be better. Not sure how long an IDS would keep previous events in memory.

No point in having it in the satellite←→storage node communication, it would be unnecessary overhead.

I'd rather see the satellite have a public API for listing active nodes. This is public information anyway, as any customer with a non-trivial amount of data will likely hit all active nodes when uploading. So an API like «how many active nodes are there within IP block A.B.C.__» would be nice.
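Purely to illustrate the idea, and nothing like this exists on any satellite today, the client side of such an API could look something like this sketch; the URL and response fields are invented for the example:

```go
// neighborcount.go - hypothetical sketch of the suggested API; the endpoint
// path and JSON shape are made up purely for illustration.
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

type neighborResponse struct {
	Subnet      string `json:"subnet"`       // e.g. "203.0.113.0/24"
	ActiveNodes int    `json:"active_nodes"` // count of active nodes in that block
}

func main() {
	// hypothetical endpoint: ask a satellite how many active nodes share a /24
	resp, err := http.Get("https://satellite.example/api/v0/neighbors?subnet=203.0.113.0/24")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var nr neighborResponse
	if err := json.NewDecoder(resp.Body).Decode(&nr); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s has %d active nodes\n", nr.Subnet, nr.ActiveNodes)
}
```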


That's actually a great idea… I think you should put it up in the suggestions / voting category.


It’s not really public information. It’s publicly discoverable… but port scanning is generally considered to be poor behavior – and possibly illegal in some jurisdictions.

In prior versions of the SN software, there was a local database of neighbor nodes. However, this was removed sometime in late 2019.

Please read carefully what I wrote. I do not advocate for port scanning. I advocate for either collecting this information by just being an active customer, or asking to implement some kind of node existence API in the satellite itself.

I don’t think the port scanning comment was intended specifically for you. I brought it up originally, but it’s not something I do personally as I do not want to be mistaken for a hacker.

Public Information may refer to several scenarios…

All information about SNOs should be restricted to trusted participants in the network. Each trusted participant has a signed x509 certificate. Any network-mapping information should be as non-public as possible in order to protect nodes from DDoS and other attack vectors.

Any person who uploads a few GB of data for literally cents can map out the active nodes, because customers in the Storj network connect directly to the storage nodes, so they have to know the IPs/ports, and they are given this knowledge by the satellites.

And this is by design.

That seems to work. I ran a TCP connection logger program and uploaded about 100 MB to Storj. I spread it out over multiple small files and also spread the uploads out over time, since I recently read that the uplinks cache node IP addresses for a few minutes. I filtered the log by process name and removed duplicates, which resulted in a list of about 1200 unique IP addresses. I would probably need to do this for longer to get a more complete picture. I found my own IP address in the list of remote connections, so I assume I sent one or more pieces of data to my own node, but I did not see any other IP address in my own /24. Then again, 1200 IPs is not a complete list either.
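For reference, the post-processing step can be done with a few lines of Go. This sketch assumes a plain-text log with one remote IP per line; the file name and subnet prefix are placeholders:

```go
// dedup24.go - sketch of the post-processing described above: read one remote
// IP per line from a connection log, count unique addresses, and flag any that
// fall inside your own /24. File name and subnet prefix are assumptions.
package main

import (
	"bufio"
	"fmt"
	"log"
	"net"
	"os"
	"strings"
)

func main() {
	const mySubnet = "203.0.113." // your own /24 prefix, placeholder

	f, err := os.Open("connections.log") // one remote IP per line
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	unique := map[string]bool{}
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		ip := net.ParseIP(strings.TrimSpace(scanner.Text()))
		if ip == nil {
			continue // skip lines that are not plain IP addresses
		}
		unique[ip.String()] = true
	}

	neighbors := 0
	for ip := range unique {
		if strings.HasPrefix(ip, mySubnet) {
			neighbors++
		}
	}
	fmt.Printf("%d unique IPs, %d inside %s0/24\n", len(unique), neighbors, mySubnet)
}
```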


Right. If a node is full, it’s not very likely it will be suggested…

True, but at least they won't be competing for ingress traffic, assuming the IP selection algorithm is working properly on the satellite. But as we have seen from the original poster's experiment, we don't really know how that works. I'm guessing it's all there in the source code, but I'm not that curious right now.

The satellite's source code is also public knowledge, and this specific file looks promising. It's all “documented” in the form of code.