Limit node transfers through node selection

EDIT: WARNING: writing forum posts while teaching young kids leads to errors.

The trouble is, the “unlimited” setting isn’t really the problem. An “unlimited” option existed before as well.

Your third solution addresses the bandwidth issue nicely by changing the denomination from “number of requests” to an actual bandwidth number.

This is similar but not exactly the same as the removed bandwidth limitation. That limitation could have been exceeded in a short amount of time if the node’s connection was fast enough. I really like what you proposed. It’s reasonable and offers fine-grained enough tuning to account for a node operator using the Internet connection for other purposes.

As an SNO affected by the IO issue, I only want a supported way to move the node databases to a faster drive. The node itself doesn’t seem to mind slow writes; only the database does.

Speaking of which, why is there even a database? The satellite should handle everything on its own except deletes (which need to be cached locally) and the disk space usage quota (the node’s own business).

The satellite chooses the node list… but the client interacts with those nodes directly for ingress and egress.

The default is already unlimited and that IS a problem for people. There is a reason the old setting is still being used by nodes.

The way I described it, it still limits the number of transfers. But I guess it would be possible to store piece sizes as well and pivot that into an actual bandwidth limitation. However, I think transfers incur random write overhead from database updates as well, so I could see advantages to either option. I hadn’t even considered the fact that this could also open the door to nodes with bandwidth caps.

I recommend you take a look inside those db files and see what they are for. But here are some hints: keeping track of used serials so uplinks can’t keep downloading your data without you getting paid; all the stats on the dashboard; when pieces expire and your node can remove them; etc. These aren’t things that can be done without keeping track of some data.

The instantaneous bandwidth requirement has always been unlimited… I never reached 100% of my allocated bandwidth, but the entire month of January was filled with days that reached 75% of my available Internet connection.

The problem with removing bandwidth caps is different from the one facing those working from home with now-slower Internet connections.

I wasn’t talking about the bandwidth limit. That change isn’t relevant here. I was talking about the max concurrent requests limit. The unlimited default for that setting is a problem for users who now set it to keep their node able to deal with the traffic, or to prevent their Internet connection from being slowed to a crawl. The bandwidth limit has nothing to do with that.

These two things are separate problems.

The number of connections and the bandwidth consumed are not the same thing. SNOs attempting to limit bandwidth consumption by dialing down the number of simultaneous connections are likely causing bandwidth efficiency issues for their node. They would be better served by placing a bandwidth restriction on the node’s network interface. This can be done already while remaining within the ToS bandwidth limitation and alleviating the Internet congestion a given node operator is experiencing… provided that Internet connection is at least wide enough to accommodate the minimum ToS bandwidth plus the node operator’s other activities.

This setting gives a connections-per-minute dial.

And that’s much closer to setting the network interface to a particular bandwidth than simply setting the overall total of allowable simultaneous connections.

I don’t think so. While using more TCP connections can provide some bandwidth efficiency, a higher number of transfers also means more random IO, and it is drive IO that is limiting those nodes. In this case it is better to limit the number of connections or the connection rate. Once the data is read from the drive, it can be given to the client at maximum connection speed.
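
For what it’s worth, a connections-per-minute dial is easy to sketch. This is just an illustration using Go’s golang.org/x/time/rate token bucket; the names and numbers are made up, it is not actual storagenode code:

```go
package main

import (
	"fmt"
	"time"

	"golang.org/x/time/rate"
)

func main() {
	// A "connections per minute" dial: the bucket refills at
	// maxPerMinute/60 tokens per second, with a small burst allowance.
	maxPerMinute := 30.0
	limiter := rate.NewLimiter(rate.Limit(maxPerMinute/60.0), 5)

	for i := 0; i < 10; i++ {
		if limiter.Allow() {
			fmt.Println("accept transfer", i)
		} else {
			fmt.Println("reject transfer", i, "(over the per-minute dial)")
		}
		time.Sleep(200 * time.Millisecond)
	}
}
```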

If that were the case, yes. However, the problem that RPi (and similar low-power node) users have is that the disk IO or the CPU gets congested. Limiting the number of concurrent transfers helps with this.
A single SMR drive on USB2 can only do so much.

Only in the case of the mounted drive being used by other services or shared with the host OS. The recommended node setup is a dedicated drive…

This is a third problem which is not related to bandwidth usage.

Slow nodes are going to be slow. Success rates for slow nodes will be lower. And that’s perfectly fine.

No, this whole suggestion is about slow nodes and not low bandwidth.

Basically, the node gets too many requests, cannot process them fast enough, and ends up with a 5% success rate.
The SNO opens the config file, sets maximum concurrent requests to some low number, and now the node rejects some requests, but the ones it does not reject get completed faster, so the node gets a higher success rate and overall more successful uploads/downloads.
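
For clarity, here is a minimal sketch of that behaviour: a cap on concurrent transfers implemented as a semaphore, where anything over the cap is rejected immediately rather than queued. The names are made up for illustration; this is not the actual storagenode code:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

var errTooManyRequests = errors.New("node busy: max concurrent requests reached")

// limiter caps how many transfers are handled at once; anything over the cap
// is rejected immediately instead of being queued.
type limiter struct {
	slots chan struct{}
}

func newLimiter(maxConcurrent int) *limiter {
	return &limiter{slots: make(chan struct{}, maxConcurrent)}
}

func (l *limiter) handle(transfer func()) error {
	select {
	case l.slots <- struct{}{}: // a slot is free, take it
		defer func() { <-l.slots }() // release it when the transfer is done
		transfer()
		return nil
	default: // no free slot: reject so the uplink can try another node
		return errTooManyRequests
	}
}

func main() {
	l := newLimiter(2)
	block := make(chan struct{})
	var wg sync.WaitGroup

	// Two long-running transfers occupy both slots...
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = l.handle(func() { <-block })
		}()
	}
	time.Sleep(100 * time.Millisecond) // give them time to grab the slots

	// ...so a third request is rejected instead of piling up.
	fmt.Println(l.handle(func() {}))

	close(block)
	wg.Wait()
}
```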

The problem is that for a node to reject a request, the satellite has to select that node first (and in turn not select some other node). If too many nodes end up rejecting the request, an upload or download may fail.

So the suggestion is to somehow make the satellite keep track of how many concurrent requests the node has right now and just not select overloaded nodes.

This has nothing to do with internet connection speed or bandwidth limit.

However, in the case of a slow internet connection it is also preferable to limit the number of concurrent requests, even if that results in less efficient utilization because of fewer TCP connections. The reason is simple: each transfer has to complete as fast as possible. If my connection can, say, serve 20 concurrent transfers and complete each one in 10 seconds, or serve 5 concurrent transfers and complete each one in 5 seconds, the second option is preferable. With the first option my node will lose almost all races and have a success rate of almost zero, while with the second option it actually has a chance of completing each transfer in time.
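
Putting rough numbers on that trade-off (just the arithmetic from the example above, nothing node-specific):

```go
package main

import "fmt"

func main() {
	options := []struct {
		name       string
		concurrent float64
		perXfer    float64 // seconds per transfer
	}{
		{"A: 20 concurrent, 10 s each", 20, 10},
		{"B:  5 concurrent,  5 s each", 5, 5},
	}
	for _, o := range options {
		throughput := o.concurrent / o.perXfer // completed transfers per second
		fmt.Printf("%s -> %.1f transfers/s, %.0f s latency per transfer\n",
			o.name, throughput, o.perXfer)
	}
	// A actually moves more pieces overall (2/s vs 1/s), but every single
	// transfer takes 10 s, so it loses the races; B finishes each transfer
	// in 5 s and has a real chance of ending up in the winning set.
}
```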

I think that most of this discussion pretty much misses the point of the technology behind Storj. This is a distributed system that takes a lot of care to avoid centralizing decisions, to the point that there is no longer any central entity capable of regulating traffic to a specific storage node, and introducing one would be a big endeavor.

Satellites can’t do that, because there are many satellites. If the technology becomes successful, these satellites will be controlled by independent entities. It is wishful thinking to hope that they will somehow coördinate among themselves when they were created explicitly to avoid the need for coördination. Coördination also limits scalability.

It wouldn’t make sense for the node to report, say, a “current number of connections available” or even just a boolean available/not-available flag. Too much traffic, too much processing on each satellite.

Another problem is malicious users, who could devastate the network by requesting a lot of connections and never actually making them.

Also, when would the satellite even realize that a given transfer has finished? It needs to know in order to handle payments, but I don’t think the protocol requires this to be immediate; instead the storage node effectively batches these notifications, again to lower processing costs.

Hence I believe any attempt at making satellites apply hard limits to avoid overwhelming storage nodes will be inefficient or impossible to implement. The only reasonable action on the storage node side would be to use heuristics (say, based on rejection rate or races in some time window) to suggest that customers contact some nodes in preference to others, i.e. @BrightSilence’s first choice. Which, if I remember correctly, was already considered by Storj.

Also, I’m not so sure max-concurrent-requests actually hurts customers more than limiting bandwidth. In the case of dropped connections, libuplink immediately knows to contact a different node for an upload. If, instead, there were a bandwidth limit, then we get a race, and only after wasting some time and resources on the data transfer does the user realize other nodes are faster.

I feel like you have a really good grasp on the challenge. I pretty much agree with everything you said. Satellites operate independently, so we can’t have any solutions that require coordination between them. I don’t believe any of the proposed solutions require this. But option 2 as I described it does run into the issue of requiring way too much coordination overhead between node and satellite. I already mentioned this in the outline, and I agree that it’s likely not feasible.

You kind of lost me here though:

The proposed solutions only offer a way to limit transfers where they are otherwise unlimited. Unlimited would still be the default for all nodes that can handle it. You can’t request more than unlimited. And the option to reject would be removed if an alternative is implemented. Of course SNOs could alter the software to still reject connections, but that’s no different from what can be done right now. It may need to be resolved, but it’s separate from what’s being discussed here.

Orders are being sent once an hour. So this is indeed not a process that can be used to determine the number of active connections.

It’s not a sequential process like you are suggesting. Uploads are started to all 110 selected nodes, and when 80 finish, the rest are interrupted. If fewer than 80 finish at all, the uplink will throw an error. As far as I’m aware, it doesn’t automatically ask the satellite for more nodes to upload to. And if this was a segment near the end of a several-GB file, the entire upload fails and leaves behind zombie segments that need to be cleaned up. Therefore, rejecting transfers could definitely be a lot more harmful. However, my suggestions also don’t say anything about bandwidth limits. They merely suggest that nodes which might have constraints on the number of transfers they can handle should not be selected for the transfer at all. The result will be 110 nodes that either have no limit or have capacity within their limits. So 110 fast nodes. This will almost certainly lead to reaching that 80-success threshold, and almost certainly much faster than a set of 110 nodes of which quite a few may be struggling to keep up.
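
To illustrate the long-tail behaviour described above, here is a rough sketch: start the piece transfers to all selected nodes and cancel whatever is still running once the success threshold is reached. The 110/80 numbers come from this discussion; the code itself is illustrative and not the real libuplink:

```go
package main

import (
	"context"
	"fmt"
	"math/rand"
	"time"
)

// uploadSegment starts a piece upload to every selected node and cancels the
// remaining transfers as soon as `needed` of them have succeeded.
func uploadSegment(selected, needed int) error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	results := make(chan bool, selected)
	for i := 0; i < selected; i++ {
		go func() {
			// Stand-in for a real piece upload: random duration, may be cancelled.
			select {
			case <-time.After(time.Duration(rand.Intn(200)) * time.Millisecond):
				results <- true
			case <-ctx.Done():
				results <- false
			}
		}()
	}

	succeeded := 0
	for i := 0; i < selected; i++ {
		if <-results {
			succeeded++
		}
		if succeeded >= needed {
			cancel() // long-tail cancellation: the slowest nodes never finish
			return nil
		}
	}
	return fmt.Errorf("only %d of %d pieces uploaded, need %d", succeeded, selected, needed)
}

func main() {
	fmt.Println(uploadSegment(110, 80))
}
```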

I have already mentioned why option 2 is likely not possible, and you seem to agree with that. I also don’t like option 1 much, because it still allows nodes to reject transfers. The satellite would try to avoid those nodes more, but it could still lead to failed uploads, especially during sudden peak loads. Additionally, during slow times there is no need to select those same nodes any less than other nodes, as they would easily be able to keep up. So if you ask me, option 1 is off the table as well.

What remains is option 3. I think your argument against that one is that satellites operate independently and coordination between them should not be assumed (I would go as far as to say that’s impossible, given how the network is set up). There are two options here, with their own advantages and downsides.

3a. The maximum number of uploads/downloads per minute applies to each satellite. So if you set it to 30, each satellite can start 30 connections per minute. The upside is that it’s simple: the setting can be communicated to the satellite, and the satellite can just follow that guidance. The downside is that if the number of satellites grows significantly in a short period of time, it could still lead to problems.

3b. The storagenode divides the maximum number of uploads/downloads per minute by the number of trusted satellites before sending that number on to each satellite. So a setting of 150 divided by 5 trusted satellites means each satellite receives a limit of 30 per minute. The upside: when new satellites get added, the node adjusts for that. The downside: it’s highly unlikely that traffic will be spread equally among satellites. Most satellites likely don’t need any limitation, and those that do suddenly get a lower limit when satellites get added. So this could have the opposite effect of limiting some satellites too much.
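
To make the difference concrete, here is a tiny sketch of how the per-satellite limit would be derived under each option (the function names are made up):

```go
package main

import "fmt"

// Option 3a: the configured per-minute limit applies to each satellite as-is.
func limit3a(configured, _ int) int { return configured }

// Option 3b: the node divides the configured limit across its trusted satellites.
func limit3b(configured, trustedSatellites int) int { return configured / trustedSatellites }

func main() {
	fmt.Println("3a:  30 configured,   5 satellites ->", limit3a(30, 5), "per satellite per minute")
	fmt.Println("3b: 150 configured,   5 satellites ->", limit3b(150, 5), "per satellite per minute")
	fmt.Println("3b: 150 configured, 100 satellites ->", limit3b(150, 100), "per satellite per minute")
}
```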

It seems either solution would need some tuning to find the right balance. And if that’s the case, I would much prefer the simpler solution 3a. The number can always be adjusted if it becomes a problem again. And at least your node wouldn’t slowly start getting less traffic without the SNO noticing, which is a risk with 3b.

What are your thoughts on this solution?

How about this:

Each node starts with a performance coefficient of 1, and the coefficient cannot be higher than 1.

If a node is overloaded (as determined by the SNO via a concurrent request limit, a CPU iowait limit, or a unix load limit), the node contacts each satellite and informs it that it is overloaded. The satellite then reduces the performance coefficient by some amount.
When the load drops below a lower limit and stays there for 10 minutes, the node contacts the satellite and informs it that it is almost idle. The satellite increases the performance coefficient by some amount that is smaller than the decrease.

The coefficient is used when selecting a node. If a node would be selected, a random value between 0 and 1 is generated; if the value is greater than the coefficient, some other node is selected instead. In other words, a node is kept with probability equal to its coefficient.
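
Roughly how I picture the bookkeeping and the selection check, as a sketch (the step sizes and limits are placeholders, not a proposal for exact values):

```go
package main

import (
	"fmt"
	"math/rand"
)

// nodeState is tracked per node on the satellite; a coefficient of 1 means
// "use freely", lower values make the node proportionally less likely to be picked.
type nodeState struct {
	perfCoefficient float64
}

// reportOverloaded is called when the node says it is overloaded.
func (n *nodeState) reportOverloaded() {
	n.perfCoefficient *= 0.5 // big step down
	if n.perfCoefficient < 0.05 {
		n.perfCoefficient = 0.05
	}
}

// reportIdle is called after the node has been almost idle for a while;
// recovery is slower than the decrease.
func (n *nodeState) reportIdle() {
	n.perfCoefficient += 0.1
	if n.perfCoefficient > 1 {
		n.perfCoefficient = 1
	}
}

// keep decides whether an otherwise-selected node is actually used:
// it is kept with probability equal to its coefficient.
func (n *nodeState) keep() bool {
	return rand.Float64() < n.perfCoefficient
}

func main() {
	n := &nodeState{perfCoefficient: 1}
	n.reportOverloaded() // coefficient drops to 0.5
	kept := 0
	for i := 0; i < 1000; i++ {
		if n.keep() {
			kept++
		}
	}
	fmt.Printf("coefficient %.2f -> kept in roughly %d of 1000 selections\n", n.perfCoefficient, kept)
}
```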

I like it, but I fear it runs into the same issue as option 2 in the top post. It would require quite a bit of coordination between node and satellite that isn’t there right now. It might run into scalability issues.

Though by limiting the frequency at which the node updates the satellite, it might be possible. I’ll add it as option 4 in the top post!

In theory under relatively constant load the value should stabilize, then increase during idle periods and decrease during heavy load.

Hey BrightSilence,

thank you for your input. First, let me point out some global problems: implementing any of your ideas would result in higher centralization of data. What’s more, it would create a possibility to manipulate the distribution. But all in all, you are right that we need a weighting system, and we are already working on one. The gap between the smallest and the largest/most performant nodes is too high.

Doing this would result in a more complex and slower node selection process; we would lose valuable time in selection because we would need to query and acquire more data. There is a dependency between upload and download: a node which has trouble with uploads will get into trouble with downloads too. This approach would result in slower disk filling, but it would still fill the disk. If the node is too slim from a hardware perspective, it doesn’t make sense to store more and more data on it (from the satellite’s point of view!).

Limiting the number of downloads is not an option. If a paying customer wants to download their data, they should be able to do so. Any time. We have an expansion factor to compensate for unreachable or lost nodes, but taking other nodes out of the selection will cost speed or, in the worst case, result in the file being inaccessible. Second, the satellites are not connected; a satellite has no knowledge of usage on other satellites.

Ok, makes sense!

Not claiming that it would be sequential, though I did think that uploads are initially only started for a subset of suggested nodes. If so, I’d consider having the satellite suggest 140 nodes, make the customer upload to 110 and in case of dropped connections, use the other 30 to fill the gaps. I believe this would be a much smaller change to the current code, yet equivalently effective.

Splitting resources equally still doesn’t make sense. Azure has about 100 “regions”. If Storj had even just that many separate satellites, you’d end up requiring a fractional number of uploads, or some other tiny number, per satellite. And I’d fully expect to see more than that if this technology becomes successful, with it becoming natural that some satellites are orders of magnitude bigger than others.

While the discussion is interesting…

The basic problem is that slow nodes based on non-enterprise hardware aren’t going to last long in a full production environment… let alone an SoC experimenter board. A node running on a Raspberry Pi might have a good run for a number of months with low levels of test data. However, if I recall, the one month that simulated a large, fully loaded network… January 2020… resulted in quite a few lower-specced nodes showing overloaded hardware as well as dying hard drives, and that was before the removal of the number-of-connections and bandwidth parameters.

Long term, a network of RPi connected drives is just not going to work in production. I remember seeing something similar written by Storj/Tardigrade back in Fall 2019 on their “Launching Soon” web page. They put it a little nicer… something like, “While a network of Raspberry Pis is… something, something… Our minimum spec is… (not RPi)”

Any real-time or near-real time performance tweaking between SNs and Satellites is simply not going to scale. Such solutions are not practical and are not worth addressing.

The only ways to ensure RPi nodes don’t get left behind are to lower the Storj network’s performance or create a tiered network. Limiting the number of connections is not going to solve the long-term issues related to running a commercial application on the lowest-end consumer hardware.

That could be, though IIRC right now RPi + a single HDD is the recommended setup. I guess if there are a lot of them, then it would be OK.

You know, what if we take my suggestion (option 4) and make it static, configured by the SNO? By default it is at the maximum, but you can lower it if your node gets constantly overloaded.

Just more nodes would be great. A wider distribution would solve many problems.
