An ultimate backup! Pieces on all nodes!

Sorry, but there's no need to quote the whole post; otherwise it becomes unreadable. You can always click the arrow button to see the whole post.

You raised a lot of concerns regarding investments in hardware, which I cannot suggest, support, or promote; it also goes against our own recommendation to use what you already have.

For earning purposes there would be even more incentive to try to break the /24 subnet limit (which makes sure that only 1 of the 80 pieces is placed across those 8 nodes), because if they stop, they will receive 8 times less data, and even if it were paid almost 2 times more, that is still 4 times less than before.
I believe that if we increase the price for storage and remove egress payments, this non-altruistic actor will not stop using a VPN, because now they would get almost 2x more than before.

This should be properly modeled; if the proportion works out, it could be implemented.

I still think that without a proper incentive not to throttle egress, that is what will happen. If you are not paid for egress, a byzantine actor will likely throttle it by selecting a cheaper ISP plan (since you suggest making investments, everything should be considered).

Yes, that's true, BUT think what effect it would have on the network; it won't all happen instantaneously. First, suppose you are able to lower the egress cost for customers from $7/TB to $2.5/TB. I foresee that usage of STORJ will grow and data inflow could increase by God knows how much, but I think there is no option other than to arrange these prices better for the needs of SNOs, customers, and STORJ Inc. simultaneously, and find out!
Only then, depending on how much that data inflow increases, can the process of SNOs dropping VPNs start. But it's a process, and during it there will always be SOME operators with VPNs, still gaming the /24 rule! BUT if the increase in data flow is big enough that a 20TB HDD can fill in a reasonably short time, then VPNs become just an unnecessary cost!

I'm talking crazy fast here!
Like in 6-12 months, or even 3 months; then who needs VPNs to bypass the /24 rule? Besides their cost, VPNs are a constant pain: they can disconnect by themselves, or the app freezes. They impose transfer limits too; it may even turn out to be faster to fill the HDD without a VPN!
They also add latency, even for downloads, and I don't remember: is there a race for downloads (ingress) among nodes too?

We can get rid of any byzantine actor willing to cut egress or downgrade their internet plan:

  1. You set upload and download speed requirements for a given node size.
  2. You implement measures to check whether a node is keeping the agreement.
  3. You can suspend or disqualify a node if it is not (a toy sketch of such a check follows this list).
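
Something like this toy model of such an audit, where every threshold is invented just to show the idea (none of this exists in Storj today):

```python
# Hypothetical egress-speed compliance check for the scheme above.
# The required-speed rule and strike limit are invented for illustration.
from dataclasses import dataclass

@dataclass
class Node:
    stored_tb: float
    measured_egress_mbps: float
    strikes: int = 0

def required_egress_mbps(stored_tb: float) -> float:
    # Invented rule: 5 Mbps of upstream per TB stored, with a 10 Mbps floor.
    return max(10.0, 5.0 * stored_tb)

def audit(node: Node, max_strikes: int = 3) -> str:
    if node.measured_egress_mbps < required_egress_mbps(node.stored_tb):
        node.strikes += 1
        return "suspended" if node.strikes >= max_strikes else "warned"
    node.strikes = 0
    return "ok"

print(audit(Node(stored_tb=20, measured_egress_mbps=50)))  # needs 100 Mbps -> "warned"
```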

That covers egress. As for the internet plan: an SNO might be better off with a higher-tier internet plan, in order to fill the HDD faster and get paid more. We're talking about a scenario where STORJ grows to the point of an average of 20TB per SNO in 3-6 months, so an SNO could freely add HDDs to existing nodes without needing to bypass the /24 rule, because the filling would not be any faster with a VPN on each HDD. I'm talking about the benefit to filling HDDs from a better download speed without a VPN surpassing the benefit of having nodes behind VPNs, which get the same pieces on each node but fill more slowly in the end. Also, VPNs could not withstand that much traffic for the price: recently Mullvad VPN closed port forwarding, officially because of abuse in the content people hosted, but in practice? They had very good upload speeds and cost only $5/mo for 5 devices with unlimited bandwidth; I think the traffic got to them too. VPNs usually have low upload speeds, so an SNO, even without intentionally lowering egress, risks suspension or DQ if you implement a daily egress-speed audit.
I shouldn't say that, because who knows, maybe you will implement it quickly and I'll be in trouble too soon… I hope nobody reads this! :slight_smile:

I think I touched the heart of the matter… And a proverb comes to mind, whose gist is roughly: "People tend to do the sensible thing when all else fails." My concern is that you will do some of these things anyway, things you don't consider doing now, but it could be a little late, once the node count keeps falling and customers aren't coming. I think I foresee that. I don't want to take your time now; I'll maybe write back here if the counter shows far fewer nodes. Thank you for your responses, which allowed me to clarify more details.

That's the problematic part. The traffic flows directly between customers and nodes without any middleman (except when they use a Gateway MT). So you actually cannot check the speed. It also depends on the distance (and number of hops) between the node and the customer. The speed between your node and a customer in the next apartment will likely be higher than between your node and a customer on the other side of the Earth.
The speed between the node and auditors doesn't matter, unless we implement a distributed network of independent auditors (see Distribute audits across storagenodes). A distributed network of auditors could be used to get some believable statistics, which would be difficult to trick. However, it's difficult to implement.

Can't you just make a customer account, name it "auditor", and update the satellite options so they can select that one client account "auditor" and allow it to upload to every node there is?
If only a customer can measure, then become one: a customer who uploads a file to all nodes once, and then just downloads it every day, measuring the transfer speed. For example, a 5MB file, or whatever the smallest size is that still measures the speed correctly? At the same time, you would make it possible for the satellites to enable my idea of storing a piece on all nodes :smiley:

Sure, but they could do that with the gateway MT already. Which would already have a lot of actual traffic to monitor. The issue is not that they can’t create a single measuring entity, the issue is that that would only measure from a single source location, instead of from all around the world. They could of course create a flock of globally distributed “customers” or “auditors” to be a little bit more precise, but that’s quite a lot of overhead to manage. And a single source location just doesn’t provide enough information.

It could be done safely by increasing the total number of pieces in such a way that the reliability remains the same. I did some calculations a long time ago, but I don’t remember exactly. It was something like going from 29/80 to 60/120 that would give similar reliability. However, this has many different side effects as well. While it could speed up transfers for larger files, it will also lead to more connections and could increase overhead for smaller files. There could be an argument to also increase segment size to prevent pieces on nodes from getting even smaller than they already are, but that again has broad ranging impacts and limits the amount of parallel segment transfers larger files could use.

All of this said, I don’t think this should be a customer setting. If these changes are made, they should be made responsibly and by Storj Labs after doing the appropriate analysis. As for increasing the redundancy, this shouldn’t be necessary. Storj has already never lost a file and other aspects than node redundancy become the bigger risk anyway. (Storj Labs going out of business, satellite issues, coding mistakes causing catastrophic failure.) It doesn’t make sense to have the customer decide to increase only one aspect of redundancy, without them knowing the exact impact of that, while also leaving all those other aspects unchanged. In my opinion this would give a false sense of higher reliability and it would just be ripping the customers off and filling the pockets of node operators and Storj Labs. (On second thought… I’m a node operator, let’s do it! :smiley: )
For example, a customer might think that doubling the number of pieces doubles the reliability (29/80 → 29/160), but in reality this is hundreds or thousands of times more reliable (didn't feel like doing the math). The top post already shows this misconception by suggesting an outrageous amount of 1000 pieces for a segment, or even storing pieces on all nodes. This would also require a lot more CPU to do the erasure coding, btw. The upload would be extremely resource-heavy and take a long time. At some point the question is: do you want to pay the normal amount and lose no files, or do you want to waste resources and money by paying over 10 times as much and still… lose no files.
Storj can’t afford the reputation hit of losing files. So since that’s already a necessity, let them balance it to make sure files are stored reliably and don’t offer customers a meaningless reliability upgrade just to squeeze more money out of them.
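
To put rough numbers on that, here's a minimal sketch of the binomial math behind both the 60/120 comparison above and the 29/160 misconception. The per-piece survival probability is completely made up and far too pessimistic, and the real durability model (with repair) is more involved; only the shape of the curve is the point:

```python
from math import comb

def p_segment_lost(k: int, n: int, p_piece: float) -> float:
    """P(fewer than k of n pieces survive), each piece surviving
    independently with probability p_piece."""
    return sum(comb(n, i) * p_piece**i * (1 - p_piece)**(n - i)
               for i in range(k))

p = 0.9  # made-up per-piece survival probability, purely for illustration
for k, n in [(29, 80), (60, 120), (29, 160)]:
    print(f"{k}/{n}: P(segment lost) ~ {p_segment_lost(k, n, p):.1e}")
```

Even with these toy numbers, doubling the pieces (29/80 → 29/160) does far more than double the reliability; it shaves many orders of magnitude off the loss probability, which is exactly why a customer-facing "more pieces" knob would be so misleading.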

That’s a different story though. This could indeed really help and would be a much better reason to offer options. But I think this would really work best when the dynamic scaling suggested in the whitepaper is implemented. Peak demand is volatile and storing many pieces all the time is very wasteful. Besides, since this is inherently a high egress scenario, egress income could cover the cost of expansion (which would basically be the same process as repair, except to create new pieces instead of replacing lost ones) and the cost of additional storage temporarily.

That would do the opposite. You would go from an expansion factor of 80/29~=2.76 to 60/20=3. That would be more expensive and might even be less reliable. Maybe you meant 60/29, but that would definitely be less reliable, especially if you lower the repair threshold as well. Definitely a no go.

Yes, I believe 110 piece transfers are started and the slowest 30 are cut off.
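
Conceptually it's long-tail cancellation, something like this sketch (the timings are invented, and the real uplink is written in Go and more involved):

```python
import asyncio
import random

STARTED = 110  # parallel piece transfers launched (per the post above)
NEEDED = 80    # keep the first 80; the slowest 30 get cut off

async def transfer_piece(node_id: int) -> int:
    # Stand-in for a real piece transfer; per-node latency is invented.
    await asyncio.sleep(random.uniform(0.05, 1.0))
    return node_id

async def long_tail() -> list[int]:
    tasks = [asyncio.create_task(transfer_piece(i)) for i in range(STARTED)]
    winners: list[int] = []
    for fut in asyncio.as_completed(tasks):
        winners.append(await fut)
        if len(winners) == NEEDED:
            break
    for t in tasks:  # cancel the long tail still in flight
        t.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return winners

print(len(asyncio.run(long_tail())), "pieces kept")
```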


Oh, hi Bright!
Hah, I like it!
Let's have some fun!

Even if I just want to make sure a file will be archived for as long as the STORJ network operates?
Example: say I have old documents. I believe the network will live on, like Bitcoin does, potentially forever, so I scan them and upload them to STORJ as my archive of choice. I want the maximum possible assurance that they will be there as long as the network exists. I won't be downloading them too often, but I'm willing to pay more for good storage.

I'm a private person, but say I'm a government or an institute. A fair use case for such settings?

Hahahahha!
A 100MB file would be about 74GB to upload to all 22,000 nodes!
Or imagine the developer of a leading program releasing a new version,
or better, a game premiere with a 70GB installer.
A 70GB file would be 51.8 TB to upload! Ha ha haaaaaa…

My 300Mbps home connection would complete that task in…
(37.5MB/s = 2250MB/min ≈ 131.83GB/hour;
51.8TB = 53,043GB,
so 53,043GB / 131.83GB/hour ≈ 402 hours,
so ~17 days!)
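
For anyone who wants to check that arithmetic, a quick script (assuming 22,000 nodes and 29/80 erasure coding, so each piece is 1/29 of the file; units mix GB and GiB the same way the post does):

```python
# Sanity check for the "one piece on every node" upload arithmetic.
NODES = 22_000
FILE_GB = 70                  # the 70GB game installer from the example
piece_gb = FILE_GB / 29       # a single erasure-coded piece (k = 29)
total_gb = piece_gb * NODES   # one piece to every node
print(f"total upload: {total_gb:,.0f} GB ~ {total_gb / 1024:.1f} TiB")

speed_mb_s = 300 / 8          # 300 Mbps ~ 37.5 MB/s
gib_per_hour = speed_mb_s * 3600 / 1024
hours = total_gb / gib_per_hour
print(f"{gib_per_hour:.2f} GiB/hour -> {hours:.0f} hours ~ {hours / 24:.1f} days")
```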

Hahaa, but that option is definitely for professionals, not me, a home user.
But say, a few-KB wallet.dat I can upload easily.

I read section 6.1 that Alexey linked to, and I don't like this part:

"If a file's demand starts to grow more than current resources can serve, the Satellite has an opportunity to temporarily pause accesses if necessary, increase the redundancy of the file over more storage nodes, and then continue allowing access"

How would that look?
Would it be reliable?
Imagine a video becomes a sensation and suddenly a million people want to access it, and it's stored in 160 or 240 pieces. What then?
Will there be time for any pausing of access and increasing of redundancy (and to how much? How do you know how many more nodes to spread it over)?

Therefore I think that if you are publishing a website or have a new software release,
it is wiser to give the file more redundancy up front.
You still don't know how much, but at least you can give it more by default!

STORJ customers are no derps. This service is positioned as enterprise class.
I guess those people know what they are doing, and why, when they do it.
Maybe you should ask them how interested they are in such an option.
I know I would expect it to work by default, so I don't have to think about any manual settings, CDN-wise. But if STORJ doesn't have that? There should be some easy way for a customer to enable a file to scale to thousands or millions of people, either after spotting demand or up front. What do you think?

Because without this, you can't just invite more customers and more traffic to the network.
And that is crucial for STORJ's growth and survival: inviting people to host videos for a mass audience. But the files have to scale somehow, or those customers will be disappointed and angry.

Yeah, and look, I just think we need to better arrange these puzzle pieces of price and benefits for the current needs of all: SNOs, customers, and STORJ Inc. at the same time.

I laid out my reasoning for the changes.
No. 1: instead of $7/TB egress, charging $2.5/TB could greatly increase usage by customers,
resulting in an X-times increase for video sharing (which is ~80% of the internet's traffic)
and other multimedia.

Changes resulting from my proposal:

STORJ Inc. CHANGE:

  • +$1.5/TB egress profit (from $1 to $2.5 for every TB of traffic)
  • from a loss to a 15-25% profit on every TB stored (from -$0.13 to +$1 or +$1.6 per TB stored)

Customers CHANGE:

  • 2 times more for storage BUT 2.8 times LESS for traffic
  • nominally no overall change (from $4/TB storage and $7/TB egress to $7.9-8.5/TB storage and $2.5/TB egress)

SNOs CHANGE:

  • +$1 per TB of storage (from $1.5/TB to $2.5/TB)
  • BUT egress from $6/TB to $0/TB (or $X/TB if STORJ Inc. wants to share its $2.5/TB pool)*

*The problem is that egress is practically nonexistent now.
I blame the current customer rate of $7/TB, when classic cloud offers as little as €1.19/TB (with the first 20TB free), like hetzner.com.
In this situation SNOs are condemned to a slow death, with no egress "wind" and too low a payment overall. To repair this situation, SNOs could get a flat rate for node operation in total.
And maybe some additional reward for egress, such as $0.5/TB, or even $1.5/TB if STORJ Inc. is willing to keep its existing egress profit rate at $1/TB. The hope is that lowering the customer price for traffic would result in increased interest and traffic, which in time would even surpass SNOs' current earnings from egress at $6/TB, but at the new rate.

Additionally, SNOs would, from the start, end up with a higher overall payout even at $0/TB for egress than they currently get at $6/TB egress (a quick sketch below).
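
A minimal back-of-the-envelope sketch of that claim, using the rates above; the stored amount and the (deliberately low, "cold storage") egress ratio are assumptions for illustration only:

```python
# Compare SNO payout under the current vs. proposed rates from this post.
def monthly_payout(stored_tb: float, egress_tb: float,
                   storage_rate: float, egress_rate: float) -> float:
    return stored_tb * storage_rate + egress_tb * egress_rate

stored = 10.0   # TB stored on the node (assumed)
egress = 0.3    # TB egress per month, ~3% of stored (assumed, low)

current = monthly_payout(stored, egress, storage_rate=1.5, egress_rate=6.0)
proposed = monthly_payout(stored, egress, storage_rate=2.5, egress_rate=0.0)
print(f"current:  ${current:.2f}/mo")   # 10*1.5 + 0.3*6 = $16.80
print(f"proposed: ${proposed:.2f}/mo")  # 10*2.5          = $25.00
```

With low egress, the storage bump more than covers the lost egress pay; with high egress (say 2TB/mo) the comparison flips, which is exactly the tension discussed in this thread.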

Alexey wrote:

And isn't that what we face RIGHT now?
(Besides, no one said free egress for customers.)
Egress for SNOs is theoretically $6/TB, but in practice it is almost never paid.
In recent weeks ingress has been good, because nodes are leaving (was 26k, now 21.7k).
I'm switching my nodes' VPNs to WireGuard right now, so they can win more races for ingress pieces,
because my VPN in WireGuard mode has less upload but fantastic download speed and latency.
I will have faster ingress and low egress. Exactly what you feared.
I hope to fill my HDDs faster while offering lower egress availability at the same time.

Alexey! The situation you are describing is happening right NOW!
The very low egress in the network traffic says it all: they are using it as cold storage now.

I'm calling for implementing, ASAP, a node upstream-bandwidth audit, to make possible those changes that are crucial and vital for STORJ's survival!
Next, I'm calling for implementing, ASAP, the piece scaling based on Whitepaper | 6.1 Hot files and content delivery (page 63),
IN ORDER to make the much-needed changes possible, more or less as I was able to lay them out.

Alexey, can you somehow make the text of this post visible to readers of the "Update Proposal for Storage Node Operators - Open for Comments" thread in Announcements?

You may post a link to your post there, or I can move your post there.

Yes, even then. It’s a balance, file reliability as a result of redundancy covers just that single risk factor. Let’s say that factor ensures 11 nines of reliability. Perhaps a coding mistake leading to data loss is 13 nines of reliability. Would it then make sense to go for a redundancy reliability of 20 nines? Not even a little, because that is no longer the issue of concern. I guess I could understand a small settings range where it would make sense, but 1000 or even all nodes. Hell no. At that point you’ve made that aspect of risk less likely to happen than a world ending apocalyptic event.

Not in the numbers discussed. See my response above.

First of all, it's not just about the file size. Erasure coding uses CPU and RAM as well. Second, who has only 100MB of data to store? What if you wanted to store 100GB or 100TB? Even that is small-time for larger businesses.

Yeah, that would be a bad way to do it for sure. I would be okay with priority access for the job that creates more pieces, but blocking it is not a good way to do it. That’s a simple tweak though. The core concept is still solid.
I’m not opposed to an option to set higher availability from the start, but I would be opposed to marketing it as a higher reliability option. If it’s sold as a way to be able to serve large demand, I’m perfectly fine with it, but I still think even that should be dynamic in some way. So you don’t end up with customers setting a file to high demand and leaving all those pieces around while that demand is long gone.

I don't think determining the number of pieces should be put on the customer. Storj may not have it now, but it can be built. Why not provide the customer with an option to set files to scale pieces on demand? Just a single switch, instead of forcing the customer to manage all the Reed-Solomon settings. They don't need that hassle.

Btw. I’ve stayed away from the pricing discussion here for a reason. I think all the suggestions have been made and there is not much left to say about it. I hope Storj Labs is considering the suggestions and we’ll see where they land. I’m more interested in the technical side.

Love it!
If only this scaling could act fast, in real time, and have a sense of demand falling off, after which it would slowly reduce the surplus of pieces, saving storage cost.