I agree with @DD7 here. Although decentralization is a key concept in the Storj network, scalability options for node operators should also exist.
The inflow of SNOs could saturate at some theoretical point in the future, which could create scalability issues in the network itself.
I think some kind of “proof of interest” concept should be implemented for SNOs. Maybe not the best idea, but something like: add some stake (PoS) to the network, so we know you are serious about being a bigger SNO, and you can have more nodes on the same subnet, up to the point where it won’t be a problem for the decentralization concept of the network when (not if) you have problems providing the service.
I see that it is already happening here for SNOs with multiple nodes routed through VPNs. They are paying for VPN services, which act as PoS in my eyes, and Storj (probably) quietly accepts that.
Yes, it costs money, but here is the thing: even paying for the VPN to run multiple nodes with “fake” IPs, you still get better ROI than running a single node and waiting months or years for it to fill just a few terabytes.
That means people are incentivized to fake decentralization. Given the economics, how many of the nodes out there do you think are not “fake”?
The idea behind Storj is decentralization of storage, which means that the ideal scenario is that all nodes have an equal amount of data and traffic. Having some “special” nodes with more data/traffic than usual goes against it. This happens naturally to some extent, since not all nodes have the same speed or capacity, so the smaller ones fill up completely while bigger ones continue to accumulate data.
Having one node per /24 subnet (as opposed to one node per IP) makes sense, since if there is a problem with the ISP, the whole /24 is likely to go down at the same time. Different ISPs assign IPs differently, but there is no real way to have special cases for everyone, so the /24 limit is an OK compromise.
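To illustrate the idea (this is a hypothetical sketch, not Storj’s actual selection code): nodes whose IPs fall in the same /24 network can be grouped so that the whole group counts as one selection slot.

```python
from collections import defaultdict
from ipaddress import ip_network

# Hypothetical node IPs; in reality the satellite does this grouping server-side.
node_ips = ["203.0.113.10", "203.0.113.77", "198.51.100.5", "192.0.2.200"]

def subnet24(ip: str) -> str:
    """Return the /24 network an IPv4 address belongs to."""
    return str(ip_network(f"{ip}/24", strict=False))

groups = defaultdict(list)
for ip in node_ips:
    groups[subnet24(ip)].append(ip)

# One selection "slot" per /24, regardless of how many nodes share it.
for net, members in groups.items():
    print(net, "->", len(members), "node(s), 1 selection slot")
```

Here the two 203.0.113.x nodes collapse into a single slot, which is exactly why routing nodes through VPN IPs in different subnets defeats the limit.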
There is no way to prove that I am running my node like a datacenter (or in a real datacenter), or that I am running multiple nodes on different hardware, and there is no way to limit that. I could start a second node on the same pool, just using my backup ISP, and there would be no way for Storj to know.
That said, there is a lower limit of earnings below which it is no longer worth the effort to look after the node (“set it and forget it” may work for miners, but not for Storj, since downtime can cause losses beyond what could be earned during that time), and this is not even counting expenses like hardware and power.
STORJ will kill all the cloud services. Storage costs $4 per terabyte compared to $23 at AWS. The numbers are there. No sensible business will stay with AWS (or Google Cloud, etc.) even if Storj doubled its price.
But one big question from the demand side is: how are the nodes incentivized to keep operating?
Imagine you are a fortune 500 company wanting to migrate to STORJ, just to find out that all the nodes are run by either:
a. underpaid decentralized hobbyists who can shut down on a whim for various reasons, most likely economic ones, or
b. overpaid centralized large whales with hundreds of fake VPN IPs, pretending to be decentralized.
You wouldn’t bet exabytes of customer data on that, would you?
What they want to hear is:
c. fairly paid decentralized operators who are incentivized to invest in and maintain a certain quality of service, and NOT to fake decentralization.
Please note that right now, supply is much higher than demand. This means that so far hobbyists, idealists, and enthusiasts are simply enough; Storj just doesn’t store enough data to bring business to professionals. That doesn’t rule out involving and attracting professionals in the future, I’m just stating it’s not necessary yet.
As for gaming the system, there is an idea to determine node locations more accurately. The moment Storj Inc. figures out that cheating is too widespread, they will implement it. Again, right now cheating does not seem to be a big enough problem yet.
As for incentives for node operators, Storj Inc. never stated that they’re fixed forever. They will need to be changed to adapt to the changing environment. Maybe at some point some incentives will be developed to attract professionals. Right now, the current incentives seem to attract enough storage owners already.
You’re looking at a young system that does not implement all ideas possible yet, but there are plans to. And if you are impatient, well, Storj Inc. accepts patches.
This makes sense, but I do not know how to do it.
Right now, the price for customers is $4/TB stored and $7/TB egress. Even if all of that money went to the nodes, the nodes would get something like $1.45/TB stored and $2.54/TB egress (because 1TB of customer data takes up 2.76TB on the nodes).
This would require large nodes with lots of egress to make even $50/month.
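A back-of-the-napkin version of that math (the prices and the 2.76× expansion factor are from the post above; the example node size and egress ratio are made up for illustration):

```python
# Customer-facing prices, from the post above
price_storage = 4.00   # $/TB stored per month
price_egress  = 7.00   # $/TB downloaded

# 1 TB of customer data expands to ~2.76 TB on the network (erasure-coding overhead)
expansion = 2.76

# Even if 100% of revenue went to nodes, per-TB-on-node rates would be:
node_storage_rate = price_storage / expansion  # ~ $1.45 per TB stored
node_egress_rate  = price_egress  / expansion  # ~ $2.54 per TB egress

# Hypothetical node: 10 TB stored, 5% of stored data egressed per month
stored_tb, egress_tb = 10, 0.5
monthly = stored_tb * node_storage_rate + egress_tb * node_egress_rate
print(f"${monthly:.2f}/month")  # well under $50, even with zero platform cut
```

So a node needs either a lot of stored data or unusually heavy egress before it clears $50/month.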
Another part is the fact that nodes cannot be trusted. There is no way to prove or disprove that multiple nodes are running on the same or separate array, that the node operator has backup power or a backup ISP.
Filecoin does it with large initial requirements. It seems there is no point in starting a Filecoin node without a full rack of modern servers. I guess that’s one way to make sure only “serious” people do it, but it is not very decentralized then.
Having lots of small nodes probably is better for reliability, but it may indeed look bad (“hey, boss, I think we should use this company to store our files, the files get stored on lots of raspberries in lots of homes”).
AWS and similar have the advantage that you can run VMs on the same provider and probably pay less to access the data (or at least have it faster). There is no way to run even PHP on Storj (and sure, Storj is only for storage), but at the same time I’m struggling a bit to figure out use cases for it other than backups.
We appreciate the feedback, and keep it coming! We’re going to go into increasing levels of detail on these ideas over the next few months. We’ll be hosting bi-weekly Twitter Spaces and ultimately publishing an economic whitepaper for external feedback in Q1 next year.
If you can join, feel free to bring questions and suggestions. If you can’t join but want to ask questions, we’ll publish a way to submit them in advance so we can cover them. All the Twitter Spaces are recorded.
We’re really seeing a steady increase in demand right now, and we’re anticipating needing more storage nodes. We want to make sure we have steady growth, but not too much growth ahead of demand, so that it’s economically rewarding to be a storage node. We also need to ensure storage node operation works for both data centers and individual operators, and we need to make some changes so that the unit economics work out long term for Storj as a satellite operator and for future satellite operators.
There’s another angle on the demand side. Customers have found the geo-fencing feature valuable and have requested a variety of different node selection criteria - high speed, SOC 2 facilities, more granular geo-fence areas, reduced egress, or reduced redundancy - and we’re looking at how these requests fit into the model.
We’re making our process as transparent as possible, but it’s clear we’re getting traction. Please keep the ideas coming and join us on Twitter Spaces. We may try other formats, but right now this one seems to be working reasonably well.
That seems a little optimistic looking at previous months. This month so far is looking a little better than average though. But since previous months were around 600-700GB, I’m not yet going to assume this month is representative.
@DD7 I’ve updated the earnings estimator with more recent traffic data. It looks a little better, but because more data stored also means more data having a chance of being deleted, there is a theoretical limit to how much data a single /24 IP subnet can store, currently around 80TB. At that point the amount of deletes and ingress will roughly match and there will no longer be net growth.
Additionally, about 25% of new ingress is deleted within the same month… so raw ingress numbers are not representative of node growth.
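The equilibrium can be modeled very simply: if monthly ingress is roughly constant and a roughly fixed fraction of stored data gets deleted each month, growth stops where deletes equal ingress. The parameters below are illustrative guesses chosen to hit the ~80TB figure, not the estimator’s actual numbers.

```python
# Illustrative parameters, not the actual estimator values
ingress_tb_per_month = 0.7      # net new data landing on the subnet per month
delete_fraction      = 0.00875  # share of stored data deleted per month

# Equilibrium: ingress == stored * delete_fraction
equilibrium_tb = ingress_tb_per_month / delete_fraction
print(equilibrium_tb, "TB: net growth stops here")

# Simulate ten years to watch the curve flatten toward the limit
stored = 0.0
for month in range(120):
    stored += ingress_tb_per_month - stored * delete_fraction
print(f"after 10 years: {stored:.1f} TB")
```

The simulation never quite reaches the limit, it only approaches it asymptotically, which is why filling up takes years even with steady ingress.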
It can, but with current traffic it will take about 5 years to get there.
The network needs scale, not individual nodes. But when that scale is needed, nodes will start filling up faster and faster. Currently every node with available space gets its share, and clearly no more scale per node is needed at the moment. It’s simple supply and demand determining this requirement. But for the network to work best, it’s much better to have more nodes than to have larger nodes.
Also keep in mind for economic incentives that you are competing with pesky node operators like me, who use hardware that mostly wasn’t purchased for this purpose and devices that are online anyway. I have basically zero costs. So do you think you can run a business with the associated costs when you are competing against people who don’t have to worry about them? Yet without any costs, I’ve made about $3000 since March 2019. I’ll take it. But I would also have taken $1000. It’s basically free money anyway.
Furthermore, Storj currently pays more to node operators than they make running the service. This is fine as long as there is a runway and there is still a substantial runway left. But payouts have to come down at some point to make it long term viable.
When this happens, you will see ingress for nodes with remaining free space go up quickly. The situation kind of resolves itself at that point. It will become much more interesting to start new nodes: existing SNOs who can will expand capacity, and new node operators will jump on the now much more profitable bandwagon.
Gotta love classic market effects.
None of this is really relevant. Individual nodes don’t matter at all for reliability in the network. Nodes are always going to run on low powered hardware, because they can run just fine on it. Anyone spending more than they have to is making a bad financial decision.
What those larger companies need to know is that the reliability of individual nodes is not important. All nodes are held to certain standards and disqualified if they fall below them, such that the network can ensure file retention to 11 9’s even if the nodes themselves run on Raspberry Pis, consumer NAS systems, or whatever is available. That’s the whole point of Storj.
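A rough sketch of why individual node reliability barely matters: with k-of-n erasure coding, a file survives as long as any k of its n pieces do. The parameters below (80 pieces, any 29 reconstruct the file, a 10% chance a given piece is lost before repair replaces it) are assumptions for illustration, not Storj’s published numbers.

```python
from math import comb

def durability(n: int, k: int, p_loss: float) -> float:
    """P(at least k of n pieces survive), pieces failing independently with p_loss."""
    p_ok = 1 - p_loss
    return sum(comb(n, i) * p_ok**i * p_loss**(n - i) for i in range(k, n + 1))

# Assumed parameters: 80 pieces, any 29 reconstruct the file,
# 10% chance a piece is lost before the repair process replaces it.
d = durability(n=80, k=29, p_loss=0.10)
print(d)  # survives with overwhelming probability despite unreliable nodes
```

The binomial tail falls off so fast that even very lossy nodes leave the file far beyond 11 nines of durability, as long as repair keeps the piece count topped up.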
On a per-storagenode (/24 subnet) level, I’m seeing this increase mostly starting this month, meaning demand this month is outpacing node availability. While that is encouraging, I tend not to get too excited when it’s just a single month. More ingress attracts more SNOs and it may level out again. Keep ’em coming though. I’ll make sure to expand as needed as long as it’s profitable. Just this week I expanded my array with a 20TB HDD and started a new node on the 4TB HDD it replaced, in addition to a 3TB HDD my dad no longer needed. So far it has always been worth it to expand, even though I don’t buy most of the space I add. That 20TB was expensive… but paid for with Storj earnings. As long as I can keep doing that, I’ll make sure to have plenty of space available.
Yeah, this place is great. Most people love to share their experience and contribute to a better shared knowledge between node operators. Distributed knowledge on top of distributed storage. Gotta love it.
There are several startups trying to achieve that. The main problem is that the operator has unrestricted access to the customer’s processes running on their orchestrator.
So you cannot use it for business where information has value.
Storage is simple because it has a simple abstract API: put a file, get a file, list files, delete a file. VMs are much more complex; I’ve seen attempts to do this on consumer hardware since 2006, and none have worked so far. Customers would need to write software so that it runs regardless of architecture, and you usually can’t execute a single job in a distributed way like storage can with parity.
What could probably work is some prepackaged tools that need to work on bigger data blobs. I’d imagine a distributed data structure server similar to what Redis is. Maybe a distributed grep/sed? Maybe a distributed index into binary files, let’s say, based on some distributed hash table algorithm? Maybe a distributed log, or a locking mechanism based on Paxos or Raft?
It’s still a lot more work than just focusing on storage, but way easier to do than full VMs, safer for node operators, fitting into hardware that storage nodes already use, and still quite usable, for some customers more attractive than full VMs.
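As a taste of what such a prepackaged tool could look like, here’s a minimal consistent-hashing ring of the kind a distributed index or data-structure server would use to decide which node owns which key. This is purely a sketch; the node names and class are invented, nothing here comes from Storj.

```python
import hashlib
from bisect import bisect_right

class HashRing:
    """Minimal consistent-hashing ring mapping keys to node IDs."""

    def __init__(self, nodes, vnodes=50):
        # Each node gets `vnodes` virtual points on the ring for smoother balance.
        self.ring = sorted(
            (self._h(f"{n}:{i}"), n) for n in nodes for i in range(vnodes)
        )
        self.points = [p for p, _ in self.ring]

    @staticmethod
    def _h(s: str) -> int:
        return int.from_bytes(hashlib.sha256(s.encode()).digest()[:8], "big")

    def owner(self, key: str) -> str:
        """Node responsible for key: first ring point clockwise from hash(key)."""
        idx = bisect_right(self.points, self._h(key)) % len(self.points)
        return self.ring[idx][1]

ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.owner("customer/backup/chunk-0017"))
```

The nice property is that adding or removing a node only remaps the keys adjacent to its ring points, which matters when nodes churn as freely as they do in a home-operator network.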
Yeah, 5x even. I had one month when I made $500 with just 4TB stored (don’t get too excited, I doubt we’ll ever see something like that again).
That said I’m at 45 months in now and with current network behavior by that time you would still have made over $1800. And ingress does seem to be growing, so you can still make some good money.
Yeah, I managed to sell part of it much higher as well and probably more than tripled that $3K, but I’m not counting that. You could easily also lose, so I’m just going by the amounts that were paid out.
I understand how a platform/business can run on such a model. Actually, it’s a brilliant idea: recovering people’s sunk costs in unused capacity. Quality control is handled at the level of the network, not the nodes. Most of the reliability (or unreliability) of the nodes is already accounted for and priced in. New investment in dedicated hardware just can’t beat that price.
If true, then this is the answer. Thank you for enlightening me.
That’s the idea. We suggest using only hardware that is online anyway, so all costs are already paid and literally any income is pure profit.
Quality is controlled at the network level: if some nodes do not meet the standards, they are disqualified, and their data is recovered to other, more reliable nodes using Reed-Solomon erasure codes.
So investments are not required; however, it’s your decision anyway, just account for all costs.
Reliability and performance are also factors. I’m a (potential) customer, and while the price is attractive I’m still using B2 for my main storage.
I uploaded a few GB of test data some moons ago and went to download it now. Initially it very nearly maxed out my 850Mb/s internet connection, but the transfer dropped to well below 100Mb/s for a while, spiked back to 300Mb/s for a bit, and then fell below 50Mb/s and stayed there.
I ran it a few more times; it is reasonably consistent from run to run, right around 84Mb/s, so let’s use that.
Pulling the same block of test data from B2 has a nice vaguely even rate of 500Mb/s across the entire set.
In a backup / disaster recovery scenario, the difference between being able to restore my 43.2TB at 500Mb/s (a bit over a week) vs 84Mb/s (52 days) is the difference between “okay, that sucked” and hiring a bankruptcy trustee. The monthly cost, $172.80/month vs $221/month, doesn’t even matter.
Back-of-the-napkin math: if the only consideration were the amount of time I’d be paying employees to sit around waiting, my break-even is a disaster every 15 years per employee. Obviously that isn’t all that matters, as most businesses cannot just stop serving customers for 2 months and then bounce back, so the long-term impact would be far more severe.
I’m ignoring transaction costs, download costs, everything else because it is all a rounding error.
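Spelling out the restore-time arithmetic (this assumes the 43.2TB is tebibytes and ideal sustained throughput, which reproduces the figures above):

```python
def restore_days(data_tib: float, mbit_per_s: float) -> float:
    """Days to transfer data_tib tebibytes at a sustained link rate in Mb/s."""
    bits = data_tib * 2**40 * 8          # TiB -> bits
    seconds = bits / (mbit_per_s * 1e6)  # Mb/s -> bits per second
    return seconds / 86_400              # seconds -> days

print(round(restore_days(43.2, 500), 1))  # ~8.8 days: "a bit over a week"
print(round(restore_days(43.2, 84), 1))   # ~52 days
```

The throughput term dominates everything else in the bill, which is the whole point: a 6× slower restore turns a rough week into nearly two months.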
Are backups the only use case for Storj? Probably not. But it is my main use for “dumb” cloud storage, so that’s what I’m using as a starting point.
43.2TB isn’t a massive number either, feel free to multiply everything by whatever figure you like, it only makes the problem worse (either the restore time gets astronomical, or the savings drop to the point that it isn’t worth the time it took to do the math).
I like the concept of Storj, and I’m considering moving a replica of my personal backup just to support the concept (and save a few dollars) since restore speed is less relevant.
EDIT: Fixed a math error inline; see comments below for what I got wrong.