BTW, the discussion is not helped by the fact that given the recent ratio between vetted and unvetted nodes, the original goal of reducing the amount of data stored on new nodes would now be achieved by disabling the vetting process. A case where implementation does not fulfill intentions.
You are exactly right, Vadim, and the conversation around GE (graceful exit) and its problems seems to have fallen on deaf ears at Storj.
I’ve done GE for 160 satellites and it’s only worked less than 30% of the time.
Also, vetting 500 nodes on a single IP takes forever and is a waste of time, imho…
Basically, the vetting time can be calculated…
Let’s say vetting a single node takes 1 week…
Sure, it might not really, but it does for the highly active satellites…
So it would take 500 weeks for 500 nodes behind one IP to each get the same number of audits and be vetted,
because vetting is based on audits, and audits depend on stored data, which those nodes have to share.
Thus the only effect this has is to reduce held amounts and ensure 100% payouts when the nodes are activated…
Is that cheating… maybe… it sure makes the held amount even more irrelevant than it was in the first place… held amounts worked pretty okay in the past… around the end of the testnet.
But now held amounts are a joke.
One example: I’ve got a 6.5 TB node that has $4.89 in held amount and $5.31 already paid…
To GE this node I would have to upload 6.5 TB, or most of it… let’s say just half of it even…
So 3.25 TB uploaded and I get paid $4.89 for that…
That’s less than 1/3 of the current egress prices.
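To put a number on that comparison, here is a small Python sketch that uses only the figures quoted above. The variable names are made up for illustration, and the 50% transfer fraction is just the rough guess from the post, not a measured value.

```python
# Figures quoted in the post above; names are illustrative only.
held_amount_usd = 4.89       # held amount a successful GE would release
node_size_tb = 6.5           # data currently stored on the node
fraction_transferred = 0.5   # assumption: "let's say just half of it"

transferred_tb = node_size_tb * fraction_transferred
usd_per_tb = held_amount_usd / transferred_tb

# The "less than 1/3 of egress" claim holds whenever the egress payout
# per TB is above three times the effective GE rate.
implied_min_egress_usd_per_tb = 3 * usd_per_tb

print(f"{transferred_tb:.2f} TB transferred for ${held_amount_usd:.2f}")
print(f"effective GE rate: ${usd_per_tb:.2f} per TB")
print(f"claim holds if egress pays more than ${implied_min_egress_usd_per_tb:.2f} per TB")
```

On these numbers the GE works out to roughly $1.50 per TB moved. The actual egress payout is not stated in this post, so the last line only shows the threshold above which the comparison holds.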
Last time I did a GE it took 60 days to finish, and I ended up having to restructure my network to better handle the load.
So why Th3Van even bothers with doing this is a mystery to me.
Seems like a waste of time for little to no reward.
If you read a bit more into the story, he had them on VPS systems and then moved them behind one IP, so likely they are already vetted.
Completely a joke.
Started the first backup node about ~85 weeks ago.
It’s true, but we can marginalize it.
Because if you’re a whale and got decommissioned, 5-year-old ex-server HDDs (which I don’t believe are free, but even if they are), you can sell them or you can use them with STORJ.
Those are most often enterprise-grade HDDs; they can easily work another 5 to 7 years.
Normal people haven’t got this advantage.
Normal people have average consumer HDDs, which are a lot less durable by design.
You have to earn enough in the project for the value of the HDD at minimum, before it dies.
It can take 2-3 years of node operation just to cover costs. Whales can afford that.
Normal home users can’t. So the current model discourages normal people from joining STORJ, because the reward is too low. And whales still find ways to operate and profit.
This is bad for node decentralization if it makes no sense for you to join the network with the one hard drive you have, say 6 TB or 4 TB.
That’s why I proposed a concept where normal people with average HDDs could get a return in under 1 year, or even in months! Whales will still be present, but this will raise the number of home SNOs joining and increase decentralization at the same time.
A concept where downloading content is very cheap and nodes STILL earn more than now, and STORJ Inc. earns more than now! And STORJ will stop being just cold storage!
It can make STORJ a global revolution, not just a curiosity that people are afraid to trust. If the price is low enough, the temptation is just too strong. And I’ve shown that it can work for everyone, as long as customers agree.
It has more to do with how much money you have and how much access you have to datacenter hardware; then it takes a bit of technical skill to be able to set up more than 500 nodes and a memory to keep track of them all. It’s not really about being butt hurt, it’s more about where Storj is headed. We’ve already gotten a price decrease because of how many whales there are, and this will eventually push out any kind of decentralization that Storj has portrayed to everyone. It’s gonna be whales against other whales, and all data will be in 10 datacenters around the world instead of spread across every country. Wherever there is money there will be a whale and a datacenter attached to it. If not a datacenter, then people will be buying IPs, but all data will be in one place.
And yes, at this point Storj wouldn’t survive without whales, because it takes money, and lots of it, to host this many servers around the world, and the whales are footing the bill for Storj, so why would they try to stop it?
No it’s not, since no software was rewritten and there was no attempt to interfere with the network. Although, I feel with you, as I felt the same about using VPNs to increase the storage size. This one is kind of the opposite. These are loopholes of a sort, which are not quite aligned with the spirit of STORJ.
Incubating drives especially undermines the idea of the vetting process, whose many pros and cons have already been discussed here. And from previous posts, I also understood some other fellas are doing comparable things (incubating large drives, but only offering a small part so the held back amount is low; increasing the drive size later).
On the one hand, @Th3Van is serving the network very well, renting out storage. Since he’s running a data center, I am tempted to believe it’s professional, so the data is stored more robustly than many small operators manage (including RAID 6, although I’m not a fan of it, or in any case some kind of redundancy with a chance of recovery, instead of RAID0, which I at least use).
On the other hand, if they were behind one public IP, this would mean that of the 3PB he’s offering only a meager 40-100TB would be used, since by then there would be an equilibrium between uploads and deletions. But if I look at the overview, that’s not what happened, and also not the case: th3van.dk literally says there are 3-4 SNs behind one /24 subnet. Meaning, all these storage nodes that were down were treated like 125 unique storage nodes before (107 if I check the overview).
The real question in my opinion should be: can this be a threat to the network as a whole? Then there’s some math involved in that question, which I did before.
Primary data is:
- Ca. 21000 SNs
- Ca. 12000 unique /24-subnets
- This fella is using 107 unique subnets.
For every piece offered on the network:
- Chance he will get it: 107/12000 = 0.9% (probably a bit higher, being in Amsterdam and having a good connection; but taking this for calculation).
Distributing 80 pieces, as is the case on the STORJ network:
- Chance he will get no piece:
((12000-107)/12000)^80 = 48.8%
- Chance he will get one piece:
C(80,1)*(107/12000)^1*((12000-107)/12000)^79 = 35.2%
- Chance he will get two pieces:
C(80,2)*(107/12000)^2*((12000-107)/12000)^78 = 12.5%
- Chance he will get three pieces:
C(80,3)*(107/12000)^3*((12000-107)/12000)^77 = 2.9%
- Chance he will get four pieces:
C(80,4)*(107/12000)^4*((12000-107)/12000)^76 = 0.5%
- Chance he will get five pieces:
C(80,5)*(107/12000)^5*((12000-107)/12000)^75 = 0.1%
- Chance he will get six pieces:
C(80,6)*(107/12000)^6*((12000-107)/12000)^74 = 0.01%
- Chance he will get more than six pieces:
100%-[previous numbers] = 0.00081% (=1/123000).
This means the chance of six or more pieces becoming useless redundancy-wise is practically nil. Even if they were all divided over a network with whales like him, each with an independent uptime of 80+%, it would be nearly impossible to reach the point where 59 pieces were unavailable. This is actually also a plea to lower the N-number of the Reed-Solomon scheme used by the STORJ network at the moment.
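For anyone who wants to re-run the numbers, the calculation above is a plain binomial distribution, and a minimal Python sketch could look like this. It makes the same simplifying assumptions as the post (each of the 80 pieces lands independently on one of about 12000 /24 subnets with equal probability); the function and variable names are just for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_pieces = 80               # pieces per upload, as stated in the post
p_operator = 107 / 12000    # operator's share of unique /24 subnets

# Probability that the operator holds exactly 0..6 pieces.
pmf = [binom_pmf(k, n_pieces, p_operator) for k in range(7)]
for k, prob in enumerate(pmf):
    print(f"P(exactly {k} pieces) = {prob:.4%}")

# Everything that is left is the chance of holding more than six pieces.
tail = 1 - sum(pmf)
print(f"P(more than 6 pieces) = {tail:.5%}")
```

Swapping in other values for `p_operator` shows how quickly the tail grows for an operator with a larger share of subnets.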
Storj is also dealing with the possibility that all Russia’s nodes may at some point be cut off.
Yeah, so? Then why 80, of which 51 are redundancy duplicates?
Even then, most data is also from Russian customers in that case.
If you can hand over better metrics to explain why, then I’m more than interested. But I actually don’t see why so many duplicates are necessary.
This is absolutely NOT the case.
You don’t need this advantage, and honestly it probably wouldn’t be much of an advantage anyway, as drives that old won’t have the capacity / efficiency of newer drives, plus you’d have drives dying left and right (relatively speaking). Personally I would rather buy brand new helium-filled, power-efficient, high-capacity drives that will be reliable long term… and yes, at scale it IS profitable, even with the expense of purchasing multiple IPs. This is quite literally the only way running Storj nodes is profitable. At least to the point where it’s actually worth your time messing around with. And let’s face it, if there’s no profit in it, nobody (with few exceptions of course) other than those hoping for that potential someday (most of whom don’t actually understand the economics of it in the first place) will run nodes, and Storj would evaporate into thin air.
You can’t expect Storj (or any other similar platform) to pay you more than a “whale” simply because you’re smaller and have higher operating costs. This simply won’t happen.
Alright, some data to back that up?
I mean, as far as I know, we’re all in the dark concerning the reasons why.
As far as I know, 1625 of 12441 /24-subnets are Russian, consisting of 2774 separate nodes.
So, the metrics again:
Distributing 80 pieces, as is the case on the STORJ network (assuming equal distribution):
- Chance Russia will get no piece:
((12441-1625)/12441)^80 = 0.001%
- Chance Russia will get one piece:
C(80,1)*(1625/12441)^1*((12441-1625)/12441)^79 = 0.016%
- Chance Russia will get two pieces:
C(80,2)*(1625/12441)^2*((12441-1625)/12441)^78 = 0.098%
I did it in Excel, showing this:
| Amount of pieces | Chance | Cumulative (<= N) |
|---|---|---|
| 0 | 0.001% | 0.001% |
| 1 | 0.016% | 0.018% |
| 2 | 0.098% | 0.116% |
| 3 | 0.382% | 0.497% |
| 4 | 1.104% | 1.602% |
| 5 | 2.522% | 4.124% |
| 6 | 4.737% | 8.861% |
| 7 | 7.523% | 16.384% |
| 8 | 10.314% | 26.698% |
| 9 | 12.396% | 39.094% |
| 10 | 13.223% | 52.317% |
| 11 | 12.642% | 64.960% |
| 12 | 10.922% | 75.881% |
| 13 | 8.583% | 84.464% |
| 14 | 6.171% | 90.635% |
| 15 | 4.080% | 94.715% |
| 16 | 2.490% | 97.205% |
| 17 | 1.408% | 98.613% |
| 18 | 0.741% | 99.354% |
| 19 | 0.363% | 99.717% |
| 20 | 0.166% | 99.883% |
| 21 | 0.071% | 99.955% |
| 22 | 0.029% | 99.983% |
| 23 | 0.011% | 99.994% |
| 24 | 0.004% | 99.998% |
| 25 | 0.001% | 99.999% |
| 26 to 51 (each) | 0.000% | 100.000% |
As you can see, this risk is covered with 25 pieces… Maybe some more, for data of Russian customers, who aren’t allowed to stay on Storj anyway in case of a cut-off. So again, please provide some metrics to cover the matter…
For example, if you would have taken N=60 for K=29 in the Reed-Solomon scheme:
| Amount of pieces | Chance | Cumulative (<= N) |
|---|---|---|
| 0 | 0.023% | 0.023% |
| 1 | 0.203% | 0.226% |
| 2 | 0.900% | 1.126% |
| 3 | 2.614% | 3.740% |
| 4 | 5.597% | 9.336% |
| 5 | 9.417% | 18.754% |
| 6 | 12.970% | 31.723% |
| 7 | 15.032% | 46.755% |
| 8 | 14.962% | 61.717% |
| 9 | 12.988% | 74.704% |
| 10 | 9.951% | 84.656% |
| 11 | 6.796% | 91.452% |
| 12 | 4.169% | 95.621% |
| 13 | 2.313% | 97.934% |
| 14 | 1.167% | 99.100% |
| 15 | 0.537% | 99.638% |
| 16 | 0.227% | 99.865% |
| 17 | 0.088% | 99.953% |
| 18 | 0.032% | 99.985% |
| 19 | 0.011% | 99.996% |
| 20 | 0.003% | 99.999% |
| 21 | 0.001% | 100.000% |
| 22 to 31 (each) | 0.000% | 100.000% |
You see, even 20 would be sufficient in this case… (because of the lower chance of pieces ending up in Russia). And you would have 11 of them left for other occurrences, like loss and downtime of other nodes.
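The Excel tables can also be reproduced with a few lines of Python. This is only a sketch under the same assumptions as above (independent, equally likely piece placement over 12441 subnets, 1625 of them Russian). The helper names and the 99.99% coverage target are picked purely for illustration; they are not a Storj parameter.

```python
from math import comb

def binom_table(n, p):
    """Rows of (k, P(X = k), P(X <= k)) for k = 0..n."""
    rows, cumulative = [], 0.0
    for k in range(n + 1):
        prob = comb(n, k) * p**k * (1 - p) ** (n - k)
        cumulative += prob
        rows.append((k, prob, cumulative))
    return rows

def pieces_to_cover(rows, coverage):
    """Smallest k whose cumulative probability reaches the coverage target."""
    for k, _, cumulative in rows:
        if cumulative >= coverage:
            return k
    return rows[-1][0]

p_russia = 1625 / 12441     # share of /24 subnets, from the post
coverage_target = 0.9999    # illustrative target, not a Storj figure

rows_80 = binom_table(80, p_russia)   # current N = 80
rows_60 = binom_table(60, p_russia)   # hypothetical N = 60
k_80 = pieces_to_cover(rows_80, coverage_target)
k_60 = pieces_to_cover(rows_60, coverage_target)

print(f"N=80: at most {k_80} pieces in Russia with {coverage_target:.2%} probability")
print(f"N=60: at most {k_60} pieces in Russia with {coverage_target:.2%} probability")
```

Printing `rows_80` or `rows_60` row by row gives the same three columns as the tables, so a different coverage target or subnet count can be checked without a spreadsheet.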
If even all of Russia carries such small probabilities, then why is there so much talk about datacenters that have 100 nodes on 100 IPs? For them the probability of even 2 pieces is almost nothing, or even nothing. Then it looks more like people are envious that they can’t do such setups and therefore try to make sure that no one can.
All the hardware I’m using for Storj nodes is placed in our DC, which is equipped with a cooling system, an automatic diesel generator, a fire extinguishing system and multi-ISP uplinks (BGP AS49974).
I think you are mixing my Primary nodes and my Backup nodes together :
Primary storage nodes : (Running on dedicated hardware - list can be found at www.th3van.dk)
- Number of nodes : 105 (001 and 032 are not present since they do not exist)
- Number of IPs (/24 subnets) in use for the 105 storage nodes : 30 (~3.5 nodes per subnet)
- Number of SAS HDDs for SN data : 105 (no RAID - one HDD per storage node)
- Total available space for SN : 1,986 TB
- Used space for SN : 1,191 TB
- Free available space for SN : 795 TB
- First node joined : 13-07-2021
Backup storage nodes : (Running on other dedicated hardware - but were down for a couple of hours, a few days ago)
- Number of nodes : 501
- Number of IPs (/24 subnets) in use for the 501 storage nodes : 1
- Number of SATA SSDs for SN data : 8 (Samsung 4 TB in RAID 6)
- Total available space for SN : 24 TB
- Used space for SN : 17 TB
- Free available space for SN : 8 TB
- First node joined : 22-12-2021
I think 10 is a little low of an estimate there, even if you’re only accounting for datacenters… however, how is this not still decentralized relative to typical cloud infrastructure? Look at what’s happened to crypto mining in general. It’s become centralized in the sense that it’s centralized around people with money to invest, but still decentralized by nature among them and around the world. This has been fought over and over again by using different methods, largely using different algorithms to resist ASICs, but even so it was always an inevitability. This is just simply how the world works.
As I’ve said before, the answer to the whale issue is simply more whales… but if you don’t have the technical skills to run something at that scale you simply shouldn’t be doing it… especially those of you who seem to be so concerned with the risks whales supposedly pose to the network… risks of which are already factored into Storj’s model.
One way to look at it as far as I’m concerned, is that Storj (maybe on purpose, maybe accidentally) has done a pretty good job of weeding out those with the technical skills to run scaled setups from those who might otherwise acquire 100’s of TBs of data before ever running into even a minor issue they can’t resolve on their own. And trust me, scaled setups definitely see more issues.
Furthermore, whales are actually financially incentivized to keep their servers running, and there’s a heavy cost associated with any downtime. If this wasn’t the case, there would be no whales in the first place. This is also largely the reason behind “node incubation”. If a whale happens to lose a node for whatever reason, they want to replace it as quickly as possible with a node that’s ready to go, in order to utilize the hardware, maintain their efficiency and minimize losses.
In the case of whales, the vetting process is essentially pointless as there’s already a pretty high probability those nodes won’t be going offline anyway, but clearly it doesn’t hinder anything either so there’s really no point in discussing it.
On the other hand, you’d be better off arguing about node incubation in terms of the held amount which would actually carry some weight. However if Storj sees this as an issue, the simple solution would be to forego the current holding model for one that simply holds a specific amount proportional to the repair cost relative to the size of the node that adjusts over time as opposed to an ambiguous figure based on node performance during a predetermined period of time. End of discussion.
Even Amazon has way more datacenters; do we consider them decentralized?
Again, this doesn’t require technical skills, it requires money, lots and lots of money.
I built my setup with a very small investment, most of it paid by Storj over time; I just expand step by step.
You just have to know or want to learn how to do it.
Ok, then you’re using a total of 106 /24-subnets. Doesn’t matter that much for the story, I would say.
But there are actually three things I don’t understand:
- 3PB was down, which you attributed to the 501 nodes being down, but these figures don’t seem to add up, since that would mean 6TB/node instead of the 24TB in total stated here.
- What’s your plan with them? 501 is quite many…
- Since they’re all behind one IP, you should be at the equilibrium where deletions balance ingress. But that doesn’t show directly from your data. Or did I overlook something, like older data being deleted less?
Still, I have to say, it’s a clever approach. Especially if you started them all back when you still got an application fee.
That’s exactly the point I’m trying to make, as I also was when we had the same kind of discussion about the use of VPNs. They aren’t that bad after all from the perspective of the network. But because they need to increase the redundancy by one or two pieces, it feels a bit unfair to those who don’t have the skills, aren’t clever, or both, and who are being paid a little bit less because of it (less ingress means fewer STORJ tokens, more redundancy means less payment per piece).
They have their own redundancy, but it’s a single company in control of it… so no.
Yes it does require money. But you can’t tell me hosting 4 TB on a Raspberry Pi with an external hard drive is the same as hosting 100’s + across many servers, maintaining the equipment, managing technical issues, incorporating UPS systems, backup generators etc as many whales do. It’s not at all the same thing.