The held-back fees are described in the T&Cs as a percentage of the Storage Node Fees accrued during defined periods of the node's life, covering the first nine months on a quarterly sliding scale. The logic is that the sum accrued will cover the cost of repairing the data lost if the node goes offline.
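As a sketch, that quarterly sliding scale can be written as a simple step function. The 75/50/25% steps and the flat $10/month income are my own illustrative assumptions; the T&Cs as quoted only say the scale is quarterly over the first nine months.

```python
def held_back_fraction(month):
    """Fraction of a node's monthly Storage Node Fees withheld.

    Assumes a quarterly sliding scale over the first nine months of a
    node's life; the 75/50/25% steps are illustrative, not from the T&Cs.
    """
    if month <= 3:
        return 0.75
    if month <= 6:
        return 0.50
    if month <= 9:
        return 0.25
    return 0.0

# Total withheld from a node earning a flat $10/month over 15 months;
# months 10 and later contribute nothing.
total_held = sum(10 * held_back_fraction(m) for m in range(1, 16))
print(total_held)
```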
From what I can tell, this method of calculating the charge may be simple to code, but it is very arbitrary in nature, for the following reasons:
All the examples I have found have the fee charged on the total monthly income, so it includes the Egress Revenue (and possibly the Repair Revenue). The fee is by definition a Storage Revenue fee and has nothing to do with the other revenue streams.
As described, the fee can be gamed by adding storage to an already active node. If I were to take my current 1TB test node and increase the space to 10TB in a few months, the current fee structure would not build up the correct balance, as it accrues based on the life of a node and not its used capacity.
The amount accrued over the nine months is also rather arbitrary, as it depends on how fast data is pushed to a node. All the examples I have found show it taking months for a node to be filled. My 1TB test system took days, and so will end up with a higher overall amount held than a system that took 2-3 months to fill.
As described, it does not correctly handle any reduction in the amount of data held on a node. The ability to do this is currently limited (but possible), and it seems it will be far easier in the future.
Even the Graceful Exit release process is rather arbitrary, as the description indicates it has to be 100% complete to release the outstanding funds. Surely it should pay out based on the percentage of data that has been correctly handled.
None of these is a show stopper for a start-up business with 1.X processes and code in place, but in time they seem to need a revision.
I don't think the held-back amount should be considered directly related to paying for needed repairs. It is partially used for that, but that is just one of a few reasons it exists.
First of all, I think it is good to know that it really doesn't cover the repair costs in almost all situations; it's usually simply not enough to pay for repairing the data. But that's good for both sides, because the SNO and Storjlabs both have an incentive to keep reliable nodes on the network. You sometimes see people suggesting that Storjlabs would rather just disqualify a node so as not to have to pay out the held-back amount. This is clearly not the case, as that would just be more expensive for them.
Most importantly, the held-back amount is there to incentivise node operators to run reliable nodes long term. It ensures they don't mess with their node and take good care of it. If there were no build-up before you start making money, and no loss if you mess up, SNOs would be much more careless.
So yes, the amounts that actually end up held back are rather arbitrary, and even in the scenario where the highest amounts are held back, they aren't enough to pay for the repairs. But this doesn't matter much, as that is only the secondary function of why it is there in the first place.
As for graceful exit, you need to successfully transfer 90% of pieces, not 100%. The reason for this is mostly because graceful exit should not be a viable option for a node to avoid disqualification. If enough data was lost to cause graceful exit to fail, your node would have been disqualified soon anyway. And since the held amount is already not enough to cover repairs should that happen, paying back part of it simply doesn’t make sense. That node would have broken the agreed upon service and should have to pay the consequences.
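A minimal sketch of that success condition (the function and parameter names are my assumptions; only the 90% figure comes from the post above):

```python
def graceful_exit_succeeds(pieces_transferred: int, pieces_total: int,
                           threshold: float = 0.90) -> bool:
    """True if enough pieces were transferred for the exit to count.

    Only the 90% threshold is from the discussion; names are illustrative.
    """
    return pieces_transferred >= threshold * pieces_total

print(graceful_exit_succeeds(89, 100))  # False: just under the bar
print(graceful_exit_succeeds(90, 100))  # True
```

A partial payout below the threshold is deliberately absent, matching the reasoning above: a node that far gone would have been disqualified anyway.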
It's good to hear there is just a 90% requirement for the graceful exit process. This detail has not made it into the T&Cs, where instead it is just described in 18.104.22.168 as "complete a Graceful Exit" and in 4.1.8 as "Storage Materials stored on the Storage Node shall be uploaded to the Storage Services prior to the time when the Storage Node is permanently disconnected". Please note these are quoted a bit out of context; otherwise I would be copying a lot of text. The main thing is that the T&Cs imply a completion of the process, without any additional detail.
I've no issue with a business retaining fees to at least cover part of the costs incurred if I were to leave without a graceful exit. It would just be better if the fees were better structured and defined.
I've worked on projects where I helped create kill/FUD lists to use against competitors, and in turn I have had to defend against such lists. The current fee structure is just so open to marketing attacks, and could be made so much clearer and more fit for purpose in the future.
I'm not sure about the "it really doesn't cover the repair costs" part, as the only time that would be true is if stripe blocks were regenerated every time they were lost from the system. If the erasure coding scheme is (30, 80), then for every 1TB of lost data, 30TB would have to be read to allow regeneration of the lost stripe blocks, at a cost of $300 for repair data. The thing is that the system is not going to try to maintain a perfect 80 blocks per stripe; instead, it will wait until the number of available blocks for a stripe drops to a set minimum, say 45 blocks **, and then regenerate the complete set of 80. This means that 35 nodes can drop off the network before regeneration starts for any one stripe, so the repair-data cost of the regeneration is spread across all 35 nodes that have been lost from the system.
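That amortisation argument works out as follows, using the post's own (30, 80) scheme, 45-block trigger, and an implied $10/TB repair-egress price (all three are illustrative numbers, not confirmed network parameters):

```python
k, n = 30, 80            # erasure coding: any 30 of 80 blocks rebuild a stripe
repair_trigger = 45      # assumed minimum available blocks before repair fires
egress_per_tb = 10.0     # assumed repair-data price in $/TB (gives the $300 figure)

# Repairing one stripe reads k blocks, but it regenerates everything lost
# by the (n - repair_trigger) departed nodes at once, so the read cost is
# shared across all of them rather than paid per lost node.
lost_nodes = n - repair_trigger                      # 35 nodes can leave first
naive_cost_per_tb = k * egress_per_tb                # $300: repair on every loss
shared_cost_per_tb = k / lost_nodes * egress_per_tb  # cost per TB actually lost

print(naive_cost_per_tb, round(shared_cost_per_tb, 2))
```

So under these assumptions the effective repair cost is well under a tenth of the naive per-loss figure, which is the core of the point above.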
** I can't find a reference to how many blocks the system considers the minimum at the moment; instead, I have used the example number from table 7.2 in
It may be that after modelling and real-life data the required minimum number of blocks has been lowered or increased.
Thanks, that validates my view/numbers for the node side of the process, and may even indicate that they are operating with a 35-block minimum rather than the 45-block minimum I used.
The write-up you link to indicates that the real cost comes from the operational design decisions that were in place at least at the time of its posting. Google Cloud offers many features, but low-cost egress traffic has never been one of them. Even after this year's price reductions, they are still priced in such a way as to make high-traffic generators find another provider. This can be seen in the write-up, as the node in Hetzner/Germany was paying less than 2% of what the nodes deployed at Google paid for egress traffic.
The last numbers I've seen showed repair being triggered at 55 pieces left. I think this number moves around a bit. But current held-back amounts have been higher than they would usually be due to surge payouts. A node with a lot of space and good bandwidth would usually have at best $50 held back, of which half would be paid back after 15 months. That's normally not going to be enough to pay for the repairs.
It's that part of the current 'process' that makes the whole thing so variable, as the end sum is not based on a repeatable calculation, just on the average way things have so far played out. Take these two examples:
I have a 16TB node (Pi + external drive), but restricted bandwidth, so Storj is only going to use part of the storage space. After the first year I increase my bandwidth, and Storj will now use the additional capacity.
I have a 1TB node (Pi + external drive) and lots of bandwidth, so Storj will use all the space. After a year I switch to a 16TB drive by migrating the current system, and Storj will now use the additional capacity.
In both cases the current escrow process does not capture any funds to cover the additional data placed on the node, as it is based purely on the capacity used during the first nine months of its life.
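The second example can be sketched like this, with an assumed 75/50/25% quarterly hold-back scale and an illustrative $1.50/TB-month storage rate (neither figure comes from this thread); it shows the escrow total never reflecting the post-upgrade capacity:

```python
RATE = 1.5                                          # assumed $/TB-month storage revenue
SCALE = [0.75]*3 + [0.50]*3 + [0.25]*3 + [0.0]*15   # assumed scale, months 1..24

# Node holds 1 TB for a year, then 16 TB after a drive upgrade.
capacity_tb = [1]*12 + [16]*12
escrow = sum(cap * RATE * pct for cap, pct in zip(capacity_tb, SCALE))
print(escrow)  # fixed by the first 9 months; the later 16 TB adds nothing
```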
Even the speed at which Storj places data on a node vastly changes the amount held in escrow. If a 16TB node fills up within the first month, the final amount in escrow is far higher than if the drive fills up over, say, eight months.
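That fill-speed effect can be sketched the same way, again assuming a 75/50/25% quarterly scale and $1.50/TB-month (both illustrative numbers of my own):

```python
RATE = 1.5                                  # assumed $/TB-month storage revenue
SCALE = [0.75]*3 + [0.50]*3 + [0.25]*3      # assumed quarterly hold-back scale

def escrow_total(stored_tb_by_month):
    """Escrow accrued over the nine hold-back months for a fill curve."""
    return sum(tb * RATE * pct for tb, pct in zip(stored_tb_by_month, SCALE))

fast = escrow_total([16]*9)                            # full in month one
slow = escrow_total([2, 4, 6, 8, 10, 12, 14, 16, 16])  # ramps over ~8 months
print(fast, slow)
```

Same node, same data at month nine, yet roughly double the escrow for the fast fill.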
I am not trying to say that the current system is wrong or broken in any way. It's more that it is imprecise and does not really meet the stated aim of compensating the platform for the loss of a node. The real problem will show up in time if nodes start to be run on NAS-based units where capacity can be vastly increased over time. The escrow amount will be based on the early capacity, while the active capacity could be many times greater. Do you think anyone with a 50TB NAS node, but maybe just $100 in long-term escrow, will bother with the graceful exit process in 2-3 years' time if they need to bring the system offline?
I would think that was pretty self-explanatory. And yes, that means it will have much less held back in the scenarios you described.
I based this on the calculations in my alternative earnings estimator, which is based on real-world traffic. That may change, but this is what it currently looks like. No node is filling up 16TB in one month, and that's highly unlikely ever to happen, as it would mean the network was always on the verge of running out of space.
Every business takes on risk and Storjlabs can afford to take on this risk if the majority of nodes are reliable and make money.
You may be right that a 50TB node won't bother to get back about $25. I'm not even counting on the held-back amount being $100 at that point. But consider how long it would have taken to get that node to the point where it hosts 50TB, and how much money Storjlabs would already have made from it. It doesn't necessarily mean each individual node's held-back amount pays for that node's repairs. It's held back plus earnings, evened out over all nodes. Some nodes may cause a small loss, but the majority will make a profit. Almost all businesses have loss leaders like that.