Update Proposal for Storage Node Operators

heunland · March 13, 2023, 10:58pm

Please post questions for the Twitter Spaces in the aforementioned thread. That way we know who asked each question and can follow up properly, and you will get tagged with an answer if it is posted on the forum rather than, or in addition to, being answered live. It may also be important for others to be able to refer to the answer easily. (I am not a mod, so I can't move the post for you, sorry.)

MattJE96011 · March 13, 2023, 11:05pm

Thanks for the heads up. I’m sure @Alexey will take care of it. He’s always catching me sticking things in inappropriate places.

Knowledge · March 13, 2023, 11:28pm

I took care of it. Lots of questions for this Twitter meeting.

heunland · March 14, 2023, 1:36am

@MattJE96011 An answer to your question has already been posted by our VP of Finance here.

Alexey · March 14, 2023, 4:05am

They are clustered already, and they are not a bottleneck: data is transferred directly between customers and nodes when customers use a native integration, and between gateways and nodes if the customer uses gateways. So perhaps you mean clustering for the gateways. This is already done too.

Alexey · March 14, 2023, 4:18am

Hello @weyanz,
Welcome to the forum!

Your node will not have the same latency for each customer. This metric is hard to confirm or measure. Measuring the latency between satellites and nodes doesn't make sense, because data is transferred between the customer and the node. So we would need to collect reports from the customers instead (and we do, but only for used storage and egress, not for time spent). However, this will vary even for the same customer depending on their circumstances, such as how heavily their bandwidth is loaded, so this metric will likely never be confirmed.

snorkel · March 14, 2023, 6:37am

After reading many posts, I see that we still don't understand how the Storj network is organised, how all the parts work, what their purpose is, who communicates with whom, what services there are, what is paid for and what is free, what Storj is paying for, what servers it has, etc. I didn't read all the docs and whitepapers, so maybe the info is out there, but I see some confusion even from the smarter members.
Can someone make a For Dummies guide to the Storj network? To clarify all these things and to easily refer others to it?
Thanks!

jakesteele · March 14, 2023, 8:01am

Trying to calculate this out.

I have 8x 12 TB IronWolf NAS drives in RAID 6, so that's:

7.8 W per drive x 8 @ $0.0950 per kWh = $0.19 per day for electricity.
To add unlimited egress on my home internet = $20.00 (The 1 TB included is being used by Plex and a seed box; I'm using that to count the rest of my electricity as net zero.)

Roughly $5.70 per month electricity for the hard drives.

Total additional monthly cost of my setup is $25.60 per month.
(Not including server costs, server electrical costs etc. Since it's a 24/7 seed box/Plex server, I'll net zero that part.)

I have the 2.5G Up 2.5G Down WAN

At $1 per TB and $4 egress, I'd need to have at least 21 TB filled and 1 TB of egress per month, with nothing held back, before I break even on the additional costs of running Storj. To be fair, that's not counting the wear and tear on my HDDs or the increased computer electricity.

To be fair to the project, if I look purely at electrical costs, I would be profitable with a much lower amount stored. But you have to consider I'd need to hold back around $260 over 5 years to replace a drive if needed.
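The break-even arithmetic above can be sketched as follows. This is only an illustration using the figures from the post; the 30-day month and reading the $4 egress rate as per TB are assumptions. Taking the listed drive wattage at face value gives roughly $0.14 per day rather than the $0.19 quoted, so the posted figure may include overhead that is not listed, and the break-even point comes out a little over 20 TB stored at 1 TB of egress.

```python
# Inputs copied from the post. DAYS_PER_MONTH and treating the $4 egress
# rate as "per TB" are assumptions made for this sketch.
DRIVES = 8
WATTS_PER_DRIVE = 7.8
PRICE_PER_KWH = 0.0950
UNLIMITED_EGRESS_FEE = 20.00   # monthly ISP add-on, in dollars
STORAGE_RATE_PER_TB = 1.00     # dollars per TB stored per month
EGRESS_RATE_PER_TB = 4.00      # dollars per TB of egress (assumed per TB)
DRIVE_RESERVE = 260.00         # replacement fund, in dollars
RESERVE_MONTHS = 5 * 12        # spread over 5 years
DAYS_PER_MONTH = 30            # assumption


def daily_electricity_cost(drives, watts_per_drive, price_per_kwh):
    """Cost in dollars of running the drives for 24 hours."""
    kwh_per_day = drives * watts_per_drive * 24 / 1000
    return kwh_per_day * price_per_kwh


def break_even_stored_tb(monthly_cost, egress_tb):
    """TB that must be stored so storage plus egress income covers monthly_cost."""
    egress_income = egress_tb * EGRESS_RATE_PER_TB
    return max(0.0, (monthly_cost - egress_income) / STORAGE_RATE_PER_TB)


daily_power = daily_electricity_cost(DRIVES, WATTS_PER_DRIVE, PRICE_PER_KWH)
monthly_power = daily_power * DAYS_PER_MONTH
monthly_cost = monthly_power + UNLIMITED_EGRESS_FEE
monthly_reserve = DRIVE_RESERVE / RESERVE_MONTHS

print(f"electricity: ${daily_power:.2f}/day, ${monthly_power:.2f}/month")
print(f"break-even at 1 TB egress: "
      f"{break_even_stored_tb(monthly_cost, 1.0):.1f} TB stored")
print(f"with drive reserve: "
      f"{break_even_stored_tb(monthly_cost + monthly_reserve, 1.0):.1f} TB stored")
```

Changing `EGRESS_RATE_PER_TB` or the egress volume shows how strongly the result depends on egress rather than on stored data.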

Would be great if Storj became more popular for streaming media, TURN services, maybe localized CDN services, image sharing, something where data would have a lot of egress even if more data isn't being stored.

With the reduced prices, you should not hold any amounts back. You are making it much harder and slower to break even on the investment; you need to give newer nodes a break on the held-back amounts.

VhkjKadwicMoJRmpQCfT · March 14, 2023, 8:15am

Interestingly enough, Storj won't be the first one to attempt to implement a P2P "home level" CDN network. I encountered Youku's (China's YouTube) router several years ago, where they tried to use the router as a local CDN node and allowed the owners to take a revenue cut from it. Essentially it utilised idle bandwidth for streaming cached videos and whatnot.

Edit 1:
With the potential introduction of a video streaming tax, it might be interesting to see how suitable Storj would be for providing such a service.

BrightSilence · March 14, 2023, 9:02am

Just to clarify, this is 2.7M STORJ tokens and it's a total for "service provider" costs. I'm not sure this is the correct number to refer to, as the explanation of this line is the following:

In addition to Storage Node Operator payments, we make payments to certain service providers (e.g., community leaders who monitor our various forums, respond to questions from users, and perform other community-related tasks; bug bounty participants; consultants; contractors) in STORJ token (line 11).

That doesn’t sound like edge services to me.
I’d be more interested in the 22.5M “other” line.

Line 14, “Other,” is reserved to report activity that doesn’t fall into any of the other categories, including, for example, non-routine payments to service providers and carbon offset program payments. As noted above, in Q4 ‘22, 22.5M STORJ tokens were used in payments that included the repurchase of company shares, and general operations and liquidity purposes. To provide additional liquidity for general operations during uncertain economic times and in periods of growth, we also are liquidating a portion of our reserves on non-US exchanges through a partnership, and these flows are disclosed in this line item.

source: https://www.storj.io/blog/storj-token-balances-and-flows-report-q4-2022

Your use cases miss native operating costs for running the satellites and repair processes. Both of which are quite costly and would remain even if none of the customers use edge services. Additionally, it misses costs related to giving a cut of income to channel partners who onboard customers to the network. This makes some suggested numbers look more reasonable than they are. I’m pretty sure there is currently no way to make it profitable for Storj Labs if they pay more than the highest numbers proposed in this thread. Even if we take edge services out of the equation.

But other than that, it’s a nice summary for people who aren’t yet aware of the basic cost structure!

BrightSilence · March 14, 2023, 9:05am

Just curious, what costs do you include in this?

Krimarai · March 14, 2023, 9:31am

Well, the new payout amounts on their own say little about the resulting revenue of an average node by the time those amounts are implemented (they do look scary though).
Does Storj expect customer data to grow on such a scale that it would balance out the loss of revenue for node operators?
Overall, my goal as a node operator is not to lose revenue (and, longer term, to improve it). If that is possible even with the new proposed payout amounts by leveraging other factors, then why not? All I want is a profitable business model on all levels and for all participants of the project, and designing such a model is one of the most difficult tasks Storj has.

hatred · March 14, 2023, 10:25am

I see that the operators are not fully aware of the situation. You were told in the first message that there will be not only a reduction in payments, but also a reduction in synthetic data! Please note: it is synthetic, not test (so there is synthetic data + test data + real data). And no one has said how much synthetic data is being sent to operators now.

It would be correct to start with the question: what percentage of the data is synthetic? I suspect at least 90%, and maybe more.

Here a person writes that he has 250 TB of data, but it must be understood that with the reduction of synthetic data, only 2.5-20 TB of these 250 TB may actually remain.
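To make the arithmetic behind that estimate explicit, here is a minimal sketch. The synthetic shares are the poster's guesses, not published figures, and the function names are hypothetical. It shows that a 90% synthetic share would leave about 25 TB of a 250 TB node, while the quoted 2.5-20 TB range corresponds to a synthetic share of 92-99%.

```python
def remaining_after_cut(stored_tb, synthetic_share):
    """TB left on a node if all synthetic data were removed."""
    if not 0.0 <= synthetic_share <= 1.0:
        raise ValueError("synthetic_share must be between 0 and 1")
    return stored_tb * (1.0 - synthetic_share)


def implied_synthetic_share(stored_tb, remaining_tb):
    """Synthetic share implied by a given amount of remaining data."""
    return 1.0 - remaining_tb / stored_tb


STORED_TB = 250.0  # node size mentioned in the post

# Guessed shares only; no official breakdown is given in the thread.
for share in (0.90, 0.92, 0.99):
    left = remaining_after_cut(STORED_TB, share)
    print(f"{share:.0%} synthetic -> {left:.1f} TB remaining")

for left in (20.0, 2.5):
    share = implied_synthetic_share(STORED_TB, left)
    print(f"{left} TB remaining -> {share:.0%} synthetic")
```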

BrightSilence · March 14, 2023, 10:46am

I think this is a distinction without a difference. What used to be referred to as test data hasn’t really been used for testing. So synthetic load is just a more accurate descriptor. I have no reason to believe this is something different from what we used to call test data.
Synthetic load is just more correct since this data was never solely there for testing purposes (also node operator incentive, space reservation etc). This means that the numbers posted earlier in this topic should still be accurate.

hatred · March 14, 2023, 11:04am

Perfect! You have one point of view, I have another. Only the company itself can settle this, if it explains everything and provides evidence. Fortunately, we have a blockchain here, and you can double-check everything yourself. At the very least, the data can be loaded into a dune.com dashboard and counted with graphs and charts.

SGC · March 14, 2023, 11:46am

it would always make sense to have some level of test and synthetic data.

i dunno how StorjLabs does that, however this is the basics of what i would expect.

Test data would most likely be required to verify everything is working correctly down to a byte level, since the customer data is encrypted StorjLabs would be working blind without some level of test data for integrity verification.

Synthetic data would be a good idea for stress testing and onboarding of larger customers, since it could be difficult to gauge how much remaining performance and capacity exists.

thus when onboarding large customers or running into hardware limitations, then synthetic data loads can be adjusted to ensure Storj DCS functions reliably.

ofc the ratio of synthetic / test data to actual customer data would change over time, as the network and customer data grow.
the avg DCS customer will have some very clear requirements, which is all that needs to be accounted for, so that the test / synthetic data gives StorjLabs a time buffer, so they can react to fix issues, such as doing surge payouts for SNOs to upgrade the network, or raise SNO payouts.

another factor would be that StorjLabs would want to stress test for extended periods, especially in the earlier ages of the network, so they learn how everything behaves.
just like you and i might attempt to break something after we just created it, to verify it works correctly and will last.

i don’t think test and synthetic will disappear, even if it might only end up being a few % of the entire network…

SGC · March 14, 2023, 11:54am

it is described a bit more in the details.

22.5M tokens for the repurchase of company shares and other payments including general operations and liquidity purposes not otherwise described above.

BrightSilence · March 14, 2023, 11:56am

I don’t think this is correct. Part of the reason they are making this proposal is to gauge how viable certain changes are for node operators. If a reasonable cost/income analysis of reasonable node setups shows they aren’t viable, that is important information for Storj to have. They can’t work without nodes, so they have a big interest in keeping reasonable node setups profitable.

Of course reasonable is a key word there. If your setup is unreasonably expensive, like the mango example you mention, Storj Labs will likely just ignore it.

I don’t think it will be. Satellites do a lot of work and store quite a bit of metadata. And repair processes also require uploading and downloading actual data. Luckily they have been moved to more affordable hosting, but you still have the cost of node payouts as well as bandwidth costs for hosting the services and compute for erasure encoding/decoding. My guess is that these still have a fairly significant impact on costs. But I’d love to have better information about that…

Literally quoted that (and more) in my original post.

hatred · March 14, 2023, 11:57am

Absolutely true! The number of nodes determines the network capacity and bandwidth. If there is about 10% of real data on the network, then the company can survive with 2000 nodes instead of 22200. They can change the model, and just have 200 nodes around the world, each for a few petabytes.

This is business, no one is keeping us here. Plus, there will always be those who have zero cost for electricity and even for hard drives (discarded by data centers), who can keep large nodes for years just for the sake of interest.

hatred · March 14, 2023, 12:12pm

Right now large companies are decommissioning hard drives from 2017-2019. Capacities there are already over 10 TB, and it is possible to build reliable RAID 60 arrays with a volume of hundreds of terabytes.

Some vendor-locked models, like HGST drives that have not been reflashed, cost nothing at all. I have already written that the cost of data storage is, if not zero, then very low, far from $1 per TB. Now you can buy used hard drives at a price of $5-7 per terabyte; those who sell them buy "by weight", by "tons", by "vans". They generally have an incoming price of about 0.

And if you build an array on old 1 TB disks that people just throw away without even trying to sell, and you collect 200-300 of them, then you will have an entirely free array. It remains only to put it in a place with free electricity, and there are plenty of such places!

Before the Chia mining boom, a decommissioned 24-disk LFF shelf cost about $50 to buy, with an incoming price of about $10. A good cable cost more than a disk shelf with two expanders and two power supplies.

And it is possible to do without disk shelves at all, as Chia miners do: just connect hard drives directly to the controller (an LSI 9264 costs about $20 together with the BBU), plug 4-5 such controllers into the server, or connect 100-200 disks to one server through an expander ($30). A hobby is one thing, industrial token mining is another.

It’s one thing when a company needs to show business growth for investors or journalists, and quite another when a company needs to make money. We were in the first phase, now we are moving into the second, and we must be aware that if you do not know how to save money, if you have expensive electricity, or if you have no way to buy old decommissioned hard drives, then most likely you will be left behind.