How many 10 Gbit internet lines do you need to keep 1 PB+ of data with the new pattern? Do the math.
Luckily for me I'm a network engineer, so that's the least of the problems, but this data is artificial, so upgrading to 100 gig is out of the question at the moment. I do need to redesign the storage, though. I knew nothing about storage when I first started.
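For what it's worth, here is a rough back-of-envelope sketch in Go. The numbers are my own assumptions, not official figures: ~1 PB kept on disk and the roughly half-month dominant TTL mentioned later in this thread.

```go
package main

import "fmt"

func main() {
	// Back-of-envelope: bandwidth needed to hold ~1 PB at steady state
	// when data arrives with a TTL and is continuously replaced.
	// Assumptions (illustrative only): 1 PB stored, ~15-day dominant TTL.
	const (
		storedTB  = 1000.0 // ~1 PB kept on disk
		ttlDays   = 15.0   // assumed dominant TTL
		secPerDay = 86400.0
	)
	// To keep storedTB full, the same amount must be re-uploaded every ttlDays.
	ingestTBPerDay := storedTB / ttlDays
	gbitPerSec := ingestTBPerDay * 8e12 / secPerDay / 1e9
	fmt.Printf("sustained ingress ~ %.1f TB/day ~ %.1f Gbit/s\n", ingestTBPerDay, gbitPerSec)
	// ~ 66.7 TB/day ~ 6.2 Gbit/s, so a single 10 Gbit line covers the
	// steady state, ignoring peaks, overhead, and repair/egress traffic.
}
```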
I saw when you posted hitting 2 Gbps ingress earlier. With the size of your setup… and kinda averaging/estimating what traffic could be… do you have a feel for how fast you may have to expand? One new 20 TB HDD per month? Two?
I dream of having the first-world problem of needing to buy a new HDD for Storj every month.
I averaged 4-5 Gbit, mostly because my nodes are capped at 50 Mbit. Also my SAS controllers can only do 6 Gb/s, and spinning disks are not known for being fast, yeah?
I'm with @Roxor on this one: I'm sitting on the side until my nodes get filled. After they get filled, I'll start adding drives at a rate I'm financially comfortable with. When those get filled, I'll add a 2nd internet line.
Rinse, repeat, popcorn.
Which other projects are comparable?
I'm aware of Chia and Filecoin, but Chia is like $0.12 per TB and you need pretty beefy hardware to plot, and Filecoin has high hardware requirements and investment.
Everyone is a little stressed out. Let's take a step back and evaluate things. Storj isn't asking anyone to buy anything right now. We did ask that if you had extra capacity that you could bring online temporarily, that would be helpful to know. Otherwise, continue working with what you have.
If some of the customers sign contracts and begin heavy use of Storj services, we will discuss what a base node configuration looks like and provide information at that time. There is ongoing discussion both here and internally on what that should look like. I think everyone would agree that with increased usage, system requirements will change, but we don't want to put the cart before the horse. Sometimes sales deals get very close and then, for one reason or another, they fall through. So, let's maintain what we have, but obviously work on adjusting crashing nodes and unworkable configurations now while testing is happening.
We will continue to relay needed information as milestones are met.
Nothing has changed. Good old protato will still work. No enterprise gear needed. In fact, I am far from enterprise gear myself. The only device I might call enterprise is a Netgate 3100 router.
Everything else in my setup is the cheapest hardware you can think of, and it still works. I am not sure if my router will be able to deal with 500 MBit/s, but if we really get to that point, I would be looking at a $240 payout. So, not a big deal.
I am trying to explain to you that this artificial load is simulating some existing use cases. So the moment these deals get signed, this artificial load will turn into real load. There are some aspects that our artificial load doesn't match, but it should be close enough.
First of all, HDDs don't age that way.
Second, what extreme load are you talking about? The highest I have seen was 500 MBit/s, and if that goes to a single disk, it will be full in a short time anyway. Just by the numbers, there is no extreme load on the disk.
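To put numbers on that (a 20 TB disk is assumed purely for illustration):

```go
package main

import "fmt"

func main() {
	// Sanity check for the "extreme load" point above: if the full
	// 500 MBit/s peak landed on one disk, how long until it is full?
	const (
		peakMbit = 500.0
		diskTB   = 20.0 // assumed disk size
	)
	bytesPerSec := peakMbit * 1e6 / 8           // ~ 62.5 MB/s
	days := diskTB * 1e12 / bytesPerSec / 86400 // seconds -> days
	fmt.Printf("%.1f MB/s fills a %.0f TB disk in ~ %.1f days\n", bytesPerSec/1e6, diskTB, days)
	// ~ 3.7 days, well within what a single HDD can sustain sequentially.
}
```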
We don't have control over that. If these customers want to upload with a TTL, then a TTL it is. You can either adapt to it or argue with me about something that I can't change anyway.
So you are saying I should tell our sales team not to sign any deals? I see the bipolar disorder on your side. (Just to give back the compliment.)
You don't have to trust me at all. Please do not trust me. I am claiming that with all the code changes we have 6 times more throughput. Don't trust me on that. You can verify that on your end. I am claiming that these customers will upload with a TTL. Don't trust me on that. Verify it on your end. Depending on how skeptical you are, you can wait for these customers to sign deals and start uploading. If my claims are correct, you will see a similar upload pattern. Don't trust me on that. You can verify that on your end.
Still don't know what my attitude has to do with that. What is that going to change?
Thanks for the information @Knowledge
No one questions this. You and your team have literally revolutionized the performance of every single node. A step forward not seen in the 5 years since the launch of my first node.
Consider the alternative. Would you really prefer only PR-cleansed communications? Yeah, the way communication happens on this forum is fairly unique, because most companies would stick to PR statements reviewed by 5 departments and stripped of any substantial information.
But look at all the information provided to us in this topic. We get a heads up about testing, explanations about testing patterns and how they resemble prospective customers' patterns, info about changes made for different tests and the results of those tests. This is a "careful what you wish for" situation. If people here are going to demand PR-like statements, do you think we would know what we know at this point? Do you think it will make your setup planning any easier?
I for one highly appreciate the way @littleskunk has communicated despite the hours he's clearly been putting in. He's taking time to keep us well informed. And I will gladly take the personalities as they come, even if it's a little blunt at times.
Let's not do this though. I know you're responding "in kind", but there's a big difference between saying it about a company and saying it about a person. Even though neither has any place on a public forum, there was no need to escalate this.
That is still happening in the background. The uploads we are simulating on SLC are on top of whatever you get from the other satellites. It is not a replacement. What is going to change is that this additional load will one day shift from SLC to US1, but it would still be on top of whatever US1 is already uploading without a TTL.
If you have some kind of traffic constraint, you could limit your node size before hitting your traffic limit. I understand that you would like to use most of that traffic for data without a TTL, but if I look at the upload rates, this is kind of useless. Any idea I can come up with would at least reduce your growth rate on US1 to 0, so you might as well limit the size of your entire node and end up with almost the same outcome.
If you are able to accept it, the best growth rate you can get is with a high success rate. So you could say the TTL uploads work like a magnet: the more TTL data you get later from US1, the more non-TTL data you also get. With our current choiceof2 factor you can reach a maximum of 2 times higher growth rate of non-TTL data. Next deployment we might increase that to choiceof4, which would give you a 4 times higher growth rate of non-TTL data. So there are some advantages to accepting the TTL data.
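A minimal toy sketch of the "choice of n" idea (my own illustration under assumed scoring, not the actual satellite code):

```go
package main

import (
	"fmt"
	"math/rand"
)

// Node is a toy stand-in for a storage node with a tracked success rate.
type Node struct {
	ID      int
	Success float64 // fraction of recent uploads that succeeded
}

// chooseOfN sketches "choice of n" selection: draw n random candidates and
// keep the one with the best tracked success rate. With n=2 a reliable node
// can be picked up to ~2x as often as a uniform pick; with n=4, up to ~4x.
func chooseOfN(nodes []Node, n int) Node {
	best := nodes[rand.Intn(len(nodes))]
	for i := 1; i < n; i++ {
		c := nodes[rand.Intn(len(nodes))]
		if c.Success > best.Success {
			best = c
		}
	}
	return best
}

func main() {
	nodes := []Node{{1, 0.99}, {2, 0.95}, {3, 0.80}, {4, 0.50}}
	counts := map[int]int{}
	for i := 0; i < 100000; i++ {
		counts[chooseOfN(nodes, 2).ID]++
	}
	fmt.Println(counts) // node 1 ends up close to double a uniform share
}
```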
I need to be a bit careful with this. To my knowledge it will be a constant flow of data. Whatever the TTL is, it will get replaced by another upload with a similar TTL. There is also not a single TTL for all the data: there will be some data with a shorter TTL and some data with a longer TTL. Our current uploads should match the TTL that we expect to be most dominant.
The flow of data will vary over the day. In the past few days we tested the peak load that we need to maintain for a few hours per day. There will be hours with a much lower upload rate.
I think that is all I can say for now. Your own node should give you some of the answers you are looking for.
Thank you, I understand and appreciate your efforts.
What I think that says (and perhaps not everyone is understanding this) is that on the 1st of the month some data will be uploaded (let's say with a 30-day TTL). This data will be deleted on the 1st of the next month (let's go with that). On the 2nd day of the month, some more data will be uploaded, which will be deleted on the 2nd of the next month, and so on.
I get the feeling that some people think the data will come as one huge burst on the 1st of the month, then quiet down, stay for a month, and the cycle repeats.
Assuming again (I'm not in @littleskunk's mind) that there will indeed be a constant "rotation of data", it doesn't matter if the data is TTL or not. Even if the TTL is less than a month, you still get paid proportionally anyway.
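A tiny simulation of that rolling pattern, with made-up numbers (1 TB/day, 30-day TTL), just to show that stored data plateaus rather than cycling in bursts:

```go
package main

import "fmt"

// Each day's upload carries a 30-day TTL. Stored data ramps up for 30 days,
// then plateaus, because every new day's upload is matched by the expiry of
// the batch uploaded 30 days earlier.
func main() {
	const (
		dailyTB = 1.0
		ttlDays = 30
	)
	stored := 0.0
	for day := 1; day <= 90; day++ {
		stored += dailyTB
		if day > ttlDays {
			stored -= dailyTB // the batch from ttlDays ago expires
		}
		if day%30 == 0 {
			fmt.Printf("day %2d: %.0f TB stored\n", day, stored)
		}
	}
	// day 30: 30 TB, day 60: 30 TB, day 90: 30 TB
	// i.e. steady state ~ upload rate x TTL, not a monthly burst-and-drain.
}
```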
I think TTL is even better, as it will be deleted without sitting in trash for 7 days, so space will free up almost instantly and you are ready to get more data again.
Today it is much worse: the client deletes, then we wait several days for the bloom filter, then 7 days in trash. In total that is 10-14 days where you are not paid for the data and can't receive paid data in its place.
I vote for "cabbage nodes". You know, layers…
lol. Or lettuce, as in: all fluff, no substance.
Yes, I fully agree. It took a long time to get there. Why wasn't it done earlier? Because there was no need for it. Well, to some extent there was, but the current improvements go much further than what we would have implemented, let's say, a year earlier.
Something we also learned in the last few tests is that there is a difference between maximum throughput and fast uploads. We can offer both at the same time, just not for the same upload. So in the future there might be one customer that needs fast uploads but limited throughput. Bestofn will dial in on the fastest nodes and ignore almost all slow nodes. Good for performance, but only as long as you don't fill the fast nodes to the last byte. So for maximum throughput it is important to stretch out the resources of the fast nodes and try to max out the resources of the slow nodes. Ideally, let all nodes participate. Choiceofn does that.
Also, the bitshift success tracker has some incredible thought process behind it. The earlier percentage-based success tracker would be better at finding and remembering the fast nodes, while the bitshift tracker is kind of dumb and forgets about the performance of the nodes really fast. It works more like TCP congestion control. Oh, there is a new node. Let me try to select it a few times. Oops, it missed an upload. Let me dial down a bit. I can't remember what this node can handle, how about I scale up the request rate. Oh, there was another miss, I will dial down again. This dumb behavior turns out to be incredibly effective and reacts quickly to changes. If I start a Netflix stream, the bitshift success tracker will notice it and dial down quickly. If I stop the Netflix stream, it will intentionally forget about my history and increase the upload rate based on the resources I have available in that moment.
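I don't know the exact implementation, but here is a toy sketch of what a "shift successes in, forget history on misses" tracker could look like; the shift widths and scoring are my own assumptions, not Storj code:

```go
package main

import (
	"fmt"
	"math/bits"
)

// tracker keeps a small history word per node. Each success shifts a 1 in;
// each miss drops two history bits, so a run of misses clears the history
// much faster than it was built - similar in spirit to TCP congestion control.
type tracker struct {
	history uint32
}

func (t *tracker) Success() { t.history = t.history<<1 | 1 }
func (t *tracker) Failure() { t.history >>= 2 } // dial down, forget old wins
func (t *tracker) Score() int {
	return bits.OnesCount32(t.history) // more recent successes = higher score
}

func main() {
	var t tracker
	for i := 0; i < 10; i++ {
		t.Success()
	}
	fmt.Println("after 10 successes:", t.Score()) // 10
	t.Failure()
	t.Failure()
	fmt.Println("after 2 misses:    ", t.Score()) // 6: dialed down quickly
	for i := 0; i < 3; i++ {
		t.Success()
	}
	fmt.Println("after recovering:  ", t.Score()) // climbs back as uploads land
}
```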
So my conclusion is that it is only thanks to the current use cases that we are able to make these improvements. We have been talking about better node selection for a few months now, but all we would have done would have been a choiceof2 and call it a day. The current use cases are forcing us to question whether that is enough, and so we discover all the other improvements that we wouldn't have found otherwise. It is the right time for these improvements.
Oh, and the next challenge is already waiting for us. Watch the execution times of the file walker. My math is telling me that it might escalate in the next few weeks. We talked about possible mitigations, but it is hard to tell which solution works best without seeing the problem. So for now we can only wait until the file walker gets too expensive to run and needs improvements.
One more question: we see insane speed on our end, but what speed did you manage to get on the upload side? Max and average?
Correct. The TTL has an impact on how much bandwidth you need per $ of payout. So with my 250 MBit/s I can take an inflow of 81 TB per month, but if that data has a TTL of half a month, I will end up with just 40.5 TB on disk, and that payout, while still consuming 81 TB of bandwidth. (That is the currently dominant TTL, which you can see from your own node.)
There is an advantage. I would get this 40.5 TB (or 81 TB with a TTL of a month) in the first month after vetting. I don't have to wait. It is a short feedback loop, and mistakes like getting disqualified become less painful.
The downside is that I would keep these 40.5 / 81 TB only as long as the inflow stays the same. If these customers run away, the fun is over quickly. If you decide to add more nodes, they will also be subject to the shorter feedback loop and reduce my growth rate.
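For completeness, the arithmetic behind those figures (assuming a 30-day month and a half-month dominant TTL, as above):

```go
package main

import "fmt"

func main() {
	// 250 MBit/s of sustained ingress over a 30-day month.
	const (
		mbitPerSec = 250.0
		monthSec   = 30 * 86400.0
	)
	bytesPerSec := mbitPerSec * 1e6 / 8
	inflowTB := bytesPerSec * monthSec / 1e12
	fmt.Printf("monthly inflow: %.1f TB\n", inflowTB) // ~ 81 TB
	// With a ~15-day TTL, steady-state stored data is inflow rate x TTL:
	storedTB := bytesPerSec * (monthSec / 2) / 1e12
	fmt.Printf("steady state on disk: %.1f TB\n", storedTB) // ~ 40.5 TB
	// The same 81 TB of bandwidth is consumed either way; the TTL only
	// decides how much of it sits on disk (and is paid for) at any moment.
}
```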