Let's talk about the elephant in the room: The Storj economic model (node operator payout model)

My setup was just the very basic one, using the Backblaze or STORJ CLI with all the default settings.

@Alexey Once again, please present me with a single good case study. NOT a page full of supposedly good use cases.
That is like me saying:
Toshiba produces the best drives. Here is my source: reddit.com
That is not a source!

But for one last time, I will take a look at the use cases from your page and randomly select three.

Let's have a look at Ultimate Division. Blabla, NFT blabla, tokens blabla, founder of Chickenfish Games blabla, Web3 blabla, "Storj DCS saved us 30% over AWS while delivering the same level of performance."
Now that is a use case! But how much data do they store? How much egress? We don't know.
And how about the project itself? The roadmap only shows 2022, the Discord is down, and the last tweet was over a year ago. Still think this is a great testimonial?

Let's continue with the University of Edinburgh. Sadly, the case study is entirely missing from the page :frowning:

How about BOONJI? Apparently they even save 90% over AWS. Looks like a sneaker designer that sells some NFTs on OpenSea.

These are not customers that represent a big customer group, nor do they have large storage needs.

2 Likes

I agree with you that parallelism could be covered in the default learning link, with a description.

1 Like

The best settings for performance should be the default settings. Many testers just don't bother reading manuals and tweaking guides; they do a quick test with default settings and then fill the forums with wrong results and impressions.
Storj should pay more attention to details and first impressions. I don't always agree with everything IsThisOn says, but in this case he is right. I don't know of other service providers that make tweaking the connection a must.

6 Likes

Information regarding the University of Edinburgh can be found here

3 Likes

I agree, but there isn't one setting that would be best for everyone. Since uplink does encryption, erasure encoding, and uploading, it depends on CPU, network speed, and HDD speed as well. However, just doing one sequential thread is probably not good for any system. I'm sure there is a happy medium to be found. The default should probably be slightly conservative; assuming high-powered hardware and connections is not a good idea either. They'd have to do some testing on different hardware and connections to find that sweet spot.

But for the best performance in your specific situation, it'll always be best to do some personal tuning.

1 Like

We could do more in terms of detecting when packet loss is happening, or dynamically selecting parallelism and turning it up and down throughout the download, to make sure the additional parallelism is actually increasing throughput.

It isn’t currently on our roadmap to do so, we have some other performance, tuning, and optimizations you can see in the milestones referenced on this roadmap item. Performance Tuning & Optimizations (PTO) · Issue #45 · storj/roadmap · GitHub

If you don’t have a dynamic component, there isn’t a single parallelism default that is right for all use cases, so we have to pick one.

5 Likes

Oh yeah, that would absolutely be a lot better, but of course that takes more effort to implement. But I agree that dynamically scaling parallelism would help a lot. That said, even then having it at 1 at the start is probably not the best default. Perhaps you could default to the number of CPU threads and then scale from there. I’m gonna have a look at that link, thanks.
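As a rough stand-in for that idea today, you could derive the value from your core count yourself. A small sketch, assuming a Linux client with nproc available (bucket and file names are placeholders):

# Start with one stream per CPU thread, then tune up or down from there.
uplink cp --parallelism "$(nproc)" sj://bucket/4gb.testfile /tmp/4gb.testfile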

1 Like

I can help you tune and realize excellent performance. Storj is different: given that we erasure encode across tens of thousands of nodes, we can get incredible throughput from a bunch of nodes that individually may not be high performers. As @BrightSilence noted, personal tuning is important.

Let's focus on helping you see a great test; then perhaps we can move into an on-topic thread to discuss how to impress users more easily.

For your test, I didn't see a command or tool listed. Please feel free to share it, but I'll propose the best paths below.

Upload a 4 GB single file
Rclone using the generic S3 + Storj selection. Tuned to a concurrency of 16, which will use about 1 GB of RAM on the client. You could in theory go as high as 64, but you will likely find peak throughput around 16.
rclone copy --progress --s3-upload-concurrency 16 --s3-chunk-size 64M 4gb.testfile remote:bucket
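If you need a test file first, something like this works (dd and time are standard tools; remote:bucket is whatever you named your rclone remote):

# Create roughly 4 GB of incompressible random data, then time the tuned upload.
dd if=/dev/urandom of=4gb.testfile bs=1M count=4096
time rclone copy --progress --s3-upload-concurrency 16 --s3-chunk-size 64M 4gb.testfile remote:bucket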

Download a 4 GB single file
Let's use uplink this time so you get to experience another tool. As with the upload, scale this up and down. Do make sure you have a good CPU, and please monitor CPU usage as you scale.
uplink cp sj://bucket/4gb.testfile /tmp/4gb.testfile --parallelism 16
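To compare runs, wrap the command in time and watch the CPU in a second terminal. A sketch, with the same placeholder paths as above:

# Time the download at a given parallelism; run top in another terminal to watch CPU.
time uplink cp sj://bucket/4gb.testfile /tmp/4gb.testfile --parallelism 16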

My goal is to help you see it go fast, not to debate complexity; that could become its own thread if there's value in it. We work closely with our customers (small and large) to help them get the best value out of the network.

8 Likes

For a start, parallelism could be covered here: Upload an Object - Storj DCS Docs, because that is the first thing people open and try. Also add some explanation of what it does, and maybe some recommendations about speeds and numbers.

8 Likes

No thank you. I am not interested in becoming a STORJ customer, nor do I have performance needs. My point was simply that I don't believe the performance claims made here.

This is the command I used

uplink cp ~/Desktop/cheesecake.jpg sj://cakes

uplink cp sj://cakes/cheesecake.jpg ~/Downloads/cheesecake.jpg

Not sure if there is a problem on my end, but I don't see a button to download the PDF or to watch the webinar.

1 Like

Hmmm, you might be right that it's a problem on your end, because I just tested and both pages work fine for me, 16 minutes after your post time. No changes have been made to the pages in the interim.

You need to follow the instructions on each page, filling out all the form fields and clicking the Submit button. Once you follow these steps, you should have access to the PDF and the Webinar.

Please see the attached screenshots.

1 Like

Nor are you willing to even test suggestions, apparently. I mean, you're not gonna see the best performance if you don't try them. The example that includes parallelism is literally in @Dominick's post. It's literally a matter of adding --parallelism 16 and maybe trying some other numbers for large files (or, not mentioned yet, using --transfers 16 for many files, or a combination of both, as in the example below).
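For what it's worth, a hedged example of that combination (flag support may vary with your uplink version; the bucket and paths are placeholders):

# Many files: 16 objects in flight, 4 parallel segment streams each.
uplink cp --recursive --transfers 16 --parallelism 4 ~/photos/ sj://bucket/photos/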

You’re getting a lot of info/help offered from people, but you keep swatting them down…

I've said this before in other places… but you should really remove those forms… They just add a hurdle, and many will simply turn around and leave. Furthermore, I keep filling out those forms over and over just to read some new info. You have my info many times over (and admittedly, after a while, a lot of nonsense trash data, because I got tired of filling in actual information). This just doesn't seem like a good experience when looking for a good storage solution. At the least, set a cookie so you don't keep asking the same person over and over.

6 Likes

Exactly. It’s a terrible user experience.
I would vote for integrating something like this instead:

Thank you for explaining your experience of this as a user.
The cookie idea makes total sense.
I'm forwarding your post to the team that makes these decisions.

2 Likes

I hear you.
I’m forwarding your post to the decision making team as well.
Yes, I remember your Calendly suggestion conversation and brought it to the team.
The mods can relay suggestions but we can’t implement them unilaterally.

Maybe there are some ways the friction of getting everyone to the information they want can be reduced. It's always good to ask.

2 Likes

The thing is, if you are a user and simply want the information for decision making, then it is totally annoying to have to fill out forms; plus, you don't know what the company will do with your data.
Recently I was in the same situation - a simple request for information - and that company took the chance and added my email to their newsletter, which they send out 3 times a week. :rage:

Of course I do understand that as a company you want the contact information, but as a user I don't want to get annoyed; otherwise I might abstain from entering my data and simply go to Amazon. I don't know what the perfect solution is for both sides. As a user, I would prefer to have an option to get the information without being forced to fill out forms with my data.

2 Likes

Yeah, Pi-hole blocked it. I did not see the form, just a blank page.

Glancing over it, Edinburgh is not a customer, but one prof did a "study" that shows the advantages of parallelism and that they can download 700 MB/s from STORJ? And this somehow shows the peering advantages of a decentralized network, while I beat those numbers from my European residential ISP to Backblaze B2 in the US? But maybe for people with worse peering than mine, that could be a good result. In theory, I agree with his argument; I just don't think it works out in the real world.

I don't care about tuning stuff I don't use. I care about how it performs out of the box.
The only reason I even bothered to do a benchmark was that I could not find any.
While the Edinburgh benchmark does not do STORJ any favors, at least I finally got one.

Again, I don’t want help. I want testimonials. I want use cases. Real numbers from real people.

I will try this again and again; maybe someday I will get a straight answer:
Can you please show me just one single customer that has high storage, high egress, or high performance needs and uses STORJ because of that? You know, just one single testimonial that isn't from a niche Web3 startup project?
Something like:

“Hey this is company X. They do X *. They need roughly X TB of storage and X TB of egress per month. Because they do X, their performance needs are X. That is why they decided against AWS Glacier instant retrieval or Backblaze but went with STORJ”

* One sentence about what they do, please. I don't care if it is very broad or vague; I don't need to read another page filled with empty buzzwords.

Update: I have an even better idea. Because @BrightSilence, unlike me, actually read the testimonials, how about you just present me the best one? Then we have a real example and can debate whether it is a good testimonial.

What you’re asking for is commercially sensitive data. Please provide an equivalent testimonial for AWS or Backblaze.

Does this strike your fancy?
https://www.businesswire.com/news/home/20211201005267/en/Storj-Showcases-Unmatched-Performance-and-Throughput-via-the-Exceptional-Parallelism-Capability-of-Decentralized-Cloud-Storage

What users are saying

“Pocket Network is redefining the Web3 infrastructure landscape for a fairer and censorship-resistant internet,” said Michael O’Rourke, CEO, Pocket Network, Inc. “We chose Storj DCS as our storage layer because of its innovation in decentralization - and for our shared values of a trustless, sustainable, and globally distributed architecture. And bottom line, the performance blew us away. The parallelism of Storj DCS enables ultra-fast blockchain Fast Sync. The accelerated sync times and added reliability from Storj helps miners create a more diverse set of endpoints, and contribute to a larger, more secure blockchain.”

“Not only does Storj DCS’ built-in global distribution network help our organization scale up our AOSP ROM software distribution, Storj DCS enables 50,000 downloads per month across 60 Android device builds around the world - more than 10x faster and at 1/10 the cost of most object storage providers,” said Rohan Hasabe, PixelExperience Core Team Member.

“Usually, higher performance and throughput are not associated with decentralized storage solutions; however, Storj really has delivered an industry first – enabling multi-GB speed and providing us with unmatched performance, increased parallelism, redundancy, and resiliency – and all cost effective,” said Dr. Antonin Portelli, University of Edinburgh Research Explorer. “We are generating large datasets for research, and confidence around the resilience and geographical availability of the data is critical. We must ensure that all of this data can be stored safely, and also retrieved quickly and often. Storj has definitely checked all those boxes for us, for the next decade and possibly more.”

1 Like

I figured I'd also do the test you won't do. These are my results for uploading and downloading a 4 GB file on a 1 Gbps connection.

[screenshot: network throughput graphs for the six test runs]
These show uploads with parallelism set to 1, 8, and 16 respectively, and then downloads with the same settings.

You can see that for upload it utilizes almost my entire connection with parallelism set to 8, and 16 saturates it completely.

You'll also notice that downloads are a little different. I ran these tests on my Synology, and the downloads were so fast that my measly low-energy 4-core Xeon D-1527 couldn't keep up.

Keep in mind that this network usage represents traffic of erasure-encoded pieces, so the effective transfer speed of the actual files is lower. For upload there is an expansion factor of about 2.8x, and for download about 1.5x. I'm limited by my network connection here; this could go much faster on faster connections.
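To make the expansion factor concrete, here's the quick math on a saturated 1 Gbps link, using the ~2.8x and ~1.5x factors above:

# Effective file throughput = wire throughput / expansion factor.
awk 'BEGIN { printf "upload:   ~%.0f Mbps effective\n", 1000 / 2.8 }'
awk 'BEGIN { printf "download: ~%.0f Mbps effective\n", 1000 / 1.5 }'

So even with the link saturated, the file itself moves at roughly 357 Mbps up and 667 Mbps down.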

All these tests did was expose my own bottlenecks, not Storj’s. Hope that helps.

4 Likes