StorJ: very slow speeds compared to other S3-compatible providers

Hi!

I did some tests on restoring databases and files via Comet Backup. I tested Storj, Amazon S3, Cloudflare R2 and a Hetzner Storage Box.

StorJ was the slowest.
This is a restore; everything else is exactly the same, except the restore bucket:

StorJ: 7:29
Cloudflare R2: 3:07
Hetzner Storage Box: 1:17
Amazon S3: 1:09

To validate, I also restored a very large website with millions of files, and the results matched the above; the ratios were the same, give or take.

Thoughts?

Thanks
Alex

Have you read ?

3 Likes

Per the Comet doco, Comet is limited to 10 parallel connections. If they could bump it higher, transfer speeds would improve significantly.

“Comet makes up to 10 network connections to the S3-compatible Storage server.”
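To make that limit concrete, here is a minimal sketch (not Comet’s actual code; the bucket, key and local path are invented) of how an S3 client’s concurrency setting caps the number of parallel connections, using boto3 against the hosted gateway:

import boto3
from boto3.s3.transfer import TransferConfig

# Point a standard S3 client at the hosted Storj gateway; credentials are
# picked up from the usual AWS environment variables or config files.
s3 = boto3.client("s3", endpoint_url="https://gateway.storjshare.io")

# max_concurrency bounds how many parts move in parallel for one object,
# similar in spirit to Comet's documented 10-connection limit.
config = TransferConfig(max_concurrency=10)

s3.download_file("restore-bucket", "backups/db.dump", "/tmp/db.dump", Config=config)

Raising max_concurrency is the S3-client equivalent of “bumping it higher”, assuming the backend and your link can keep that many streams busy.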

2 Likes

STORJ is a centralized S3 with decentralized nodes as the storage backbone.
So basically they have an S3 with slightly cheaper storage, but that is not what is expensive about S3.

Yeah, not surprising.
Apparently the native integration with some tunings is “fast”.

What?!

I’m not going to argue the obvious.

Storj works best for large transfers; apps that package data into reasonably sized blobs work fast. I don’t know how Comet works, but if they just copy data as-is without packaging, it will be ridiculously slow on any backend, and on STORJ especially so, due to the higher up-front latency.
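As an illustration of the packaging point (a hypothetical sketch, not how Comet actually stores data; the file names, bucket and endpoint usage are made up for the example), bundling many small files into one archive means the backend sees a few large objects instead of millions of tiny, latency-bound requests:

import tarfile
import boto3

def upload_as_blob(paths, bucket, key):
    # Package the given files into a single compressed archive...
    blob_path = "/tmp/bundle.tar.gz"
    with tarfile.open(blob_path, "w:gz") as tar:
        for p in paths:
            tar.add(p)
    # ...and upload it as one object, paying the per-request overhead once.
    s3 = boto3.client("s3", endpoint_url="https://gateway.storjshare.io")
    s3.upload_file(blob_path, bucket, key)

upload_as_blob(["site/index.html", "site/style.css"], "backup-bucket", "bundles/site-0001.tar.gz")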

The native integration is much faster than S3, but requires a stable, high-performance connection (not in terms of bandwidth, but latency and IOPS).

1 Like

Hello @Slewrate,
Welcome back!

Did you configure it using the native integration (Comet Backup Integration Guide - Storj Docs) or the S3 Compatible Gateway Hosted by Storj - Storj Docs?
Depending on the available bandwidth, one might be faster than the other. I would recommend checking both and using whichever is more suitable for you.

However, this is a valid note too:

S3 is centralized and its performance depends on only two factors: how much you are willing to pay for peering, and from where you are trying to access the data. Storage speed is not the bottleneck.

So if you use it from home, S3 will be slow, because STORJ has bad S3 peering.
That is what @Slewrate tested here. I don’t know what ISP @Slewrate uses, but for his ISP (and many other ISPs, according to other users and my own testing) S3 performance from STORJ is pretty bad.
If you use a DigitalOcean VPS and their S3, it will be faster than STORJ because it sits right next to it and thus will have perfect peering. So STORJ not having compute is a downside. STORJ will never be able to offer 100GB/s like Amazon does. But that is manageable; Backblaze and others also do not offer compute.

STORJ having bad S3 performance (and being very expensive) is not even up for debate. Even STORJ employees agree.
That is why in every performance discussion we have on the forum, they will try to nudge you to the native integration. That is not a critique! This is just how the technology works, and STORJ will never be able to change that. That is why they have to ditch S3 sooner or later.

@IsThisOn could be correct about peering, but in the case of Storj the S3 Gateway is a distributed service too, so when you use GatewayMT with the global endpoint gateway.storjshare.io, you will communicate with the closest instance for your location (the same would happen for clients in different locations).
The speed would be roughly the same in almost any location, but of course to have a stable speed you need to use the native integration.

The main point here: you should use bigger chunks (preferably not less than 64 MiB) and a wide bandwidth channel. If your upstream or downstream is low (less than 40 Mbit and up to 1 Gbit, YMMV), then the S3 integration would likely be faster (this needs testing for your location); above 1 Gbit the native integration will likely be faster.
Storj works better when you use it for parallel transfers - basically you can saturate almost any connection if you transfer dozens of objects in parallel, especially big files (where you can also transfer chunks of each file in parallel).
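A hedged sketch of that advice (the bucket and file names are invented; boto3 is just one convenient S3 client): 64 MiB multipart chunks, several parts of each object in flight, and dozens of objects transferred in parallel:

from concurrent.futures import ThreadPoolExecutor
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3", endpoint_url="https://gateway.storjshare.io")

# Multipart parts of 64 MiB, with several parts of each object in flight at once.
config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,
    multipart_chunksize=64 * 1024 * 1024,
    max_concurrency=8,
)

def upload(job):
    path, key = job
    s3.upload_file(path, "backup-bucket", key, Config=config)

# Push dozens of objects in parallel to saturate the connection.
jobs = [(f"/backups/blob-{i:04d}.bin", f"blobs/blob-{i:04d}.bin") for i in range(32)]
with ThreadPoolExecutor(max_workers=16) as pool:
    list(pool.map(upload, jobs))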

Sure it is not only one physical endpoint.
Backblaze also has more than one endpoint.
That does not make it decentralized.
We don’t say Backblaze is decentralized just because they have multiple endpoints.

So it is slower than the competition in almost any location? Or is Europe the bad outlier? :wink:

A couple of questions:

  • What is your traceroute or mtr output to gateway.storjshare.io? Example commands:
    • traceroute gateway.storjshare.io or mtr --report-wide gateway.storjshare.io (if you have mtr installed)
  • What Storj satellite do you use for your location?

We have S3 gateways in a number of locations across the world now, and we’re always looking to fix any outliers in BGP anycast routes that look strange.

Note that even the big players can get this wrong sometimes; there’s no such thing as perfect peering for everyone, and it’s highly dependent on your ISP as well.

5 Likes

Where do you see that I said “decentralized” for the S3 Gateway?

Sometimes Germany suffers because of Deutsche Telekom (and all the others who use their services); transit providers often conflict with each other. So, yes, it’s possible.

1 Like

It depends on your location and the protocol used (and sometimes hardware). With Storj native it will be as fast as your bandwidth allows and will likely be the same in any location (accounting for the 2.68x expansion on upload and up to 39/29 ~ 1.34x on download - the libuplink will open 39 connections to download a segment of the file and will cancel the remaining ones when the first 29 have finished).
It also depends on what you use - Storj Global, Storj Select or Storj Private Cloud (SPC) - and the protocol, of course. For Storj Global you will get the fastest speed for your location using the S3 integration, and the fastest speed for your location and hardware using Storj native.
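To illustrate the long-tail behavior described above (a toy simulation based only on the 39/29 numbers in this post, not libuplink source): start 39 piece downloads per segment and keep the first 29 to finish, cancelling the stragglers:

import asyncio
import random

PIECES_STARTED = 39  # connections opened per segment (per the numbers above)
PIECES_NEEDED = 29   # pieces needed to reconstruct the segment

async def fetch_piece(node_id: int) -> bytes:
    # Simulated storage-node download with variable latency.
    await asyncio.sleep(random.uniform(0.05, 0.5))
    return f"piece-{node_id}".encode()

async def download_segment() -> list:
    tasks = [asyncio.create_task(fetch_piece(i)) for i in range(PIECES_STARTED)]
    pieces = []
    for finished in asyncio.as_completed(tasks):
        pieces.append(await finished)
        if len(pieces) >= PIECES_NEEDED:
            break
    # Cancel the slow tail: stragglers no longer affect the transfer time.
    for t in tasks:
        t.cancel()
    return pieces

pieces = asyncio.run(download_segment())
print(f"reconstructed from {len(pieces)} of {PIECES_STARTED} started downloads")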

You may check Hotrodding Decentralized Storage for your location and hardware.

See an example of its usage: Tribe Social Case Study,
and you may check out others.

Implied here:

If that was not intended on your part, consider my comment on it a small note.

Telekom is a joke, but it is slow even for other ISPs with great peering to DE-CIX.

That is moving the goalposts. I was specifically talking about S3. I even said that native is faster, but we are talking about S3 here, not native. Please stop comparing apples to oranges.

OP @Slewrate did an S3 benchmark.
Can’t we just be honest and say:
“Yes, STORJ S3 is slower than the competition, but there is a faster alternative if you use the native integration”?
Simple as that. Then the user can decide if the native integration is even a possibility for his/her use case.

As far as I can see, I used the word “distributed” with the addition of “too”, which means that our S3 Gateway is distributed too, like the Storage Nodes.
See the point?
Just NOT AS DISTRIBUTED as the storage nodes (native).

yeah… but they are the biggest one… troubles…

ok. For S3 it’s still distributed, but could be affected by peering between those big players, you know…

For some cases. As I said - “it depends” on your channel, hardware and location. This is true for native too, by the way, but not as heavily.

By STORJ, not by nodes.

But that leads nowhere; we can split hairs about the meaning of decentralized and distributed all night long.

My main point is this:

Your communication comes off as weirdly defensive and quite dishonest. If a user has performance problems, this forum always tries to blame the user. Or we simply bury them with links to a wall of text. Because you are not upfront in your communication, the user only finds out during that read that the text is not about S3 at all. What if the user has to use S3? There are multiple situations I can think of where native is not an option.
So why is it so hard to be honest and say:

“Yes, STORJ S3 is slower than the competition, but there is a faster alternative if you are able to use native integration”

Actually - both. That’s the point. Yes, we have far fewer instances of S3 gateways than nodes… but every gateway connects to 110 nodes around the world for each segment of each customer. It’s still distributed, even if you choose the closest S3 gateway for your location. That’s the huge difference - you always upload to the nodes, not to a central server (even if you report to the central service, which is… surprise - distributed too), and you download from the nodes. But yes - through the closest S3 Gateway instance (actually, if you are close enough to several instances, or moving, or the routing changes, parallel transfers could use separate gateway instances).

You are simply wrong.

This information ^ can help with the

And also, if they are a paying customer, they have the possibility to use Storj Select or Storj Private Cloud, if Storj Global is not an option for any reason.

For your location. That’s the main and complete point.
If the customer can share their location, and if they are a paying customer, the team can try to fix the peering issue (if that’s what it is), and also help the customer configure the software for their case.

We want a dialogue, not the blaming you believe is happening.

1 Like

Here’s an example from AS3303 that routes through Frankfurt, Amsterdam, London, New York:

Traceroute to gateway.storjshare.io (136.0.77.2), 48 byte packets
1 192.168.1.1 1.357ms 0.954ms 0.925ms
2 85.1.224.1 1.224.1.85.dynamic.wline.res.cust.swisscom.ch AS3303 2.426ms 2.658ms 2.422ms
3 * * *
4 193.134.95.174 i79zhb-015-ae29.bb.ip-plus.net 2.414ms 2.546ms 1.894ms
5 138.187.130.38 i79tix-025-ae11.bb.ip-plus.net AS3303 2.71ms 2.667ms 2.933ms
6 193.5.122.133 as3303-as6453-zrh-001-ae3.ce.ip-plus.net 3.234ms 3.425ms 3.191ms
7 195.219.87.53 if-ae-33-2.tcore2.fnm-frankfurt.as6453.net AS6453 95.211ms 95.519ms 95.24ms
8 * 195.219.156.150 if-ae-29-2.tcore1.fnm-frankfurt.as6453.net AS6453 93.5ms *
9 195.219.194.149 if-ae-6-2.tcore1.av2-amsterdam.as6453.net AS6453 93.93ms 93.635ms 94.331ms
10 195.219.194.6 if-ae-2-2.tcore2.av2-amsterdam.as6453.net AS6453 92.898ms 93.224ms 94.912ms
11 80.231.131.160 if-ae-14-2.tcore2.l78-london.as6453.net AS6453 94.878ms 94.872ms 95.834ms
12 * * 80.231.131.2 if-ae-2-2.tcore1.l78-london.as6453.net AS6453 96.776ms
13 80.231.130.26 if-ae-15-2.tcore3.njy-newark.as6453.net AS6453 95.44ms 95.663ms 95.366ms
14 216.6.90.13 if-ae-0-2.tcore1.n0v-newyork.as6453.net AS6453 97.373ms 97.621ms 97.467ms
15 216.6.90.50 AS6453 92.277ms 92.916ms 92.325ms
16 169.150.194.93 vl212.nyc-cyx-dist-1.cdn77.com AS60068 95.039ms 94.411ms 94.461ms
17 * * *

Here’s one from AS5432:

Traceroute to gateway.storjshare.io (136.0.77.2), 48 byte packets
1 192.168.1.1 1.89ms 1.692ms 1.767ms
2 10.24.145.28 9.975ms 8.683ms 8.606ms
3 * * *
4 91.183.242.132 ae-74-100.ibrstr6.isp.proximus.be AS5432 12.909ms 12.021ms 11.634ms
5 195.219.227.150 ix-be-13.ecore1.b1d-brussels.as6453.net AS6453 11.943ms 13.07ms 12.809ms
6 * * *
7 * * 195.219.194.6 if-ae-2-2.tcore2.av2-amsterdam.as6453.net AS6453 98.953ms
8 80.231.131.160 if-ae-14-2.tcore2.l78-london.as6453.net AS6453 102.9ms 100.798ms 100.965ms
9 * * *
10 80.231.130.26 if-ae-15-2.tcore3.njy-newark.as6453.net AS6453 96.828ms 97.172ms 105.479ms
11 216.6.90.13 if-ae-0-2.tcore1.n0v-newyork.as6453.net AS6453 97.051ms 111.344ms 115.699ms
12 216.6.90.50 AS6453 114.821ms 110.99ms 115.112ms
13 169.150.194.93 vl212.nyc-cyx-dist-1.cdn77.com AS60068 115.684ms 415.609ms 116.844ms
14 * * *

From Amazon (AS16509) in US West 2:

Traceroute to gateway.storjshare.io (136.0.77.2), 48 byte packets
1 34.221.151.243 ec2-34-221-151-243.us-west-2.compute.amazonaws.com AS16509 5.551ms 6.35ms 6.334ms
2 100.65.20.32 18.126ms 11.756ms 11.882ms
3 100.66.9.10 14.363ms 4.215ms 2.322ms
4 100.66.11.64 15.891ms 72.116ms 13.396ms
5 241.0.10.202 0.458ms 0.35ms 0.301ms
6 242.0.65.197 6.53ms 1.049ms 0.687ms
7 240.0.168.0 20.118ms 20.091ms 20.1ms
8 242.2.27.67 20.499ms 21.134ms 20.511ms
9 15.230.88.88 20.252ms 20.948ms 21.051ms
10 99.83.119.125 20.027ms 20.542ms 20.246ms
11 193.251.133.3 ae304-0.ffttr7.frankfurt.opentransit.net AS5511 162.993ms 162.962ms 171.01ms
12 193.251.141.214 datacamp-13.gw.opentransit.net AS5511 161.527ms 161.453ms 161.533ms
13 169.150.194.49 vl215.fra-cyx2-dist-1.cdn77.com AS60068 164.055ms 436.453ms 164.047ms
14 * * *
3 Likes

Thank you for the info, I’ll pass it to the team.

These traces are helpful, thanks!

Looking at them, it appears some ISPs in Europe using Tata (AS6453) as a transit provider are sending traffic to our servers in New York, instead of Frankfurt. We should have a direct connection to Tata on the Frankfurt server, but it doesn’t appear to be working. We’re looking into that now.

The Amazon trace going from US West to Frankfurt is a known issue, which we are investigating as well.

8 Likes

The routes in Europe should now be fixed to go to the right place. How are they looking for you now?

2 Likes