"List objects" with a delimiter is horrendously slow

I’m attempting to rclone copy a Backblaze B2 bucket to a Storj bucket. The bucket has about 160k objects.

What I’m finding is that listing objects with a delimiter (what both rclone copy and rclone lsd do under the hood) is excruciatingly slow. rclone lsd completes in a few seconds on Backblaze. On Storj it’s been 24 minutes now and it’s still not complete.
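For anyone who wants to reproduce this outside rclone, the same delimiter-based listing can be issued directly against the gateway with the AWS CLI; something like the following (the bucket name is a placeholder for mine):

$ aws s3api list-objects-v2 \
    --endpoint-url https://gateway.storjshare.io \
    --bucket my-bucket \
    --delimiter '/'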

I’ve tried using both the Storj-hosted S3 gateway and rclone’s native Storj support. Both take forever, but using native Storj support causes so much memory to be allocated by rclone that the OOM killer eventually steps in and kills rclone.

This is making me rethink using Storj at all.

Has anyone else experienced this? Is there some configuration I’m missing to make performance of a bucket of this size even remotely tolerable?

Update: I tested Backblaze and the operation took 10 seconds to list objects with a delimiter. On Storj I am approaching 1 hour and 40 minutes and it’s still not done, and the Storj bucket only has about 80% of the data.

When did it start happening?
Also, could you please try to list with uplink?

I started working on migrating the data two days ago. The more objects I added to the bucket, the slower this operation has gotten, so I’d say the behavior predates my use of Storj.

I’m testing with uplink now (uplink ls sj://...). After a few minutes there hasn’t been any output yet. I’ll let you know when the operation finishes.

rclone lsd has failed.

2024/12/25 02:12:31 ERROR : : error listing: RequestError: send request failed
caused by: Get "https://XXX.gateway.storjshare.io/?delimiter=%2F&encoding-type=url&list-type=2&max-keys=1000&prefix=XXX": read tcp XXX->136.0.77.2:443: read: connection reset by peer

uplink ls finished in 17 minutes and 20 seconds. Faster than the S3 gateway, but still way too slow for us to use in production, especially compared to Backblaze completing the same operation in 10 seconds.

I shared this issue with the team.

The listing is slow because prefixes are part of the object name, so by default they are encrypted too. To list them, either your libuplink (when you use a native Storj integration) or the libuplink inside Gateway-MT (when you use an S3-compatible integration) must decrypt the path of every single object before returning the list. This cannot be cached, otherwise your information could be compromised (it would live unencrypted in the cache).

The distributed and encrypted nature of Storj adds overhead to ordinary listing operations, especially when you use unterminated prefixes.
For example, comparing uplink ls sj://data/prefix with uplink ls sj://data/prefix/, the latter will be significantly faster, for the same reason: prefixes are encrypted because they are part of the object name (the object’s key). An unterminated prefix triggers an exhaustive listing, because before decryption you cannot know where the prefix ends. Terminated prefixes, even in encrypted form, still contain the delimiter, so the listing can be grouped by the encrypted prefix and is much faster.
E.g.:

$ uplink ls --encrypted sj://test
KIND    CREATED                SIZE       KEY
PRE                                       AgUC-F7GG6SoFWbjinkrarA224lJ0YusREDFkLcaCKjTFSEOBHU=/
PRE                                       Agiz0yDTBh9ws6h8AQESJjppyhPAcpqiMTDQM-L0rmyq4ZZ-wTRaGUVXQdlgvqF-/
OBJ     2024-02-17 09:52:50    192        AgxzNWNKoTYDoL5Qs6pO_HX24rL-ArCC1e471hEpqZ7U0UXhr9ylTVsofAyTlfhtM55etjXIBJagWWQ=

So, as you can see, the prefix terminated with a delimiter (/) has its own representation even in encrypted form: Agiz0yDTBh9ws6h8AQESJjppyhPAcpqiMTDQM-L0rmyq4ZZ-wTRaGUVXQdlgvqF-/
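To make the difference concrete, these are the two listing forms side by side (the bucket and prefix names are just placeholders):

# unterminated prefix: every object key must be returned and decrypted
# client-side before it can be matched against "prefix"
$ uplink ls sj://data/prefix

# terminated prefix: keys can be grouped by the encrypted prefix with the
# delimiter, so only the matching subtree is listed
$ uplink ls sj://data/prefix/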

To speed up the listing process, you can disable path encryption.
If you do not want encrypted paths (and want a more S3-compatible listing behavior), you can create an access grant with unencrypted object keys:

uplink access create --unencrypted-object-keys --import-as unencrypted-paths

this will create a root access grant named “unencrypted-paths” from the API key, satellite address, and the encryption passphrase you provide.

To generate a derived access grant for use with the native Storj integration in rclone, FileZilla, etc.:

uplink access restrict --readonly=false --access unencrypted-paths

it will print an access grant that does not encrypt/decrypt paths (the objects themselves remain encrypted regardless).
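As a rough sketch (the remote name and grant value below are placeholders), an rclone remote using this derived grant with the native integration could look like:

# ~/.config/rclone/rclone.conf
[storj-unencrypted]
type = storj
access_grant = <grant printed by "uplink access restrict">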

Then you can use this access grant to configure rclone (as above), or generate S3 credentials from it:

uplink access register unencrypted-paths

this will print S3 credentials carrying the permissions of the supplied access grant, including the setting not to encrypt/decrypt paths.
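The matching S3 remote pointing at Gateway-MT would then look roughly like this (remote name and credential values are placeholders):

# ~/.config/rclone/rclone.conf
[storj-s3-unencrypted]
type = s3
provider = Storj
access_key_id = <printed by "uplink access register">
secret_access_key = <printed by "uplink access register">
endpoint = https://gateway.storjshare.io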

Please note that access grants with path encryption enabled and with it disabled are mutually incompatible: using an access grant with unencrypted paths on buckets/prefixes written with encrypted paths (or vice versa) will give you digital noise.

Please also try to use terminated prefixes where possible.

Thanks for the info.

I am using terminated prefixes (listing “directories” with few sub-paths is indeed fast, but this directory contains the bulk of the 160k objects).

Unencrypted paths would indeed be fine for our use case (they don’t contain any sensitive information), so I will investigate using them instead – thanks for that pointer. I will let you know my results after some testing.

Thank you!

I tested a difference between terminated and unterminated prefixes:

$ time uplink ls --access us1-demo sj://data/tmp | wc -l
2

real    11m19.325s
user    0m0.612s
sys     0m0.166s

vs

$ time uplink ls --access us1-demo sj://data/tmp/ | wc -l
2

real    0m0.846s
user    0m0.027s
sys     0m0.033s

So, it’s even more significant than I thought.
This bucket contains 2601 root prefixes and more than 26k objects.

That looks promising for sure. If I can replicate that level of performance then I will feel a lot better about moving this data to Storj.

Oof, I was wrong; there are 265k objects:

$ time uplink ls --access us1-demo --encrypted --recursive sj://data | wc -l
265589
real    2m12.631s
user    0m5.060s
sys     0m3.184s

Can you elaborate on what this means in the context of this discussion?

The more objects in the bucket, the slower the list-objects operation becomes when unterminated prefixes are used.
I’m testing whether disabling path encryption helps speed up the process, or whether the real limitation is these unterminated prefixes.
Right now the team is on Winter Break, so I’m on my own for a while, thus I’m trying to collect as much information as possible.

I think we may have introduced a delay in object listing in the latest change (it was specifically meant to speed up listing for some edge cases).

Ah, I see. Hopefully the requests rclone makes during copy/sync use terminated prefixes.

So Backblaze B2 doesn’t offer encrypted paths?

That’s one way to speed things up. Not a good way. But definitely a way :wink:

I don’t know of any major S3-compatible vendor that does (other than Storj)… even S3 itself, to my knowledge, does not encrypt keys.

Neither S3 nor B2 require SSE either. You can store stuff entirely unencrypted, which is perfectly fine for certain kinds of data (e.g. already-encrypted repositories like restic backups).

Using unencrypted paths, I was able to get the rclone copy to succeed, but I ran it a second time (which should not copy any data, just verify that the Storj destination bucket has all of the files in the source) and it timed out after nearly 2 hours.

2024/12/25 17:48:14 ERROR : XXX/XXX: error reading destination directory: RequestError: send request failed
caused by: Get "https://XXX.gateway.storjshare.io/?delimiter=%2F&encoding-type=url&list-type=2&max-keys=1000&prefix=XXX%2FXXX%2F": net/http: timeout awaiting response headers

Note that the prefix is /-terminated; nevertheless, the operation takes forever and times out.

It’s worth noting that the rclone copy operation ran into errors before because I interrupted it and restarted it when it was about 75% done. I assume if I’d interrupted this operation at around the same point it would suffer the same problem.

Does it time out for the native integration too?
I ask because I did a sync of these 265,589 objects using the native integration, from the bucket with encrypted paths to the bucket with unencrypted paths, e.g.:

$ rclone copy -P --transfers 100 us1-demo:data us1-demo-unencrypted-paths:unencrypted-paths
Transferred:           37 KiB / 37 KiB, 100%, 114 B/s, ETA 0s
Checks:            265551 / 265551, 100%
Transferred:           37 / 37, 100%
Elapsed time:     25m31.5s

The initial sync took longer, about 40 minutes. So, yes, comparing the two lists is the most time-consuming operation.
It didn’t use a server-side copy, because these remotes are different.

It seems that disabling path encryption doesn’t speed up the process, unfortunately. Likely libuplink does the same work as with encrypted paths: it decrypts object keys to get the paths and filters them afterward. So we still need an optimization there.

Could you please try to add --checkers 100 to your rclone sync/rclone copy command?
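Something like this (the remote and bucket names here are placeholders for yours):

$ rclone copy -P --transfers 100 --checkers 100 b2:source-bucket storj-s3:dest-bucket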

Also, which satellite are you on?

It’s not part of the S3 specification; it’s a Storj feature, where even paths are encrypted.

US1 satellite.

--checkers 100 makes no difference. The operation failed due to timeout after 1 hour and 40 minutes.

Thanks for the update!
I would say that using S3 is significantly slower than native.
I started the sync again, but with S3. Four hours have passed and it’s still in progress.
With native it took 40 minutes.
So I would assume that listing through the S3 integration is the worst experience so far. I passed this info to the team as well.

We are working on improving the listing.
