Relationship between S3 operations and uplink

My question is rather basic and primitive.

I initially uploaded a snapshot of a 30TB data set with uplink. However, the source data set changes over time. It appears that uplink cp uploads a fresh copy every time, regardless of whether the data object has changed.

Ergo, I switched to S3 access credentials and rclone version 1.66 (the binary with StorJ support, from the StorJ home pages).

My question - in terms of retrieval (or onboarding), does it make a difference whether I use uplink or an S3 client?

  • John “S”

On a side note, rclone with the “sync” option now crashes with an out-of-memory condition every single time when used to download. It happens even with --dry-run and a --parallel of “1”, as soon as it begins indexing files that have not been downloaded.

Hey John!

uplink cp just unconditionally copies, as you noticed, and doesn’t attempt to determine whether the source is already at the destination. Using rclone does give you this behavior (both rclone copy and rclone sync check whether the file already exists at the destination).
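To make the difference concrete, here is a minimal sketch of the kind of check a sync-style tool performs before copying. It is not rclone's actual implementation; it only assumes that each side can be listed as a path with a size and a modification time. The `Entry` record, the function names and the sample paths are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical listing record: size in bytes, mtime in Unix seconds."""
    size: int
    mtime: float


def plan_transfers(source, dest, mtime_tolerance=1.0):
    """Return the sorted keys a sync-style tool would copy.

    A key is copied when it is missing at the destination, or when its
    size or modification time differs from the source entry.
    """
    to_copy = []
    for key, src in source.items():
        dst = dest.get(key)
        if dst is None:
            to_copy.append(key)      # not at the destination yet
        elif dst.size != src.size:
            to_copy.append(key)      # partial or changed object
        elif abs(dst.mtime - src.mtime) > mtime_tolerance:
            to_copy.append(key)      # same size, but modified since
    return sorted(to_copy)


def plan_unconditional(source):
    """What an unconditional copy does: every key, every time."""
    return sorted(source)


# Made-up sample listings for illustration only.
source = {
    "blocks/0001.dat": Entry(1024, 1000.0),
    "blocks/0002.dat": Entry(2048, 2000.0),
    "blocks/0003.dat": Entry(4096, 3000.0),
}
dest = {
    "blocks/0001.dat": Entry(1024, 1000.0),  # identical, skipped
    "blocks/0002.dat": Entry(1000, 2000.0),  # size differs, copied again
}

print(plan_transfers(source, dest))   # ['blocks/0002.dat', 'blocks/0003.dat']
print(plan_unconditional(source))     # all three keys
```

With an unconditional copy the first file is transferred again even though nothing changed, which is the behavior described for uplink cp.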

I believe (at least at one point) rclone had support for both native direct Storj network communication and also S3 protocol support (it may be listed under Tardigrade, an old brand of ours). I’m not sure which exists in rclone today.

If you use native communication (the uplink CLI, for instance, or rclone’s native Storj mode), then you are talking directly to the storage nodes. Your uploads can be much faster in the sense that you are eliminating a hop, but it can require some tuning to get there. This is also the right way to get end-to-end encryption. On the other hand, native uploads have an expansion factor: every 1 TB uploaded this way will cost you ~2.7 TB of egress from your own network.

If you use the S3 protocol route, you get S3-protocol-standard server-side encryption instead, and your client no longer talks directly to storage nodes. In exchange, your egress use scales back to the 1 TB you’re uploading, since the S3 gateway is now the origination point of that expansion factor.

We probably have a little table somewhere comparing the pros and cons of the two, but in summary:

Native/uplink: end-to-end encryption, direct node communication (fewer hops), 2.7x egress for uploads, perhaps a need for some use-case-specific tuning.
Gateway/S3: server-side encryption, 1x egress for uploads, using the Storj-tuned gateway.
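As a rough worked example of the egress difference, the sketch below applies the ~2.7x expansion factor quoted above to the 30 TB data set from the original question. The function name is hypothetical, and the factor is only the approximate figure from this thread, not a billing formula.

```python
EXPANSION_FACTOR = 2.7  # approximate figure quoted in this thread


def upload_bandwidth_tb(data_tb, native):
    """Approximate TB leaving the client's network when uploading data_tb."""
    return data_tb * EXPANSION_FACTOR if native else data_tb


for label, native in (("native/uplink", True), ("gateway/S3", False)):
    sent = upload_bandwidth_tb(30, native)
    print(f"{label}: ~{sent:.0f} TB sent for a 30 TB data set")
```

Under these assumptions the native route sends roughly 81 TB from the client and the gateway route roughly 30 TB, with the gateway doing the expansion on its side.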

Hope this helps and that you are well!

I’m quite sure that I’m using the rclone native StorJ support; when setting up a new endpoint with rclone config, it’s something like option 44. The default Ubuntu 22.04 LTS version of rclone does not support StorJ directly, but upstream version 1.66 does.

If I upload with uplink, can I download with S3 (and vice versa)?

Yes you can! As long as the access grant you use with uplink and the access key you use with S3 have been constructed with the same passphrase, it should work great.

I’ll just add some notes here. I’ve been using StorJ to distribute multi-terabyte full blockchain copies to and from various continents.

uplink works absolutely wonderfully and flawlessly IF one waits until the complete data set is on the StorJ network. It is unaware of changes to data sets, blissfully overwrites objects and files that are already complete, and is unable to effect any form of “delta/diff” transfer. Ergo, uplink is most definitely the way to go if you have a static, relatively immutable data set AND you can wait until the data set is fully uploaded before beginning an ‘uplink’-based download.

Unfortunately, I did not wait until the upload was complete before starting to download it to various endpoints. If using ‘uplink sync’, this means that fully complete data files are overwritten, regardless of their state.

I therefore transitioned to using rclone, with both native StorJ and generic S3 configs. The objective was to regularly refresh the download destinations as the upstream StorJ bucket was added to.

Unfortunately, under every scenario, rclone has a very nasty OOM bug when there are LOTS of files. In my case, the source file count is > 65,000.

The rclone bug appears in every release repo of rclone I’ve tested (~ version 1.6) as well as the latest upstream v1.66. It also appears regardless of whether the --dry-run flag is present. The mitigation suggested in 2018, adding a sufficiently long --attr-timeout (in seconds), appears to be no longer supported.

I then pivoted to using Minio’s ‘mc’ client to pull from StorJ. This appears to mostly work fine for both full runs and --dry-run with regards to memory and OOM issues. However, when pulling from a StorJ S3 bucket, I always end up with an unexpected EOF error on an object stream:

Below is typical of the random, but always present, failures when I mc mirror my StorJ archive. Note I had to paste a screenshot, as the text was interpreted as containing too many links (more than 2).

I see. You may reduce the rclone memory footprint by reducing the parallelism, but that will likely affect the speed.
Hm. You may try to use the xargs command to run rclone in parallel, with a reduced number of threads/parallelism for each instance, because right now we do not have an option in the uplink CLI to check the destination: it will always upload, regardless of whether the object has changed or not…
Hm… That wouldn’t work either, I guess, if the source is the same for each instance…
I passed your question to the team.

The only other thing I can think of is to use a backup tool instead, like Duplicacy, restic or HashBackup. They can do hashed snapshots (uploading only the differences) and reduce your storage costs (because of bigger packs); this should also increase the speed of recovery.

Adding some more notes…

The rclone OOM issue appears to plague high-core-count EPYC processors the most. I have a bunch of older (Xeon gen 1 & 2) boxes with 8 cores or fewer, and the problem does not occur on them.

Doing a web search for “rclone out of memory oom bug” yields a ton of similar results going back to 2018, most relevantly:

But again recently documented here:

What has worked extremely well is MinIO’s ‘mc’.

It would be great if uplink supported delta sync/mirror operations natively!

Rclone should work great for this. I’m happy to do a call if you want to run a lab @Suykerbuyk

It sounds like your process might benefit from the use of rclone sync vs rclone copy.

Example command
rclone sync --progress --checkers 100 --fast-list --disable-http2 --transfers 64 --dry-run /local/path mount:bucket

--progress real-time transfer statistics
--checkers default is 8, scale up to improve checking throughput
--fast-list can improve listing speed by being recursive, drop this if you have memory issues
--disable-http2 a must as it improves performance substantially
--transfers 64 files transferred in parallel; memory usage will be this number multiplied by the file size, up to 64 MB per file (64x64=4096 MB). Normally 64 is enough to be “really fast” when moving a bunch of smaller files, but if you have the resources you can try 96 and 128. If the files being bulk uploaded (synced) are large, don’t use such a high transfers value, as the default --s3-upload-concurrency is 4 (64x64)
--dry-run for testing, does not make any changes
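For hosts that run out of memory, the flag notes above suggest an obvious experiment: fewer transfers and checkers, and no --fast-list. The sketch below only assembles the command line and applies the "transfers multiplied by up to 64 MB" rule of thumb from this post. It uses the same flags as the example command; the helper names and the lower values (8 transfers, 16 checkers) are hypothetical starting points, not tested recommendations.

```python
import shlex


def transfer_buffer_mb(transfers, chunk_mb=64):
    """Rule of thumb from the post: transfers multiplied by up to 64 MB each."""
    return transfers * chunk_mb


def build_sync_command(source, dest, transfers, checkers,
                       fast_list=True, dry_run=True):
    """Assemble an rclone sync argv using only the flags listed above."""
    argv = ["rclone", "sync", "--progress",
            "--checkers", str(checkers),
            "--transfers", str(transfers),
            "--disable-http2"]
    if fast_list:
        argv.append("--fast-list")   # faster listing, more memory
    if dry_run:
        argv.append("--dry-run")     # test only, no changes made
    argv += [source, dest]
    return argv


# Same settings as the example command above.
fast = build_sync_command("/local/path", "mount:bucket",
                          transfers=64, checkers=100)
# Hypothetical lower-memory variant for experimentation.
lean = build_sync_command("/local/path", "mount:bucket",
                          transfers=8, checkers=16, fast_list=False)

print(transfer_buffer_mb(64))   # 4096
print(transfer_buffer_mb(8))    # 512
print(shlex.join(lean))
```

The printed command can be pasted into a shell, or the list passed to subprocess.run, once the dry run looks right.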

Native vs Hosted S3
As for native vs hosted S3, rclone supports both options. My advice is to use hosted S3 to start and experiment with native after you are successful. Load will be a lot higher with native.

Hosted S3
As of version 1.61.1

5 / Amazon S3 Compliant Storage Providers including AWS, Alibaba, Ceph, China Mobile, Cloudflare, ArvanCloud, DigitalOcean, Dreamhost, Huawei OBS, IBM COS, IDrive e2, IONOS Cloud, Liara, Lyve Cloud, Minio, Netease, RackCorp, Scaleway, SeaweedFS, StackPath, Storj, Tencent COS, Qiniu and Wasabi

Then…

21 / Storj (S3 Compatible Gateway)
\ (Storj)

Native
As of version 1.61.1

41 / Storj Decentralized Cloud Storage
\ (storj)

Thanks, Dominick!

I agree. I tried multiple permutations of sync. More often than not in the past, the OOM was “solved” with an --attr-timeout flag of 60 seconds or more. However, this seems to be missing in all the v1.6 versions I’ve tested.

I had not played with the flags of: --fast-list and --checkers. I suspect that will help and will resume testing again this afternoon.
