Testing of the storj/rclone fork

I’m interested in rclone working natively over the storj uplink. I tested the feature/storj branch on the storj fork of the rclone project.

Test setup:

  • Fresh Ubuntu 18.04 LXC container
  • Install Go: apt install golang. (Results in go version go1.10.4 linux/amd64)
  • Clone feature/storj branch into ~/go/src/github.com/rclone/rclone
  • Run make rclone.

This puts a compiled rclone binary in ~/go/bin/rclone.

Rclone setup: run ~/go/bin/rclone config

  • New remote named tardigrade
  • 30 - Storj
  • scope> Grab the output of accesses.default from ~/.local/share/storj/uplink/config.yaml
  • skip-peer-ca-whitelist> Kept default: false
  • defaults> Kept default: release
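
If the config completes successfully, the remote stored in ~/.config/rclone/rclone.conf should look roughly like the following. This is a sketch based on the prompts above; the exact type string and key names depend on the fork's backend definition, and the scope value is the serialized access grant:

```ini
[tardigrade]
type = storj
scope = <output of accesses.default from ~/.local/share/storj/uplink/config.yaml>
skip-peer-ca-whitelist = false
defaults = release
```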

Then I tested a transfer by using rclone to sync the git repo to Tardigrade US Central:

  • ~/go/bin/rclone mkdir tardigrade:rclone-uplink
  • ~/go/bin/rclone sync -P -L ~/go/src/github.com/rclone/rclone tardigrade:rclone-uplink

Results:

  • Sync average rate: 89.281 kBytes/s. The 509 MB transfer took 1h37m15.5s.
  • Many errors with the message: Failed to copy: metainfo error: Too Many Requests
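
As a sanity check (an assumption on my part: rclone's "MB" and "kBytes" here are binary units, i.e. MiB and KiB), the totals are consistent:

```shell
# 509 MiB transferred in 1h37m15.5s, expressed in KiB/s.
# Comes out to ~89.3, in line with the reported 89.281 kBytes/s.
awk 'BEGIN { printf "%.1f\n", 509 * 1048576 / (97*60 + 15.5) / 1024 }'
```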

Is there any way to tune rclone with this storj backend to improve the performance and reliability? I’m happy to test and report back on my findings.

Looking forward to input from the community. Thanks!

I realize that the git branch has a ton of small files, all of which need to be split and transmitted. On the opposite end of the spectrum, I tested uploading the Debian stable netinst ISO (335 MB).

Results:

  • Sync average rate: 5.891 MBytes/s
  • No errors

My next test will be using this compiled rclone binary as the backend for my restic backup that’s already stored in Tardigrade, originally uploaded using restic over rclone over the Storj S3 gateway.

I’ll update this thread with results of that test.

Hello @fmoledina. I work directly with @calebcase, who is building and maintaining the rclone fork. It would be great if you could tell us more about what you want to do with the rclone integration.

Currently we have a request rate limit that is quite low. That means if you have a lot of small files (like a git repo) you are going to hit the rate limit very quickly and see “Too Many Requests”, whereas with 500 MB files we have seen people transfer at 20 Gbps.

We already have a fix that will be deployed in the next few days and will drastically increase the rate limit. I suggest you run the test again mid next week and let us know the results.

Hello @super3. My main goal for rclone integration is for backups with restic. I’ve already tested uploading ~900GB of data using restic over rclone over S3 using the Storj S3 gateway. I’m now going to try out restic over rclone with the native Storj backend and see how that goes with respect to functionality and reliability. I’ll keep monitoring for updates on the rclone fork. Thanks.

[quote=“fmoledina, post:4, topic:5051”]
Hello @super3. My main goal for rclone integration is for backups with restic. I’ve already tested uploading ~900GB of data using restic over rclone over S3 using the Storj S3 gateway. I’m now going to try out restic over rclone with the native Storj backend and see how that goes with respect to functionality and reliability. I’ll keep monitoring for updates on the rclone fork. Thanks.
[/quote]

That is really cool. Keep us updated!

Has the rate limit fix been applied yet out of interest?

The upload has been in progress for a couple days, and the progress is around 63%. See the below restic output:

[60:44:27] 62.95%  677166 files 750.839 GiB, total 919588 files 1.165 TiB, 0 errors ETA 35:44:53
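
If restic’s ETA is a simple linear extrapolation from the fraction processed so far (an assumption on my part, not necessarily restic’s actual algorithm), the numbers above hang together:

```shell
# 62.95% done after 60h44m27s -> projected time remaining, in hours.
# Should land near the reported ETA of 35:44:53 (~35.7 h).
awk 'BEGIN {
  elapsed = 60*3600 + 44*60 + 27   # seconds elapsed so far
  total   = elapsed / 0.6295       # projected total runtime
  printf "%.1f\n", (total - elapsed) / 3600
}'
```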

From my rudimentary understanding, restic will create 4-5 MB block files from the original dataset and then push those up to Storj via the rclone uplink connector. Here’s a sample of the output from that process:

rclone: Transferred:      394.640G / 394.663 GBytes, 100%, 1.958 MBytes/s, ETA 12s
rclone: Transferred:        85055 / 85060, 100%
rclone: Elapsed time:  57h19m32.7s
rclone: Transferring:
rclone:  * data/16/1678e6995dde3b…6b4c895c4396e21bd8ffec:  0% /4.298M, 31.995k/s, 2m16s
rclone:  * data/1c/1ca10817ccd59b…ee734bee390134f9b2aa34:  0% /4.948M, 15.997k/s, 5m14s
rclone:  * data/22/22cb1791728640…dc71c4b9a510f87f39fa08:  0% /5.223M, 0/s, -
rclone:  * data/4f/4febd2038f7ce6…88e49ae71d7b82fa35ba4a:  0% /5.277M, 31.991k/s, 2m47s
rclone:  * data/7f/7f26f1060c87ed…4cf3ce35cdba6412742689:  0% /4.654M, 31.994k/s, 2m27s
rclone:
rclone: 2020/03/09 10:47:19 INFO  : FS sj://restic: cp input ./data/53/531b75da1004aa8a20c7f648848f79dcb73f34691bff592b15df6bfe3f0f8f43 # [] 5573774
rclone: 2020/03/09 10:47:19 INFO  : FS sj://restic: cp input ./data/43/434f09e8377fdeb4e0cc9b3318601c316371bab9e0b23b05c41354bb5b997f7f # [] 5045999
rclone: 2020/03/09 10:47:19 INFO  : FS sj://restic: cp input ./data/29/29385510f60940f67e1414b81bad2cf2af02a815aa95cd3490825d627d20a6b1 # [] 8971375
rclone: 2020/03/09 10:47:19 INFO  : FS sj://restic: cp input ./data/5c/5c90d4b9cfbd81f9a3b4b6431733ceb097100bd0068bb9a15ae7c3e1eb6ba68d # [] 5259298
rclone: 2020/03/09 10:47:19 INFO  : FS sj://restic: cp input ./data/61/613ecfc48d8f17857ca969b8d859ce9462823d51761998a10ff2a9db472665e8 # [] 4850564
rclone: 2020/03/09 10:47:32 INFO  : FS sj://restic: cp input ./data/28/2803e3449229f83d3c237333f69c3eb2f93838a056ce817cc98c714305913827 # [] 6642878
rclone: 2020/03/09 10:47:32 INFO  : FS sj://restic: cp input ./data/91/91529eb2c10bd5a846d5e89b2e5b3efc7a84683b1f97c864675562f37766a1ca # [] 6140263
rclone: 2020/03/09 10:47:32 INFO  : FS sj://restic: cp input ./data/be/be2618614b069e639bbdbf1012549b13ef72269585e0ffe51f28f7cef82f1d84 # [] 5264634
rclone: 2020/03/09 10:47:32 INFO  : FS sj://restic: cp input ./data/47/470c5fd5946eb4e968a807d56bccacfde7b9249f6deb0ebb15c074a5d0c3e93e # [] 5003058
rclone: 2020/03/09 10:47:33 INFO  : FS sj://restic: cp input ./data/29/294e121c5a9c583bd0f542321de3ccda52a1b0b46b2b2ca2dc00d74f346c0aa9 # [] 4917522
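
The trailing number on each cp input line appears to be the object size in bytes. A quick awk over a couple of those lines (hashes abbreviated, sample lines taken from the log above) converts them to MiB and roughly matches the small block sizes restic produces:

```shell
# The last whitespace-separated field of each log line is bytes.
printf '%s\n' \
  'cp input ./data/53/531b... # [] 5573774' \
  'cp input ./data/29/293855... # [] 8971375' |
awk '{ printf "%.2f MiB\n", $NF / 1048576 }'
# -> 5.32 MiB
# -> 8.56 MiB
```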

The overall backup is averaging around 2 MB/s effective upload speed, which is a bit slower than I would like. I may end up relegating the Tardigrade backup to just my ‘most important’ files as a complement to my other off-site backup methods, which would reduce the data payload considerably.

I’m not familiar with restic, but does it allow you to change the block size so that the uploaded files are closer to the 64 MB segment size Tardigrade uses? I think that would get you much better performance.

I used restic with the S3 gateway and it uploaded 60 GB in a couple of hours.

I can corroborate this. My first test was restic with the S3 gateway, and I was averaging 9-10 MB/s. I had a bunch of Too Many Requests errors on the S3 gateway, but restic pushed through and retried uploads as required, resulting in an error-free upload at the end. I originally uploaded this entire ~900GB test dataset using that method.

I’m now testing to see whether restic over an rclone-S3 or rclone-native backend performs better in terms of speed or reliability. The Storj native rclone backend is certainly more reliable with restic (i.e. no Too Many Requests errors), but it’s definitely slower.

The rate limit has been slowly increased over the last 5 days. It is now ~10x what it used to be.

The delete rate limit probably still needs a fix on the satellites, I guess.

So my backup just finished. For posterity, here’s the final output:

Files:       919588 new,     0 changed,     0 unmodified
Dirs:            2 new,     0 changed,     0 unmodified
Added to the repo: 594.057 GiB

processed 919588 files, 1.165 TiB in 90:28:53
snapshot c439b3df saved
rclone: 2020/03/10 16:32:05 INFO  : FS sj://restic: stat ./locks/300ce0b37081e0b23cc212874e6acbe796d830475ce3e37100f676a39a1feaa3
rclone: 2020/03/10 16:32:05 INFO  : locks/300ce0b37081e0b23cc212874e6acbe796d830475ce3e37100f676a39a1feaa3: rm sj://restic/locks/300ce0b37081e0b23cc212874e6acbe796d830475ce3e37100f676a39a1feaa3
LOG (2020-03-10 16:32:05): backup FINISHED.  Duration: 90h28m56s

Looks like the storj/rclone fork works decently well, although it is currently slower than S3 with this restic dataset. I’d be happy to continue testing as this rclone fork is updated.

This is really great feedback @fmoledina. There will be an update to rclone soonish as it is being migrated to the new libuplink release candidate. This will be an excellent baseline to compare against.

@calebcase, that’s exciting! I’ll look out for it and do the full dataset upload test again with that.

I just got around to checking out the latest commit. I’ve built it, this time using a backported go-1.14 since my installed go-1.10 was too old for one of the dependencies.

On my system, Ubuntu 18.04.4 LTS Server:

sudo add-apt-repository ppa:longsleep/golang-backports
sudo apt update
sudo apt install golang-go

I’ve run make rclone on the new commit and got a new rclone binary. I’m now working to delete all the previously uploaded data and will do a fresh test with a new restic repo using the native rclone-storj backend.

I’ll update this thread with my findings.

Didn’t get a chance to post results from this test until now. Final results are as follows:

Files:       700741 new,     0 changed,     0 unmodified
Dirs:            2 new,     0 changed,     0 unmodified
Added to the repo: 557.870 GiB

processed 700741 files, 751.880 GiB in 44:43:01
snapshot a9b03012 saved
rclone: 2020/03/20 11:25:35 INFO  : FS sj://restic: stat ./locks/5f926b3edc5f2f0bf11e0b5dd23662e156ce26266d2a96e7f3e3c7eae9be8d4e
rclone: 2020/03/20 11:25:35 INFO  : locks/5f926b3edc5f2f0bf11e0b5dd23662e156ce26266d2a96e7f3e3c7eae9be8d4e: rm sj://restic/locks/5f926b3edc5f2f0bf11e0b5dd23662e156ce26266d2a96e7f3e3c7eae9be8d4e
LOG (2020-03-20 11:25:35): backup FINISHED.  Duration: 44h43m4s
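
Dividing the added repo size by the wall-clock duration gives the average upload rate (treating the sizes as binary units, so MiB/s):

```shell
# 557.870 GiB added in 44h43m4s -> average upload rate in MiB/s.
awk 'BEGIN { printf "%.2f\n", 557.870 * 1024 / (44*3600 + 43*60 + 4) }'
```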

Looking at the actual uploaded amounts (594.057 GiB before vs. 557.870 GiB now), the newer version of the storj/rclone fork averaged approximately 3.5 MB/s, with no change in reliability. It could be faster, but this is a definite improvement. Hopefully the storj/rclone fork gets mainlined soon! Thanks.

It’s being worked on as we speak.

Rclone is now working for me with some workarounds. A new version was pushed over the weekend. The commit was amended, which is why the change is hard to see on GitHub.

At the moment we have two bugs with rclone / libuplink:

  1. rclone is hitting the rate limit while syncing a local folder with a remote folder. Each rate-limit error causes rclone to overwrite an existing remote file. Workaround: --checkers 1
  2. rclone is killing my router with too many DNS requests. Workaround: --transfers 1

Plus a user error on my end: QNAP Qsync cuts off part of the modification timestamp (workaround: --modify-window 2s), and I had to change the settings of Qsync because it has a default ignore list and was not keeping all of my folders in sync.

Now I have an encrypted remote backup 🙂

That is great to hear. We are working on including rclone in the QNAP app we are building. Have you tried it?