I realize that the git branch has a ton of small files, all of which need to be split and transmitted. At the opposite end of the spectrum, I tested uploading the Debian stable netinst ISO (335M).
Results:
Sync average rate: 5.891 MBytes/s
No errors
My next test will be using this compiled rclone binary as the backend for my restic backup that’s already stored in Tardigrade, originally uploaded using restic over rclone over Storj S3 gateway.
I’ll update this thread with results of that test.
Hello @fmoledina. I work directly with @calebcase, who is building and maintaining the rclone fork. It would be great if you could tell us more about what you want to do with the rclone integration.
Currently we have a request rate limit that is quite low. That means if you have a lot of small files (like a git repo), you are going to hit the rate limit very quickly and see “Too Many Requests” errors. With 500 MB files, by contrast, we have seen people transfer at 20 Gbps.
We already have a fix that will be deployed in the next few days that will drastically increase the rate limit. I suggest you run the test again mid next week, and let us know the results.
Hello @super3. My main goal for rclone integration is for backups with restic. I’ve already tested uploading ~900GB of data using restic over rclone over S3 using the Storj S3 gateway. I’m now going to try out restic over rclone with the native Storj backend and see how that goes with respect to functionality and reliability. I’ll keep monitoring for updates on the rclone fork. Thanks.
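For anyone following along, the native remote is configured in rclone.conf roughly like this. The remote name, satellite address, and field values below are examples for my setup; the backend type may be spelled tardigrade in the fork, so check its docs:

```
[tardigrade]
type = tardigrade
satellite_address = us-central-1.tardigrade.io:7777
api_key = <your-api-key>
passphrase = <your-encryption-passphrase>
```

With that in place, rclone addresses the remote as tardigrade:bucket/path (or sj://bucket/path in the log output).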
[quote=“fmoledina, post:4, topic:5051”]
Hello @super3. My main goal for rclone integration is for backups with restic. I’ve already tested uploading ~900GB of data using restic over rclone over S3 using the Storj S3 gateway. I’m now going to try out restic over rclone with the native Storj backend and see how that goes with respect to functionality and reliability. I’ll keep monitoring for updates on the rclone fork. Thanks.
[/quote]

That is really cool. Keep us updated!
The upload has been in progress for a couple of days and is around 63% complete. See the restic output below:
[60:44:27] 62.95% 677166 files 750.839 GiB, total 919588 files 1.165 TiB, 0 errors ETA 35:44:53
From my rudimentary understanding, restic packs the original dataset into 4-5 MB pack files and then pushes those up to Storj via the rclone uplink connector. Here’s a sample output from that process:
The overall backup is averaging around 2 MB/s effective upload speed, which is a bit slower than I would like. I may end up relegating the Tardigrade backup to just my ‘most important’ files as a complement to my other off-site backup methods, which would reduce the data payload considerably.
I’m not familiar with restic, but does it allow you to change the block size so that uploaded files are closer to the 64 MB segment size Tardigrade uses? I think that would get you much better performance.
I can corroborate this. My first test was restic with the S3 gateway, and I was averaging 9-10 MB/s. I had a bunch of “too many requests” errors on the S3 gateway, BUT restic pushed through and retried uploads as required, resulting in an error-free upload at the end. I originally uploaded this entire ~900GB test dataset using that method.
I’m now testing whether restic over an rclone-S3 or rclone-native backend gives better results, in either speed or reliability. The Storj native rclone backend is certainly more reliable with restic (i.e. no “too many requests” errors), but it’s definitely slower.
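For reference, the restic-over-rclone invocation looks roughly like this. The binary path, remote name, and bucket are placeholders for my setup:

```shell
# Point restic's rclone backend at the forked rclone binary, using a
# native Storj remote named "tardigrade" defined in rclone.conf:
restic -o rclone.program="$HOME/bin/rclone" \
       -r rclone:tardigrade:restic \
       backup /path/to/data
```

The -r rclone:remote:path syntax and the rclone.program option are standard restic features; only the forked binary is swapped in.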
So my backup just finished. For posterity, here’s the final output:
Files: 919588 new, 0 changed, 0 unmodified
Dirs: 2 new, 0 changed, 0 unmodified
Added to the repo: 594.057 GiB
processed 919588 files, 1.165 TiB in 90:28:53
snapshot c439b3df saved
rclone: 2020/03/10 16:32:05 INFO : FS sj://restic: stat ./locks/300ce0b37081e0b23cc212874e6acbe796d830475ce3e37100f676a39a1feaa3
rclone: 2020/03/10 16:32:05 INFO : locks/300ce0b37081e0b23cc212874e6acbe796d830475ce3e37100f676a39a1feaa3: rm sj://restic/locks/300ce0b37081e0b23cc212874e6acbe796d830475ce3e37100f676a39a1feaa3
LOG (2020-03-10 16:32:05): backup FINISHED. Duration: 90h28m56s
Looks like the storj/rclone fork works decently well, although it is currently slower than S3 with this restic dataset. I’d be happy to continue testing as this rclone fork is updated.
This is really great feedback @fmoledina. There will be an update to rclone soonish, as it is being migrated to the new libuplink release candidate. This will be an excellent baseline to compare against.
I just got around to checking out the latest commit. I’ve built it, this time using backported go-1.14 since my installed go-1.10 was too old for one of the requirements.
I ran make rclone on the new commit and got a new rclone binary. I’m now deleting all the previously uploaded data and will do a fresh test with a new restic repo using the native rclone-storj backend.
Didn’t get a chance to post results from this test until now. Final results are as follows:
Files: 700741 new, 0 changed, 0 unmodified
Dirs: 2 new, 0 changed, 0 unmodified
Added to the repo: 557.870 GiB
processed 700741 files, 751.880 GiB in 44:43:01
snapshot a9b03012 saved
rclone: 2020/03/20 11:25:35 INFO : FS sj://restic: stat ./locks/5f926b3edc5f2f0bf11e0b5dd23662e156ce26266d2a96e7f3e3c7eae9be8d4e
rclone: 2020/03/20 11:25:35 INFO : locks/5f926b3edc5f2f0bf11e0b5dd23662e156ce26266d2a96e7f3e3c7eae9be8d4e: rm sj://restic/locks/5f926b3edc5f2f0bf11e0b5dd23662e156ce26266d2a96e7f3e3c7eae9be8d4e
LOG (2020-03-20 11:25:35): backup FINISHED. Duration: 44h43m4s
Comparing the actual uploaded amounts (594 GiB vs. 557 GiB), the newer version of the storj/rclone fork averaged approx. 3.5 MB/s with no change in reliability. It could be faster, but this is a definite improvement. Hopefully the storj/rclone fork gets mainlined soon! Thanks.
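As a rough sanity check, the effective rates can be recomputed from the two restic summaries above, using the “Added to the repo” figures as the bytes that actually went over the wire:

```shell
# GiB added / wall-clock duration, for both runs:
awk 'BEGIN {
  r1 = 594.057 * 1024 / (90*3600 + 28*60 + 53)   # first run,  90:28:53
  r2 = 557.870 * 1024 / (44*3600 + 43*60 +  1)   # second run, 44:43:01
  printf "run 1: %.1f MiB/s\nrun 2: %.1f MiB/s\n", r1, r2
}'
# prints:
# run 1: 1.9 MiB/s
# run 2: 3.5 MiB/s
```

That lines up with the ~2 MB/s observed on the first run and the ~3.5 MB/s quoted for the second.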
Rclone is now working for me with some workarounds. A new version was pushed over the weekend; the commit was amended, which is why the change is hard to see on GitHub.
At the moment we have 2 bugs with rclone / libuplink:

1. rclone hits the rate limit while syncing a local folder with a remote folder, and each rate-limit error causes rclone to overwrite an existing remote file. Workaround: --checkers 1
2. rclone kills my router with too many DNS requests. Workaround: --transfers 1
Plus a user error on my end: QNAP Qsync cuts off part of the modification timestamp. Workaround: --modify-window 2s. I also had to change the Qsync settings, because it has a default ignore list and was not keeping all of my folders in sync.
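Putting those workarounds together, my sync command currently looks something like this (local path and remote name are examples):

```shell
rclone sync /share/homes tardigrade:backup/homes \
  --checkers 1 \
  --transfers 1 \
  --modify-window 2s
```

--checkers, --transfers, and --modify-window are all standard rclone flags; here they are just dialed down to dodge the rate limit and the DNS flood.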