Rasp Pi 4 - Existing HDD transfer to new HDD on same device (Migrate Rsync)

Alexey · August 26, 2020, 6:15am

I hope you started the node from the original, not from the clone. Also, why did you restart it?

dragonhogan · August 26, 2020, 10:22am

That’s correct. I started the node with the original HDD.

To clarify, I restarted in two ways. As I stated, after the initial rsync pass completed, I vacuumed my dbs, since it had been about a month since I had last done that; this leaves the node stopped (in my script). Then I ran an update on the RPi, and it’s just an old habit of mine to restart the RPi after I run an upgrade, to give me peace of mind that it was successful. Then, like I said, I restarted the storj docker container on the original HDD and started the rsync process for the second pass.

Alexey · August 26, 2020, 6:25pm

Do you know that rsync will transfer those databases again?

dragonhogan · August 26, 2020, 6:40pm

Is it an issue that I vacuumed the dbs between rsync pass 1 and 2? I did see that the dbs were transferred again… doesn’t the rsync process just look for new/changed files between the original and the clone after the first pass, and write/re-write those files?

Alexey · August 26, 2020, 6:56pm

Yes, it will. So you are aware that your second run will take longer because of that, good.
I would suggest vacuuming before the first run, or after the migration to the new device has finished, but not in the middle of the process.

dragonhogan · August 26, 2020, 8:19pm

I guess I don’t understand why it would take longer on this second pass due to the db vacuuming. In fact, it does appear to be going quicker. It’s already about 25% done after 24 hours. The first pass took 11 days…

Alexey · August 26, 2020, 8:56pm

This is good to know, thank you!

dragonhogan · August 27, 2020, 1:57pm

Quick question for you: after reading the 1.11 change log, is there any concern with the changes to the orders DB coming in the update while the rsync process is running? I’m thinking it may serve me well to turn off watchtower so that I can complete the rsync process before updating the node to v1.11.

Although, if you think this shouldn’t be an issue, I won’t do that.

Alexey · August 27, 2020, 6:19pm

It could make the third run longer. It may be worth removing watchtower for a while. However, the docker image should not be available for at least a week.

dragonhogan · August 28, 2020, 12:50pm

Appreciate the response. I decided to go ahead and stop watchtower for the time being.

Also, FYI the second pass of rsync finished in a little less than 3 days, so it doesn’t appear that vacuuming the dbs between pass 1 and 2 impacted the rsync process much if at all.

dragonhogan · August 30, 2020, 4:41pm

Last update:

Last “normal” rsync pass (#4) completed late last night and it completed in ~12 hours, so I figured that was probably good to go ahead and start the final “delete” rsync pass with the docker container stopped and removed. That just finished up this morning, and I just started up the docker container with the updated identity and storage location paths. Node started right up and is off and running.
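For readers following along, the sequence described above (repeated passes while the node runs, then a final "delete" pass with the container stopped and removed) might be sketched roughly as below. This is only an illustration: the paths, the container name `storagenode`, and the 300-second stop timeout are assumptions, not values from this thread. By default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical mount points; substitute your own. The trailing slashes
# make rsync copy the directory contents rather than the directory itself.
SRC="${SRC:-/mnt/old-hdd/storagenode/}"
DST="${DST:-/mnt/new-hdd/storagenode/}"
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# A "normal" pass: the node keeps running on the original disk.
# Repeat until a pass finishes quickly enough for your liking.
sync_pass() {
    run rsync -aP "$SRC" "$DST"
}

# The final pass: stop and remove the container (name assumed), then
# sync once more with --delete so files removed on the source since the
# earlier passes are also removed from the destination.
final_pass() {
    run docker stop -t 300 storagenode
    run docker rm storagenode
    run rsync -aP --delete "$SRC" "$DST"
}

sync_pass
final_pass
```

After the final pass, the container is started again with the identity and storage paths pointing at the new disk, as described in the post.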

Process worked great, and it took about 2 weeks to complete for my node that has about 8.8TB of data. As stated before, it was a little longer than I had originally expected, but it worked. Storj documentation was spot on and super helpful.

node1 · September 22, 2020, 12:04pm

I have done a few rsyncs from an RPi 4, but both of them were RPi (source) → PC (destination) over the internet. That configuration was much faster than RPi (source) to the same RPi (destination).

Migrating 2TB of data took more than 12 hours on the first run, and in this case the 2nd and 3rd runs took about the same time. That’s not a problem for the first or second run, but for the last one the node has to be stopped.

I wonder how this long stop (for the last sync) will affect the node’s reputation?

Next time I have to migrate the node to another disk, I probably won’t use the RPi. I believe it’s much faster to connect the drives (for the last sync) to a more powerful device than the RPi.

Another thought: I’m using USB3 → USB3 on the RPi4. Maybe it would be faster to use USB2 → USB3 or USB3 → USB2, if those run on different controllers?

Pac · September 23, 2020, 7:50am

I’m not sure I get why this stop would have to last a long time.
I never had to stop a node for a long time in my rsync migrations.
The first rsync may take 12 hours, but the second would take 1 hour, then the third something like 5 minutes, then you stop the node, run a final rsync that takes less than a minute and your node is ready to be restarted on its new destination device… ?

node1 · September 23, 2020, 7:55am

The situation you described is what I saw when copying RPi → PC: each rsync was faster than the previous one. But the last time I ran rsync within the RPi, from one drive to the other, and it only got faster by about half. The first pass took about 12 hours and the last about 5.5 hours (with 0 files transferred). I have no idea why, but this was the first time I’d run rsync within the RPi, and it was extremely slow.

Alexey · September 23, 2020, 7:58am

Because the RPi uses a single USB channel for everything: HDD, Ethernet, devices…

node1 · September 23, 2020, 8:26am

I knew this from the RPi3, but hadn’t yet analyzed the RPi4 from this perspective. This rsync suggests that the RPi4 behaves the same way as the RPi3.

Pac · September 23, 2020, 12:41pm

Aaah right, okay I see.
Well when there are millions of files, listing them all is horrendous on some disks.

Listing all files (1.4 million files) on my 2.5" 2TB disk (with ncdu) takes 1 hour and 36 minutes:

pi@raspberrypi:/.../storj/mounts/disk_1 $ time ncdu -o ~/mount1-listing.txt
/.../storj/mounts/disk_1/...orage/temp/blob-313017082.partial  1416409 files

real    96m44.512s
user    0m20.184s
sys     3m55.404s

=> 244 files/sec.
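
For anyone wanting to check that figure, it is just the file count divided by the wall-clock ("real") time. A small sketch using awk, where the helper name `files_per_sec` is made up for illustration:

```shell
# files_per_sec FILES MINUTES SECONDS
# Prints the listing rate rounded to a whole number of files per second.
files_per_sec() {
    awk -v files="$1" -v min="$2" -v sec="$3" \
        'BEGIN { printf "%.0f\n", files / (min * 60 + sec) }'
}

# Figures from the ncdu run above: 1416409 files, real 96m44.512s.
files_per_sec 1416409 96 44.512
```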

But after the first run, rsync does things way faster (firstly because there’s way less data to copy, but it obviously does not struggle that much for listing files again).
Maybe there’s enough RAM to keep all metadata in memory when rsyncing from the RPi to another external machine, that would explain why it is faster when running subsequent rsync.
And maybe there’s not enough RAM when it has to keep track of twice as much metadata when both disks are on the same RPi…

Just my two cents, but I feel like it might not be it

LinuxNet · October 23, 2020, 6:48pm

Just for info:
After 6 days and 10 hours I finally copied 7TB from a USB 3.0 to a bigger USB 3.0 hard drive.
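
To put the durations in this thread on a common scale, the average copy rate can be estimated from the reported size and elapsed time. The sketch below assumes decimal terabytes (1 TB = 1,000,000 MB) and treats "about 2 weeks" as exactly 14 days; the helper name `mb_per_sec` is hypothetical.

```shell
# mb_per_sec TERABYTES DAYS HOURS
# Prints the average rate in MB/s, assuming 1 TB = 1,000,000 MB.
mb_per_sec() {
    awk -v tb="$1" -v days="$2" -v hours="$3" \
        'BEGIN { printf "%.1f\n", tb * 1000000 / ((days * 24 + hours) * 3600) }'
}

mb_per_sec 7 6 10     # 7TB in 6 days and 10 hours
mb_per_sec 8.8 14 0   # 8.8TB in about 2 weeks (reported earlier in the thread)
```

Under those assumptions this works out to roughly 12.6 MB/s and 7.3 MB/s. The second figure covers all rsync passes plus the pauses between them, so it is an overall average rather than a raw copy speed.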