Rasp Pi 4 - Existing HDD transfer to new HDD on same device (Migrate Rsync)

Afternoon all. I have a question regarding the migration process; I also have some documentation I can add in a later post to help further explain my issue, depending on the response(s).

I have gone through the migration process using rsync. I believe I have done it correctly; however, when starting the node from the new hard drive (with the existing 600GB of data) and running the dashboard command, the disk usage shows 0GB, though it seems to build quickly.

I’m concerned as I can’t see anywhere if:

  • Will the dashboard figure continue to grow as it gradually recalculates the existing data, hopefully reaching the 600GB mark?
  • Or should the dashboard show the existing 600GB straight away?
  • I’ve made 3 attempts now and have gone back to running the node on the existing HDD to be safe, as I’m concerned I’ve done it wrong: if the existing data should show immediately, then I’d be disqualified within hours, as it states on the documentation page.

Could anyone help please?

Thanks! :slight_smile:

I would triple-check that all the databases are the same and that you have the correct path. The amount of data should show instantly when it starts up, just as it was before the move. Take that node offline and run it without an internet connection to make sure, before leaving it running.

Are you sure this “building up” isn’t from fresh data? Meaning that the disk usage isn’t just new data being downloaded by the node.
The dashboard should show usage straight away (I have moved data 15 times or more) and the stats were always there after starting the dashboard. Did you rm the old container and create a new one with the new path to the old data?
Is the path to the identity correct?
Also make sure that the permissions are correct.
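For example, a quick sanity check (using the paths from this thread; adjust to your own) is to compare ownership and permissions between the old and new copies, which should match after an rsync run with sudo and -a:

# compare the top-level folders and a sample of their contents
ls -ld /media/pi/storj/storage /media/pi/storjnew/storage
ls -l /media/pi/storj/storage | head
ls -l /media/pi/storjnew/storage | head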

Thanks for the replies, I’ve gone through the steps again meticulously and I’m at a loss.

The only part I believe may be the issue is that I have ‘/media/pi/storjnew/storage’, which then contains another folder ‘storage’, so ‘/media/pi/storjnew/storage/storage’. If that is the issue, I may just delete the ‘storjnew’ folder and start again, as the copy only took 7 or so hours.

Each node is on its own HDD, as I’m migrating from an old HDD to a new one.

Old hard drive I’m replacing
HDD1: ‘storj’
Dir: ‘/media/pi/storj’
Disk used by Storj: 600GB

New hard drive I have attempted to migrate to
HDD2: ‘storjnew’
Dir: ‘/media/pi/storjnew’
2TB HDD, 1.8TB available.
After the transfer I can see the free space has gone from 1.8TB to 1.2TB, which seems right.

You can see the structure of the folders below, along with the space available and used on the new directory.


These are the commands I used throughout the process. I have a funny feeling I made a mistake on the part below during the first attempt, and that’s why I have two storage folders in the new directory.

Source: https://documentation.storj.io/resources/faq/migrate-my-node
" Please, note - we intentionally specified /mnt/storagenode-new as the data source in the --mount parameter and not /mnt/storagenode-new/storage because the storagenode docker container will add a subfolder called /storage to the path automatically. So please, make sure that your data folder contains a storage subfolder with all the data inside ( blobs folder, database files, etc.), otherwise the node will start from scratch since it can’t find the data in the right subfolder and will be disqualified in a few hours."

# Copy the identity
sudo rsync -aP /media/pi/storj/identity/storagenode /media/pi/storjnew/identity

# Copy the data
sudo rsync -aP /media/pi/storj/storage /media/pi/storjnew/storage

# Repeat the data copying command (step 4) a few more times until the difference is negligible, then

# Once done with the above, stop the node
sudo docker stop -t 300 storagenode

# Remove the existing container
sudo docker rm storagenode

# Run the copying command with the --delete parameter to remove deleted files from the destination
sudo rsync -aP --delete /media/pi/storj/storage /media/pi/storjnew/storage

Then I ran the new docker run command (below).
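(For reference, I mean the run command from the guide, along these lines; the wallet, email, address and storage values here are placeholders rather than my real settings, and the image tag may differ:

sudo docker run -d --restart unless-stopped --stop-timeout 300 \
    -p 28967:28967 -p 14002:14002 \
    -e WALLET="0x0000000000000000000000000000000000000000" \
    -e EMAIL="user@example.com" \
    -e ADDRESS="mynode.example.com:28967" \
    -e STORAGE="1.7TB" \
    --mount type=bind,source=/media/pi/storjnew/identity/storagenode,destination=/app/identity \
    --mount type=bind,source=/media/pi/storjnew,destination=/app/config \
    --name storagenode storjlabs/storagenode:latest

The key part is that the data mount source is /media/pi/storjnew and not /media/pi/storjnew/storage, per the note quoted above.)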

I also ran the command below to check that the files are there. I’m a bit of a n00b with the Rasp Pi / Linux, so I’m hoping I have made a simple mistake. :upside_down_face:
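(Roughly this kind of check, as a hypothetical example:

ls -la /media/pi/storjnew/storage

which should list the blobs and trash folders plus the database files.)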

At first glance I see some things outside of the storage folder that shouldn’t be there; all the database files should be inside the storage folder. Are those the ones that got created because you set the wrong path? You have a set of 2 storage folders. Did you move this from a Windows PC? I don’t know which is the right storage folder; you could try changing the path to the second storage folder to see if that is the right one.

Please make sure that your disk is statically mounted via /etc/fstab; if not, please fix it ASAP.
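For example, an entry along these lines (the UUID is a placeholder, find the real one with sudo blkid, and adjust the filesystem type and options to your setup):

# /etc/fstab
UUID=12345678-1234-1234-1234-123456789abc  /media/pi/storjnew  ext4  defaults,noatime  0  2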

1 Like

I’m using the same Rasp Pi device, just migrating from the old HDD to a newer one.

I have previously tried changing the storage node docker command to point at /storage/storage, with no luck.

I’m going to start it from scratch, as I think the duplicate folders really are the issue. Thanks for your time.

I forgot about this as well! :grimacing: Cheers!

Do you have a second storage folder inside /media/pi/storj/storage too?

Also, try this:

sudo rsync -aP --delete /media/pi/storj/storage/ /media/pi/storjnew/storage/

The trailing / matters.
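In other words (standard rsync behaviour, shown with the paths from this thread):

# without a trailing slash on the source, rsync copies the directory itself into the destination:
#   /media/pi/storj/storage   ->  /media/pi/storjnew/storage/storage
# with the trailing slash, rsync copies the contents of the source into the destination:
#   /media/pi/storj/storage/  ->  /media/pi/storjnew/storage/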

1 Like

Thanks Alexey. I removed (deleted) the new folder and statically mounted the device as heunland rightly said. I’m now transferring the files to the correct single storage folder on the new HDD.

Then I will run what you have sent. I’ll let you know how it goes.

Edit: sorry, forgot to say: there is no additional storage folder in the existing directory, as you asked.

Edit: Please ignore this post. :smiley:
It didn’t make sense.

Thanks for your time everyone.

So I deleted the /storage/storage folder, then amended the copy-data line to take off the extra /storage, and created the folder manually to make sure. That worked.

Then I applied the amendment to the --delete command you mentioned, Alexey.

It’s up and has already downloaded 22GB in the first 2 hrs :slight_smile:

The existing 600GB of data is shown as well.

1 Like

Is this the preferred method for a complete transfer, i.e. instead of just stopping the whole node, plugging in the new drive, and doing a manual copy of the data to the new drive?

I ask this as I am planning on replacing my 10TB (NTFS-formatted) drive with a new 12TB drive that I’m going to format to ext4 before copying over. I originally was going to stop the node, move the data off the 10TB drive to another drive temporarily, reformat the 10TB drive to ext4, and then put all of the data back on it. The unfortunate thing is that my 10TB drive has so much data on it now that I don’t have a big enough spare HDD to hold it all temporarily. So I figured I’d invest in a new (bigger) HDD, format that to ext4, and then only have to do the data migration once, since it’s going to take a really long time to move the 5-6TB of data…plus then I’d have a spare 10TB drive for another use, or to save for a second node if/when the time comes…

So long story short, since the downtime suspension isn’t in place yet, would it be more reliable to just take the node offline and do a manual copy, or should I look to use the rsync process while keeping the node online?

Try migrating your data with this guide: https://documentation.storj.io/resources/faq/migrate-my-node

I’m following the guide that Alexey posted, on an RPi 4. Holy cow, I don’t think I ever realized just how LONG the migration can take…the current 10TB HDD (NTFS format) has about 8.7TB of data, and I am moving to a 12TB HDD (ext4). Going into day 5 of the first pass of the rsync process, about 4.1TB of data has been migrated thus far.

Although, I guess I must be grateful that the rsync process allows the node to stay online while rsync does its thing. PLUS I bet the currently high download rates are limiting the read speed of the original drive.

Anyone else who has gone through the migration process, feel free to share your experience.

1 Like

Did you manage to complete the process?

Ha…it’s still running the first pass of rsync…about 5.8TB of the 8.7TB transferred thus far. I think part of the slow transfer rate is that the original drive is NTFS-formatted, which I believe tends to have slower read speeds on the RPi. That’s also the main reason for migrating the node data to an ext4-formatted drive.

I’m hoping that, since the incoming data rate has been relatively low since the rsync process started, the second, third, etc. passes will be much quicker.

What’s your average transfer rate @dragonhogan? I recently migrated a node from a disk connected to my Raspberry Pi 4 to a disk connected to a laptop, over the local network (1Gbps), and it was averaging around 30MB/s. Not lightning fast, but your transfer would have been done in roughly 3.3 days at that rate.

(both of my disks are ext4 though)

Here’s a quick screen grab from the RPi 4 that shows the rsync “log” and the nmon disk I/O:

The node is in its 13th month now, so it’s got quite a bit of data…but those look like fairly low transfer rates, although I suppose that’s to be somewhat expected with such a large number of small files. I’m at 6.1TB transferred now, and the drive shows 167 power_on_hours in the smartctl readout. So that’s roughly averaging a 10MB/s transfer rate…
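(Back-of-envelope check, treating the new drive’s 167 power-on hours as the elapsed copy time:

echo "6.1 * 1000 * 1000 / (167 * 3600)" | bc -l    # 6.1 TB in MB over 167 h in seconds ≈ 10.1 MB/s

so the ~10MB/s estimate holds up.)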

I also expect that the original NTFS HDD may be fairly fragmented, based on my understanding of how the NTFS file system works, so that may also be causing the process to move slowly. And again, with the high egress rates we’re seeing at the moment, the disk is also keeping up with all of that traffic.
Of course I have both HDDs plugged into the two USB 3.0 ports on the RPi, although the USB controller could be bottlenecking the process, since the original drive is receiving/sending data for the running node while also sending data to the second drive, and the second drive is receiving that data…just a hypothesis.

I also notice in the rsync log that when it gets down to ~1800 “transfers left” in the counter, it kind of stalls out (I assume while it is scanning the disk further), and then it starts back up with ~3500-4000 more transfers in the counter. It then runs through those until ~1800 “transfers left”, and that cycle just continues.

On a related note, I did start up a netdata container earlier today just to take a look at some more detailed stats on the RPi, and I noticed that every so often I was seeing relatively high “I/O wait” values (between 10 and 30), although they were spikes rather than constant values. I generally don’t leave netdata running in the background, though, because it tends to eat up a lot of system resources.

Like I’ve previously stated, I’m not necessarily trying to solve a problem. The rsync process is working, albeit slower than I had expected, and I’m just happy to have a method that allows my node to stay online. I’m more just thinking out loud and sharing my experience.

Appreciate the folks chiming in.

2 Likes

Just as an update: the rsync FINALLY finished. It took almost 11 days to sync the ~8.7TB of data from the original drive to the new one…I just went through and installed updates to the RPi system and did a vacuum of my DBs. I’ve now restarted the node and started the second pass of the rsync. Really hoping that this one is a bit quicker.
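(For anyone curious, by “a vacuum of my DBs” I mean something along these lines, run only while the node is stopped; the path is a placeholder for wherever your node’s storage folder lives, and it needs the sqlite3 CLI installed:

# compact each of the node's SQLite databases in place
for db in /path/to/storagenode/storage/*.db; do
    sudo sqlite3 "$db" "VACUUM;"
done
)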

2 Likes