HDD migration with same system (Raspberry Pi 4)

Hi, I am upgrading my HDD (from 8TB to 20TB) and I am a bit in a rush since the first HDD is completely full.
I tried following the official guide, but I think there must be something very odd with my system, as rsync was able to copy only 500 GB in 4 days!!

This is the result of iostat (sdb being the new HDD):

Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz w/s wkB/s wrqm/s %wrqm w_await wareq-sz d/s dkB/s drqm/s %drqm d_await dareq-sz f/s f_await aqu-sz %util
mmcblk0 2.73 42.24 0.39 12.39 2.91 15.46 9.82 50.86 0.93 8.67 23.81 5.18 0.00 0.00 0.00 0.00 0.00 0.00 0.40 0.70 0.24 1.28
sda 145.30 4465.33 701.74 82.85 10.37 30.73 20.06 1136.26 23.67 54.13 7.56 56.66 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.66 76.23
sdb 0.93 10.01 1.48 61.33 2.79 10.74 0.35 62.38 0.57 61.81 31.91 175.98 0.00 0.00 0.00 0.00 0.00 0.00 0.02 17.31 0.01 0.14

After a few seconds, the bottom line shows only zeros.
My system is a Raspberry Pi 4 and both HDDs are CMR. One is plugged into the Pi through an externally powered SATA to USB 3.0 adapter, and the other is in a WD enclosure attached to the Pi via USB 3.0.

The output of lsusb -t is the following:
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
|__ Port 1: Dev 3, If 0, Class=Mass Storage, Driver=uas, 5000M
|__ Port 2: Dev 2, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/1p, 480M
|__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M

I believe Bus 02 Port 1 is the source HDD and Bus 02 Port 2 is the new one.

Why is rsync so slow, and is there a better/quicker way to migrate to a bigger HDD?

The piecestore uses millions of small files per TB, and if sda is 76.23% busy, that may be all the drive/Pi can manage for those random reads.

If you’re replacing the entire disk, you could skip rsync and use dd/ddrescue to copy everything (partitions and filesystems) at once. That makes the copy one big sequential transfer, which HDDs are much faster at. Then, once it’s done, you can enlarge your partitions and then resize the filesystem too, to use all the new space.

2 Likes

Hi, thanks for the info! Yes, the source HDD is only used for Storj. Do you have a guide somewhere on how to perform all the steps?

Look for any article/video about moving to a different drive with “dd” (example).

It may seem complicated, but essentially you’re connecting the source and destination drives (unmounted), using dd to pour one into the other as fast as it can, then using fdisk/gparted to make your partition larger, then using [tool-specific-to-filesystem-type] to expand the filesystem into the new space so the OS can use it.
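The steps above can be sketched as commands. This is only a sketch under assumptions: /dev/sda is the old 8 TB source, /dev/sdb the new 20 TB destination, each with a single ext4 data partition, and growpart (from cloud-guest-utils) is installed. Verify the device names on your own system before running anything, since dd overwrites the destination with no confirmation.

```shell
# Double-check which disk is which before touching anything:
lsblk -o NAME,SIZE,FSTYPE,MOUNTPOINT

# 1. Whole-disk sequential copy (both filesystems unmounted).
sudo dd if=/dev/sda of=/dev/sdb bs=64M status=progress conv=fsync

# 2. Grow partition 1 to fill the new disk
#    (parted or gparted work here too).
sudo growpart /dev/sdb 1

# 3. Grow the ext4 filesystem into the enlarged partition.
sudo resize2fs /dev/sdb1
```

ddrescue can replace the dd step if you want retries on read errors; the resize steps stay the same.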

So it’s actually only a couple commands, plus waiting for the copy to finish. Good luck!

2 Likes

Thanks, I am cloning it now. After it has finished, how would you check that the files are exactly identical? Would you perform the last few official rsync steps just to be sure?
And if there were some inconsistencies, what would the Docker log file look like? I just want to make sure I don’t get disqualified after all this effort.

Just spot check a few smaller dirs with

rsync -avn

and you’ll be fine 🙂

Keep in mind that both HDDs will have the same UUID after you clone them with dd. That can have some strange effects if you connect them both at the same time.
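One way to avoid that, assuming an ext4 filesystem (device names below are examples): give the clone fresh identifiers before attaching both drives together.

```shell
# Show filesystem UUIDs (they will be identical after a dd clone):
sudo blkid /dev/sda1 /dev/sdb1

# Give the cloned ext4 filesystem a fresh random UUID:
sudo tune2fs -U random /dev/sdb1

# If the disk uses a GPT partition table, also randomize the disk
# and partition GUIDs on the clone:
sudo sgdisk -G /dev/sdb
```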

3 Likes

The other idea would be to use both disks and start a second node.

2 Likes

What would happen if I used the second HDD for a new node and gracefully exited from the first node? Would storj migrate everything to the second node which is physically attached to the same device because it would win every “race” for data?

There is no data transfer between nodes. Therefore the second node would not benefit from living on the same device.

1 Like

I thought that if a node exited gracefully, the data it holds needs to be put somewhere else, right?

The audit and repair processes always take care to keep enough pieces online. Graceful exit only matters for node payment.

1 Like

All it does is tell the satellite: “I’m going away in X days; if you need to repair any data that I’m holding part of, ask for it soon.” If everything the node is holding already has more than enough extra pieces on the network, none of the gracefully exiting node’s data may be asked for at all.

But if the node is holding data that has fewer pieces than ideal, the repair system will make it a priority to pull that data first.

Really? To my knowledge, only the payment system cares about graceful exit.

Well, if I were Storj, I would keep the minimum number of data replicas I deem safe in order to run my business (any excess would be paid out as storage to an operator).
Under this logic, all the data stored on every single gracefully exiting node would need to be replicated somewhere else as soon as possible.

Given that the algorithm prefers quick transfers, it would choose the other node that is on the same IP address.

But perhaps my logic is flawed?

They do keep what they believe to be a safe minimum, but remember they aren’t worried about just one node going offline: they have to handle events like a local ISP having issues and perhaps a dozen nodes holding parts of the same data disappearing, so they keep a decent buffer. I don’t know what the latest RS numbers are (29/35/65/110?) but they’re in the forum.

So one graceful-exiting node isn’t an emergency.

1 Like

The second node has been up and running for 10 days and on the us1 satellite it passed over 150 audits, yet it is still not vetted. How is that possible?

It needs to be in the vetting process for at least a month; that was the initial intention. Earlier, 100 audits were enough, but now it seems it needs more.