Hi, I'm upgrading my HDD (from 8TB to 20TB) and I'm in a bit of a rush since the first HDD is completely full.
I tried following the official guide, but there must be something very odd with my system, because rsync managed to copy only 500GB in 4 days!!
This is the result of iostat (sdb being the new HDD):
After a few seconds, the bottom line shows only zeros.
My system is a Raspberry Pi 4 and both HDDs are CMR. One is plugged into the Pi through a SATA-to-USB 3.0 adapter (powered externally) and the other is in a WD enclosure attached to the Pi via USB 3.0.
The result of lsusb -t gives the following:
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
|__ Port 1: Dev 3, If 0, Class=Mass Storage, Driver=uas, 5000M
|__ Port 2: Dev 2, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/1p, 480M
|__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
I believe bus 02 port 1 to be the source HDD and bus 02 port 2 to be the new one.
Why is rsync so slow, and is there any better/quicker way to migrate to a bigger HDD?
Piecestore uses millions of small files per TB: and if sda is 76.23% busy, that could be all the drive/RPi can do for those random reads.
If you’re replacing the entire disk… you could skip rsync and use dd/ddrescue to copy everything (partitions/filesystems) at once. That makes the copy one big sequential transfer, which HDDs are much faster at. Then once it’s done you can resize your partition, and then resize the filesystem too, to use all the new space.
Look for any article/video about moving to a different drive with “dd” (example).
It may seem complicated, but essentially you’re connecting source and destination drives (unmounted)… using dd to pour one into the other as fast as it can… then using fdisk/gparted to make your partition larger… then using [tool-specific-to-filesystem-type] to expand the filesystem into the new space so the OS can use it.
So it’s actually only a couple commands, plus waiting for the copy to finish. Good luck!
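To make those steps concrete, here’s a sketch of the dd-then-grow workflow run against small image files instead of real disks, so it’s safe to try anywhere. On your actual hardware the sources/targets would be device paths like /dev/sda and /dev/sdb (placeholders — confirm yours with lsblk first), and the resize tool depends on the filesystem (this assumes ext4, hence resize2fs):

```shell
#!/bin/sh
# Demo of the clone-then-grow workflow on image files. On real drives,
# old.img/new.img would be /dev/sda and /dev/sdb (placeholders!), and
# you'd grow the partition with fdisk/gparted before resize2fs.
set -e
work=$(mktemp -d)
cd "$work"

# "Old full drive": a 16 MiB image holding an ext4 filesystem.
truncate -s 16M old.img
mkfs.ext4 -F -q old.img

# "New bigger drive": 32 MiB of empty space.
truncate -s 32M new.img

# 1) One big sequential copy (on real disks, add bs=64M status=progress).
dd if=old.img of=new.img conv=notrunc status=none

# 2) Check the cloned filesystem, then expand it into the new space.
e2fsck -fp new.img >/dev/null
resize2fs new.img >/dev/null 2>&1

# The filesystem now spans the whole 32 MiB target.
dumpe2fs -h new.img 2>/dev/null | grep 'Block count'
```

The same two commands (e2fsck then resize2fs) are all the filesystem-growing step takes once the partition itself has been enlarged.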
Thanks, I am cloning it now. After it has finished, how would you check that the files are exactly identical? Would you perform the last few official rsync steps just to be sure?
And if there were some inconsistencies, what would the docker log look like? Just to make sure I don't get disqualified after all this effort.
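(For reference, one way to spot-check a clone is to mount both drives and walk the two trees byte-for-byte. A sketch below — the mount points are hypothetical, and two temp dirs stand in for them so the snippet runs anywhere:)

```shell
#!/bin/sh
# Sketch: compare two directory trees byte-for-byte. In real use, $old
# and $new would be the mount points of the source and cloned drives
# (e.g. /mnt/old and /mnt/new -- placeholders, adapt to your setup).
set -e
old=$(mktemp -d)   # stand-in for the mounted source drive
new=$(mktemp -d)   # stand-in for the mounted clone

# Fake a tiny piecestore-like layout on the source, then "clone" it.
mkdir -p "$old/blobs"
printf 'piece-data' > "$old/blobs/aa.sj1"
cp -a "$old/." "$new/"

# diff -rq walks both trees and names any file that differs;
# empty output and exit status 0 mean the trees are identical.
if diff -rq "$old" "$new"; then
  echo "trees identical"
fi
```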
Keep in mind that both HDDs will have the same UUID when you clone them with dd. Connecting them both at the same time will have some strange effects.
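You can see (and fix) the duplicate-UUID issue yourself. A sketch on a small image file, since blkid and tune2fs work on files too; on the real clone you’d point them at the partition, e.g. /dev/sdb1 (a placeholder), and this assumes ext4 — note that /etc/fstab entries that mount by UUID would need updating afterwards:

```shell
#!/bin/sh
# Sketch: inspect and regenerate a filesystem UUID, demonstrated on an
# image file so it's safe to run. On the real clone, target the
# partition instead (e.g. /dev/sdb1 -- placeholder). Assumes ext4.
set -e
img=$(mktemp)
truncate -s 16M "$img"
mkfs.ext4 -F -q "$img"

before=$(blkid -p -s UUID -o value "$img")   # -p probes the file directly
tune2fs -U random "$img" >/dev/null 2>&1     # give the "clone" a fresh UUID
after=$(blkid -p -s UUID -o value "$img")

echo "before: $before"
echo "after:  $after"
```

With distinct UUIDs both drives can be attached at once without the kernel or fstab confusing them.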
What would happen if I used the second HDD for a new node and gracefully exited from the first node? Would storj migrate everything to the second node which is physically attached to the same device because it would win every “race” for data?
All graceful exit does is tell the satellite “I’m going away in X days: if you need to repair any data that I’m holding part of… ask for it soon”. If everything the node is holding already has more than enough extra pieces on the network… none of the graceful-exiting node’s data may be asked for at all.
But if the node is holding data that has fewer-than-ideal pieces… the repair system will make it a priority to pull data from that node first.
Well, if I were Storj I would keep the minimum amount of data redundancy that I deem safe in order to run my business (any excess would be paid out as storage to an operator).
By this logic, all the data stored on every single node that is gracefully exiting would need to be replicated somewhere else as soon as possible.
Given that the algorithm prefers quick transfers, it would choose the other node that is on the same IP address.
They do keep what they believe to be a safe minimum: but remember, they aren’t worried about just one node going offline: they have to handle events like a local ISP having issues and perhaps a dozen nodes holding parts of the same data disappearing - so they have a decent buffer. I don’t know what the latest RS numbers are (29/35/65/110?) but they’re in the forum.
So one graceful-exiting node isn’t an emergency.
The second node has been up and running for 10 days and on the us1 satellite it passed over 150 audits, yet it is still not vetted. How is that possible?