Node with LVM, how to replace disk?

My node is using LVM on an Ubuntu server with 3 disks of 1TB each. Since the node is almost full, I would like to replace/migrate one of the disks with a larger one (4TB).

The disk setup is the following:
sdb 8:16 0 931.5G 0 disk
└─jbod_group-storj_node 253:0 0 2.7T 0 lvm /storj_jbod
sdc 8:32 0 931.5G 0 disk
└─sdc1 8:33 0 931.5G 0 part
  └─jbod_group-storj_node 253:0 0 2.7T 0 lvm /storj_jbod
sdd 8:48 0 931.5G 0 disk
└─jbod_group-storj_node 253:0 0 2.7T 0 lvm /storj_jbod

Is there any simple way of replacing one of the above disks?

I have read about online pvmove / vgreduce, but I am worried about the duration of the process and about corrupting the data.

Also, I am thinking of stopping the node, putting one of the disks in another system, using “dd” to copy it onto the new disk, and starting the node with the new disk. This is faster, but it requires downtime and it is also somewhat risky.
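For illustration, a rough sketch of that dd approach on a single system, assuming /dev/sdb is the 1TB PV being cloned and /dev/sde is the new 4TB disk (both device names, and the container name, are assumptions):

  docker stop storagenode        # however the node is actually run
  vgchange -an jbod_group        # deactivate the VG before cloning
  dd if=/dev/sdb of=/dev/sde bs=64M status=progress
  # The clone keeps the PV UUID, so after swapping the disks LVM sees
  # /dev/sde as the old PV; grow it to use the full 4TB:
  pvresize /dev/sde
  vgchange -ay jbod_group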

The third option is to mirror everything on the new disk, I did not explore too much this option to know the exact steps.

“pvmove” is quite safe. It essentially does the same thing as mirroring a complete disk and then taking the original offline, except not all at once.
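For example, the basic form moves every allocated extent off the named PV onto free space elsewhere in the VG while the volume stays mounted; the interval option just prints progress (device name assumed):

  pvmove -i 10 /dev/sdb    # report progress every 10 seconds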

Any estimate of how long it will take to pvmove, let’s say, 900GB? Of course, system specs are needed:
AMD E-350 dual core (performance similar to an Intel Atom), 4GB DDR3, SATA 3.

What if a power outage occurs during pvmove and the system is restarted? Can it be resumed (or does it resume automatically), or am I lost?

Are you aware that with one drive failure all the data in your LVM will be gone? Just asking…

I understand the point that the risk of one of the 3 LVM disks failing is higher than with a single-disk setup, but in both cases the data is gone if a disk fails.

No, if you set up 1 node per disk, then only one node will be lost. It is recommended to run one node per disk unless you have a RAID setup.

Well, this is what I actually intend to do: migrate everything to one bigger disk and remove the 3 x 1TB disks.

The way you’d usually do this is:

  1. pvcreate on the new disk.
  2. vgextend the VG to add the new disk.
  3. pvmove $sourcepv $destpv for each PV you want to vacate. All allocations on the source will be moved to the destination. (You may want to run lvs -o +devices to see which slices of the LV exist on each PV and move them in order to reassemble the LV as contiguous on the new PV.)
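A minimal command sketch of those three steps, assuming the new 4TB disk is /dev/sde and using the VG name from the lsblk output above:

  pvcreate /dev/sde
  vgextend jbod_group /dev/sde
  pvmove /dev/sdb /dev/sde      # then /dev/sdc1 and /dev/sdd in turn
  lvs -o +devices               # check which PVs still hold LV segments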

As others have said, pvmove is safe. We use it on production systems. If the move fails or is aborted, the origin will still be fully intact and available (assuming the failure wasn’t the origin disk itself dying). It is safe to use on a mounted volume, though note that disk I/O will increase, so you will likely see higher latency on the volume until the move completes. This should not affect your node’s reputation or disqualify it, but you may see more upload/download failures than usual while the volume is being moved.
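On the power-outage question above: an interrupted pvmove can be picked up again from its last checkpoint, or abandoned, e.g.:

  pvmove            # no arguments: restart any interrupted moves
  pvmove --abort    # give up; the data stays on the source PV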

Once this is done, the old PVs can be removed from the VG (vgreduce) or you can allocate new LVs on them.
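For example, once a PV no longer holds any extents it can be dropped from the VG and its LVM label wiped (device name assumed):

  pvs                           # a PV with PFree equal to PSize is empty
  vgreduce jbod_group /dev/sdb
  pvremove /dev/sdb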

I think I made a big mistake here: I did an lvextend and resize2fs after vgextend, and now I get an error when I try to pvmove: “No extents available for allocation”.

I tried to shrink the file system, but it is an offline job and takes ages. Any thoughts on how I can remove a 1TB disk now that I have plenty of free space?

Is there a way to migrate the data in LVM from one disk to another?

It sounds like you accidentally expanded the volume to fill all of the free space in the volume group. Indeed, you can’t shrink ext2/3/4 online.

If you want to avoid downtime, you need a temporary disk/array as big as the used space on the volume.

  1. rsync the data to this other volume (while the node is running).
  2. Repeat the rsync another time while the node is running.
  3. Shut down the node and run rsync one more time.
  4. Point the node at the temporary storage and start it back up.
  5. Either:
    • Delete all of the Storj data on the old volume and then shrink it, or
    • Delete the LV entirely and recreate it.
  6. Do the rsync dance again to get the data back where it was: two rsyncs while the node is running, stop the node, do a final rsync, point the node at the new storage location, and start it back up.
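A rough sketch of that sequence, with the mount points and node name as assumptions:

  rsync -aH /storj_jbod/ /mnt/temp/             # pass 1, node running
  rsync -aH --delete /storj_jbod/ /mnt/temp/    # pass 2, node running
  docker stop storagenode
  rsync -aH --delete /storj_jbod/ /mnt/temp/    # final pass, node stopped
  # repoint the node at /mnt/temp, start it, then shrink or recreate the
  # old LV and repeat the same dance in the other direction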

As a side note, consider setting the cling allocation policy on your volume group. This makes it harder to accidentally expand a volume onto a second PV.
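For example, using the VG name from the lsblk output above:

  vgchange --alloc cling jbod_group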

Unfortunately, I do not have another SATA port available to plug in another disk.

I used “lvresize --resizefs --size -1000GB” and managed to free the filesystem space of one disk.
This is offline, but it is affordable: it takes around 10 minutes.
With the storage node up, I did a pvmove for the first 1TB disk and it took around 3 hours.
Now I am doing the pvmove for the second 1TB disk (the third will follow).
In the end I will have only one 4TB disk for storage, as planned.
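Putting the pieces together, the shrink-then-move step looks roughly like this (names from the lsblk output above, new disk /dev/sde assumed; shrinking ext4 has to happen while the filesystem is unmounted):

  umount /storj_jbod
  lvresize --resizefs --size -1000G jbod_group/storj_node
  mount /storj_jbod
  pvmove /dev/sdb /dev/sde      # the node can stay running during this
  vgreduce jbod_group /dev/sdb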

After this 4TB disk fills up, is the recommendation to start a second container / storagenode for a new disk on the same machine? Or do I need to set up new hardware / a VM?

You can use the same machine if it has enough resources.
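If you go that route, the second node needs its own identity, its own external port, and its own storage directory. Very roughly, following the shape of Storj’s documented docker run command (all values below are placeholders, so check the current documentation for the exact flags):

  docker run -d --name storagenode2 \
    -p 28968:28967 \
    -e ADDRESS="your.ddns.example:28968" \
    -e WALLET="0x..." -e EMAIL="you@example.com" -e STORAGE="3.5TB" \
    --mount type=bind,source=/path/to/identity2,destination=/app/identity \
    --mount type=bind,source=/mnt/new_disk,destination=/app/config \
    storjlabs/storagenode:latest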