So my drive hasn’t failed or started to fail yet but I have a few questions.
1. How do I properly check for a failing drive?
2. What do I do if my HDD starts to fail or has failed?
3. Can the data be transferred from a dying drive to a new working one?
This next one is a bit confusing but I will describe it the best I can.
4. If the drive is failing, can it be replaced with a new drive of the same identity, start accepting data to the new drive and THEN manually copy over the old data from the dying drive to the new drive?
No. You need to do it the other way around: first copy the bulk of the data, then stop the old node, sync the differences, and only then start the migrated node. See the link above.
You can check its S.M.A.R.T. data from time to time; it’s a good way to get early warning of an upcoming failure.
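A minimal sketch of such a periodic check, assuming smartmontools is installed and the drive shows up as /dev/sda (both are assumptions, adjust to your setup). The helper name smart_flags is made up for illustration; it reads `smartctl -A` output and prints any of three commonly watched attributes whose raw value is not zero.

```shell
#!/bin/sh
# Hypothetical helper: reads `smartctl -A` output on stdin and prints the
# watched attributes whose raw value (10th column) is not zero.
smart_flags() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ && $10 != 0 { print $2 "=" $10 }'
}

# Usage (needs root; /dev/sda is an assumed device name):
#   sudo smartctl -H /dev/sda                # overall health self-assessment
#   sudo smartctl -A /dev/sda | smart_flags  # no output: the watched counters are zero
```

Which attributes a drive reports varies by vendor, so treat an empty result as "nothing obvious" rather than proof of a healthy disk.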
If the disk starts to lose data, you had better migrate to a new drive, and perhaps do it offline so as not to increase the load on the failing disk. You can be offline for up to 12 days before suspension; the online score will recover over the next 30 days online.
If the node loses more than 4% of its stored data, it will be disqualified.
No. If you start the node with the same identity but without the data, it will be disqualified for lost data.
So you need to transfer all the data to the new drive before you bring the node online on it. You can transfer the data while the old node is still running, but with a failing drive that could be dangerous, because it increases the load on the already failing disk.
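The copy-then-sync order described earlier in the thread can be sketched roughly as below. The paths, the container name, and the use of Docker are all assumptions for illustration, not something stated in this thread; run is a made-up helper that only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Assumed locations and container name; replace with your own.
OLD=/mnt/old/storagenode/
NEW=/mnt/new/storagenode/
NODE=storagenode
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

migrate() {
    # Bulk copy while the old node keeps running; the second pass picks up
    # most of what changed during the first one.
    run rsync -aP "$OLD" "$NEW"
    run rsync -aP "$OLD" "$NEW"
    # Stop the old node, then sync the remaining differences.
    # --delete removes files from the copy that no longer exist on the source.
    run docker stop -t 300 "$NODE"
    run rsync -aP --delete "$OLD" "$NEW"
}
```

Calling migrate with the default DRY_RUN=1 just lists the four commands. For a drive that is already failing, the offline variant would be to stop the node first and run only the final rsync. The node is started again only after the last sync, pointed at the new location.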
If the drive is close to full, cloning is preferable: it will be faster, and you can use recovery tools like ddrescue.
If the drive is close to empty, copying files will be faster.
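For the cloning route, a two-pass GNU ddrescue run is a common pattern. The sketch below only prints the commands so nothing is overwritten by accident; clone_plan is a hypothetical helper, and the device names and map file you pass in are placeholders you have to choose yourself. It assumes the node is stopped and the target drive is at least as large as the source.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) a two-pass ddrescue clone.
# Arguments: source device, target device, map file.
clone_plan() {
    src=$1 dst=$2 map=$3
    # Pass 1: copy everything that reads cleanly, skip the slow scraping phase (-n).
    echo "ddrescue -f -n $src $dst $map"
    # Pass 2: retry the bad areas up to 3 times (-r3) using direct disc access (-d).
    echo "ddrescue -d -f -r3 $src $dst $map"
}

# Usage: clone_plan /dev/sdX /dev/sdY rescue.map
```

The -f flag is what allows ddrescue to write to a block device, so double-check which device is the source and which is the target before running the printed commands as root.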
Yes, you have to preserve the identity. Without the identity the node cannot present itself to the satellite, so the satellite would not know what data the node is expected to hold. The identity files are small enough to back up on other media, and doing so is recommended.
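A small sketch of such a backup, with verification. backup_identity is a hypothetical helper and both paths are whatever you pass in; where your identity directory actually lives depends on your setup.

```shell
#!/bin/sh
# Hypothetical helper: copy an identity directory to backup media and
# confirm that the copy matches the original.
# Arguments: identity directory, backup directory.
backup_identity() {
    src=$1 dst=$2
    mkdir -p "$dst" &&
    cp -pR "$src"/. "$dst"/ &&
    diff -r "$src" "$dst" >/dev/null
}

# Usage: backup_identity /path/to/identity /media/usbstick/identity-backup
```

The function returns non-zero if the copy fails or if the backup differs from the original, so it can be chained with && before anything destructive.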
You would need to remove the disk from the enclosure and connect it to the PC via SATA, or use a better USB-to-SATA enclosure; cheap USB enclosures don’t support SMART and many other features. Adapters on VIA chipsets, like the VL716, tend to work; some popular ASMedia-based ones (I forgot the actual chipset name), including early revisions of StarTech cables, tend not to.
Btw, if you get such a cable/enclosure and decide to connect it to a Pi, either use the USB2 ports or disable UAS; you would likely need usb-storage.quirks for that.
These enclosures are rebrands of cheap Chinese bottom of the barrel crap. In particular, Ugreen brand specifically tends to be most overpriced and worst engineered garbage. But I digress.
What does lsusb report?
The trick of disabling UAS to make SMART work doesn’t always work, but you can try. It’s a good idea to disable it anyway on an RPi4.
You would need to take the vendorID:deviceID pair as reported by lsusb, remount the boot volume read/write with sudo mount -o remount,rw /boot, and add usb-storage.quirks=<vendorID>:<deviceID>:u to the first and only line in /boot/cmdline.txt
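The edit itself can be sketched as a small function. add_uas_quirk is a made-up name; it appends the parameter to the first line of the file you give it and refuses to touch a file that already has a usb-storage.quirks= entry, since multiple devices belong in one comma-separated entry and that is safer to edit by hand.

```shell
#!/bin/sh
# Hypothetical helper: append a UAS-disabling quirk to the single line of a
# kernel command-line file.
# Arguments: file, vendorID:deviceID (the 6th field of an lsusb line).
add_uas_quirk() {
    file=$1 id=$2
    if grep -qF "usb-storage.quirks=" "$file"; then
        echo "usb-storage.quirks already present in $file; edit it by hand" >&2
        return 1
    fi
    tmp=$(mktemp) || return 1
    # Append to the end of line 1 so the file stays a single line.
    sed "1s/\$/ usb-storage.quirks=$id:u/" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage, after remounting /boot read/write (run as root):
#   cp /boot/cmdline.txt /boot/cmdline.txt.bak
#   add_uas_quirk /boot/cmdline.txt <vendorID>:<deviceID>
```

The kernel reads this line only at boot, so the change takes effect after a reboot.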
Doesn’t matter. There are just a handful of chipsets with a handful of firmware versions around. The approach is the same regardless of the wording on the box.