Question: Moving node to new hardware

lbaker · December 15, 2022, 4:51pm

I’m having some oddball issues with my node, and from how it’s acting I don’t think my RAID mirror is keeping up with the writes. After patching, I’ve seen a reboot of the server take 30 minutes just to shut down as it (apparently) flushes everything to the drives. SMART status on both drives is clean. The host is an OLD i5 machine running Ubuntu 18.04.6. I’m considering moving the node over to a very underutilized i7 running Ubuntu 22.04 to see how it runs there. That should give me a much better idea of whether it’s the OS, the server, or one of the drives acting up.

Question is… Will the system be OK with my public IP address changing? I should be able to move everything else transparently, and I don’t think the public IP changing is an issue, but I want to verify before I take the downtime hit to move it to newer hardware.

Thanks

andrew2.hart · December 15, 2022, 5:29pm

Yes, as long as you set the new address correctly in the node’s settings and have the port forwarded through whatever router sits in front of it.

Alternatively, if you set the node size smaller than the space already used, it will stop writing so much, which may give you a chance to diagnose the problem. Maybe start with top or iotop.

lbaker · December 15, 2022, 6:03pm

Pretty sure it’s one of the drives that’s slow, but I don’t know whether the drive has the issue or the controller on the motherboard… sdb and sdc are the RAID mirror pair. sdc is keeping up fine but sdb is doing terribly…
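One way to put numbers on the sdb versus sdc comparison is to sample the kernel's per-device counters twice and compare how long each disk spent busy. The following is only a minimal sketch, assuming a Linux host that exposes `/proc/diskstats`; the function names, the 5-second interval, and the default device names are illustrative choices, not something from this thread's setup.

```python
import time

def parse_io_ms(diskstats_text):
    """Map device name -> cumulative milliseconds spent doing I/O."""
    io_ms = {}
    for line in diskstats_text.splitlines():
        parts = line.split()
        if len(parts) >= 13:
            # Columns are: major, minor, name, then the per-device counters.
            # The 10th counter (index 12) is "time spent doing I/Os" in ms.
            io_ms[parts[2]] = int(parts[12])
    return io_ms

def busy_percent(before, after, elapsed_s):
    """Percent of the interval each device spent with I/O in flight."""
    elapsed_ms = elapsed_s * 1000.0
    return {
        dev: 100.0 * (after[dev] - before[dev]) / elapsed_ms
        for dev in after
        if dev in before
    }

def sample(devices=("sdb", "sdc"), interval_s=5.0, path="/proc/diskstats"):
    """Take two snapshots interval_s apart and report busy % per device.

    The device names default to the two mirror members mentioned above;
    adjust them for the actual host (assumption, not verified here).
    """
    with open(path) as f:
        before = parse_io_ms(f.read())
    start = time.monotonic()
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_io_ms(f.read())
    elapsed = time.monotonic() - start
    busy = busy_percent(before, after, elapsed)
    return {dev: busy[dev] for dev in devices if dev in busy}
```

Calling `sample()` while the node is under its normal load should show whether one mirror member sits near 100% busy while the other stays mostly idle. That would point at the drive (or its controller port) rather than the OS.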

andrew2.hart · December 15, 2022, 7:05pm

So you could “break” the mirror and take the “bad” disk to test on the i7 while keeping the good disk going.

You could start a 2nd node if the disk turns out good after all

Pac · December 15, 2022, 10:10pm

You wouldn’t happen to have an SMR drive?
That could explain a lot…

Alexey · December 16, 2022, 5:08am

If you decide to move and you use a DDNS hostname, make sure to either stop the DDNS updater on your router (or the dedicated DDNS application) before the move and set it up on the new router, or register a new DDNS address if you decide to set up a second node on the new server. In the second case, please also update your port forwarding rules on both routers and the ADDRESS option on the second server.

lbaker · December 17, 2022, 1:01pm

Bought a spare drive, a Seagate Barracuda like the original two. The new drive is just as terrible as the one I replaced. As a test, I put the drive I removed in my really good i7 server and used dd to benchmark it. It worked great for 2 hours, then the avio average started spiking up to about 7 SECONDS and the throughput went down significantly. The rebuild onto the replacement drive has been running for a day and a half and it’s not even up to 50%. Searching the internet, I found lots of people complaining about Seagate drives with very similar issues. The only common fix seemed to be replacing them with Toshiba drives! So I’ve ordered a set of Toshiba NAS pro drives. They should arrive tomorrow, so my future holds two 8TB RAID rebuilds as I move to the other drives. The Seagates are still under warranty (with one brand new), so they will be going back to the manufacturer. Maybe they will send me some that work and I can build a second machine. Moral is, don’t buy Seagate Barracuda drives.

Pac · December 17, 2022, 10:12pm

Typical SMR drive behavior. And unless these are 1 TB drives, the following page confirms that your drives are indeed SMR:

You need CMR drives to solve that. Not necessarily Toshiba drives.

You can try, but I think they behave as expected, considering their technology.

lbaker · December 21, 2022, 7:00pm

It does look like they are SMR, but from what I’ve read, that may slow down writes. These drives would only read at 10 MB/s. It took 2 days for an rsync to copy 1.9T from the Seagate to the Toshiba NAS replacement drives. In atop, the Seagate was showing 100% busy just reading at 10 MB/s. The replacements averaged about 4% busy writing the data. Seagate rates these drives at 190 MB/s read and they fail spectacularly!
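As a quick sanity check, the 2-day rsync and the roughly 10 MB/s seen in atop agree with each other. The short calculation below is a sketch that assumes "1.9T" means 1.9 decimal terabytes and that the copy took a full 48 hours; both are rounded figures from the post, so the result is approximate.

```python
def average_rate_mb_s(total_bytes, elapsed_seconds):
    """Average throughput in decimal megabytes per second."""
    return total_bytes / elapsed_seconds / 1e6

copied_bytes = 1.9e12            # assumption: 1.9 TB, decimal units
elapsed_seconds = 2 * 24 * 3600  # assumption: exactly 2 days
rated_mb_s = 190.0               # rated read speed quoted in the post

rate = average_rate_mb_s(copied_bytes, elapsed_seconds)
print(f"average copy rate: {rate:.1f} MB/s")              # → 11.0 MB/s
print(f"share of rated speed: {100 * rate / rated_mb_s:.1f}%")  # → 5.8%
```

So the whole copy averaged about 11 MB/s, under 6% of the rated sequential read speed.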

Pac · December 21, 2022, 10:46pm

Copying millions of tiny files isn’t fast unless you’re cloning the disk, but that doesn’t explain why the target disk wasn’t busy at all, though.

I’m not sure how SMR drives handle priorities (and it probably varies from one disk to another), but if you were reading from it while it was still reorganizing itself internally, that could also explain its poor performance.
That said, one would think that reads should be prioritized over internal stuff unless writes are also being performed in parallel…

Just speculating.