Node migration to new hardware is slow

I’m trying to retire some old hardware, and my storagenode is (almost) the last thing I need to migrate off it. I figured just rsyncing all the data over to the new hardware would be the way forwards (and later found the Storj Docs page "How do I migrate my node to a new device?", which confirmed my suspicions).

However, the transfer is much slower than I ever imagined; I’m now on my third progressive rsync of the live node, and it’s been running for about 36 hours. This leaves me with a couple of possible conclusions:

  1. The rate at which storage blobs change is very high. Does anyone know if this is likely to be the case?
  2. There is some frustrating bottleneck (which I’d guess is probably the underlying storage, since iotop is showing fairly significant drive utilisation).

I’m rather concerned about the effect of the downtime when I actually switch off the node and do a final sync…

Answers to questions people probably have:

  • The node has about 8TB of data.
  • I cannot keep using the same drives in the new system and just move them across (new host is SFF, old host had a badly-chosen stripe across a mixture of drive types - don’t ask).
  • The connection is 5Gb, so shouldn’t be an issue.
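
For a rough sanity check on the network claim, here is a back-of-the-envelope estimate. It assumes "5Gb" means 5 Gbit/s, "8TB" means 8 × 10^12 bytes, and a link running at full line rate with no protocol or disk overhead, so it is a lower bound rather than a prediction:

```python
def transfer_hours(data_bytes: float, link_bits_per_s: float) -> float:
    """Ideal transfer time in hours at full line rate, ignoring all overhead."""
    return data_bytes * 8 / link_bits_per_s / 3600


# Assumed figures: 8 TB (decimal) over a 5 Gbit/s link.
hours = transfer_hours(8e12, 5e9)
print(f"{hours:.1f} h")  # prints "3.6 h"
```

Even allowing generously for overhead, a full copy at line rate is a matter of hours, so a single incremental pass taking 36 hours points away from the network.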

Rsyncing a running node is terribly slow.
Rsync has two major disadvantages here:
First, with default settings it does not detect files that have been moved on the source drive. It copies them again to the new location and, when run with --delete, removes the old copies from the destination.
Second, it does not run tasks in parallel.

What you could do is:

  1. Run several Rsync tasks in parallel for different folders.
  2. Exclude the trash folder on every run except the last.
  3. Stop the source node so there are no file changes during rsync.
  4. (Optional) Rsync has a feature to move files to a specified folder instead of deleting them. You could try it out, if you can get it to work; see "Alternative of rsync --delete to move files to another directory instead of deleting" on Unix & Linux Stack Exchange.
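
Points 1 and 2 could be sketched roughly like this. It is only a sketch under assumptions: the paths and folder names are hypothetical, `build_rsync_commands` and `run_parallel` are names made up for illustration, and each destination parent folder must already exist (for example after a first ordinary full pass). Only the standard rsync options `-a` and `--delete` are used.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_rsync_commands(src_root, dst_root, folders, final_run=False):
    """Return one rsync argument list per folder (relative to the roots)."""
    commands = []
    for folder in folders:
        is_trash = folder.split("/")[0] == "trash"
        if is_trash and not final_run:
            continue  # skip trash on every pass except the last one
        cmd = ["rsync", "-a"]
        if final_run:
            cmd.append("--delete")  # only sensible once the source node is stopped
        # Trailing slashes: copy the folder's contents into the same-named folder.
        cmd += [f"{src_root}/{folder}/", f"{dst_root}/{folder}/"]
        commands.append(cmd)
    return commands


def run_parallel(commands, workers=4):
    """Run the commands concurrently and return their exit codes in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: subprocess.run(c).returncode, commands))


# Hypothetical usage: one task per blobs subfolder, trash left for the final run.
# cmds = build_rsync_commands("/mnt/old/storage", "/mnt/new/storage",
#                             ["blobs/aa", "blobs/ab", "trash"])
# print(run_parallel(cmds))
```

If the source disks are already saturated, more parallel tasks may not help and could even slow things down, so it is worth starting with a small worker count.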

Something that is not mentioned in the migration guide (@Alexey, maybe it should be?) is that it helps to reduce the disk allocation to the minimum possible. This stops new uploads to the node, so the dataset no longer changes so much.

The proposed options will not help, except for shutting down the node. However, keep in mind that if the node is started again before rsync is complete, things will get even worse, because the filewalker will start.