Node migration over WAN

Yesterday I migrated a node from one of my sites to another over the WAN. Syncing the data took about 2 days (about 900 GB transferred).

The rsync itself went fine. The last steps I performed were:

  • shut down the source node with “docker stop -t 300 storagenode”
  • run one last rsync (however, I forgot to add --delete). It took 10 minutes to sync the remaining data.
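
For anyone repeating this, here is a minimal sketch of those two final steps with --delete included. The `docker stop` command is the one quoted above. The rsync `-a` flag, the example paths, and the helper name `final_sync_commands` are assumptions for illustration only, not what was actually run. The helper only builds the command lines so they can be reviewed before anything is executed.

```python
import shlex


def final_sync_commands(src, dest, container="storagenode", timeout=300):
    """Build the commands for the final pass of a node migration.

    src and dest are hypothetical example paths; adjust them to your setup.
    """
    # Stop the node gracefully so it can finish writing and close its databases.
    stop = ["docker", "stop", "-t", str(timeout), container]
    # The trailing slash on the source makes rsync copy the directory contents.
    # --delete removes files at the destination that no longer exist at the source.
    sync = ["rsync", "-a", "--delete", src.rstrip("/") + "/", dest]
    return [stop, sync]


if __name__ == "__main__":
    # Example paths only (assumed, not taken from the thread).
    for cmd in final_sync_commands("/mnt/storagenode", "user@new-site:/mnt/storagenode/"):
        print(shlex.join(cmd))
```

Printing the commands first, instead of running them directly, makes it easier to spot a missing flag such as --delete before the last sync.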

Then I started the node at the new site and realized that, even though the node was starting, it was not showing up properly. I checked the logs and saw that the database was malformed.

I restarted the node three or four times, just to see if that would fix the issue, but it made no difference.

Then I checked 2 things:

  • How to properly migrate a node (which is when I realized I had missed the --delete parameter)
  • How to fix “the database is malformed”, which is a nightmare.

I then ran the rsync again, this time with the --delete parameter. After that, the node started fine and all the information appeared to be OK, showing all my data and so on.
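
One way to go a step beyond “appeared to be OK” is to run an integrity check on each database file while the node is stopped. The sketch below assumes the node's databases are SQLite files ending in `.db` inside a single directory; that layout and the helper name `check_databases` are assumptions for illustration, not something stated in this thread. `PRAGMA integrity_check` is standard SQLite and returns a single row containing `ok` when it finds no problems.

```python
import sqlite3
from pathlib import Path


def check_databases(db_dir):
    """Run PRAGMA integrity_check on every *.db file in db_dir.

    Returns a dict mapping file name to the list of messages SQLite reported.
    A healthy database maps to ["ok"].
    """
    results = {}
    for db_path in sorted(Path(db_dir).glob("*.db")):
        # Open read-only so the check itself cannot modify the database file.
        uri = db_path.resolve().as_uri() + "?mode=ro"
        try:
            conn = sqlite3.connect(uri, uri=True)
            try:
                rows = conn.execute("PRAGMA integrity_check").fetchall()
            finally:
                conn.close()
            results[db_path.name] = [row[0] for row in rows]
        except sqlite3.DatabaseError as exc:
            # Severe corruption can surface as an exception instead of result rows.
            results[db_path.name] = [str(exc)]
    return results


if __name__ == "__main__":
    # Hypothetical location; point this at your node's database directory.
    for name, messages in check_databases("/mnt/storagenode/storage").items():
        status = "OK" if messages == ["ok"] else "PROBLEM: " + "; ".join(messages)
        print(f"{name}: {status}")
```

Run it only with the node stopped, so nothing is writing to the files during the check.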

My first question: could running the node for about 2 or 3 minutes on each attempt with the malformed database have damaged it?

My second question: since it worked properly after I ran the rsync with the --delete parameter (even though the previous attempts were not correct), does this mean everything is OK, or do I need to worry about something?

Just to clarify, what happened was similar to this:

Thanks!!

Your database was considered malformed because your last rsync without --delete copied a “good” database, but the .wal files with the “temporary changes” from the earlier syncs were still sitting at the destination, which is likely what caused the problems. Your final rsync with --delete then got rid of those .wal files, and the DBs were fine.
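
To see which leftovers an rsync without --delete would have kept, you can list the files that exist at the destination but no longer at the source. This is only a rough sketch: it assumes both trees are reachable as local paths (for example over a mount), and the helper name `stale_files` is made up for illustration. SQLite itself names these side files with `-wal` and `-shm` suffixes next to the database file.

```python
from pathlib import Path


def stale_files(source, dest):
    """Return relative paths of files present under dest but missing under source.

    These are the files rsync would remove when run with --delete.
    """
    src_root, dst_root = Path(source), Path(dest)
    src_files = {p.relative_to(src_root) for p in src_root.rglob("*") if p.is_file()}
    dst_files = {p.relative_to(dst_root) for p in dst_root.rglob("*") if p.is_file()}
    return sorted(dst_files - src_files)


if __name__ == "__main__":
    # Hypothetical mount points; replace with your own paths.
    for path in stale_files("/mnt/old-node", "/mnt/new-node"):
        print(path)
```

If the list contains write-ahead-log files next to a database, that matches the situation described above: a cleanly closed database copied over while an outdated log from an earlier pass is still present beside it.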

Damage is unlikely. The worst that could have happened is some audits failing (though I’m not sure that is still a problem). So if your audit/suspension scores are still fine, I’m sure your node will be fine.

I’ve made the same mistake. I just ended up deleting the nodes I screwed up, but I agree with Kevink: you may as well let it run and see what happens. Maybe not enough is messed up to cause long-term problems.

Thanks, Kevink, for the clear explanation. I will let it run and see. I hope everything is fine, as it took about 3 months to fill this TB of data.