September 29, 2009
I'm sorry. I know you probably paid a lot for that license, but if the crux of your high availability plan is a machine's ability to move between VM hosts without rebooting, you might want to reconsider.
Yesterday, Rational Survivability (a great all-over-the-place IT blog) had a post titled "The Emotion of VMotion". Until I read it, it hadn't occurred to me that my own earlier search for a hypervisor that would do live migration was working directly against my belief that uptime should matter only for services, not for individual machines. Essentially, the infrastructure should be designed so that a single server going down doesn't cause a loss of availability.
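To make "uptime for the service, not the server" a bit more concrete, here is a minimal Python sketch of the idea, not a production design. Everything in it is hypothetical and only for illustration: `call_with_failover`, `ServiceUnavailable`, and the two stand-in servers are names I made up, and a real setup would more likely put a load balancer with health checks in front of the replicas. The point is just that the request succeeds as long as any one replica is alive.

```python
from typing import Callable, Sequence


class ServiceUnavailable(Exception):
    """Raised only when every replica has failed."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each replica in order and return the first successful response.

    A single dead server costs one failed attempt, not the service.
    """
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            # This replica is down; remember why and move on to the next one.
            errors.append(exc)
    raise ServiceUnavailable(f"all {len(replicas)} replicas failed: {errors}")


# Hypothetical stand-ins for two servers behind the same service.
def dead_server(request: str) -> str:
    raise ConnectionError("host is down")


def healthy_server(request: str) -> str:
    return f"handled {request}"


if __name__ == "__main__":
    # The first server is down, but the service still answers.
    print(call_with_failover([dead_server, healthy_server], "GET /"))
```

Notice that nothing here cares whether the dead host was migrated, rebooted, or on fire. The service only goes down when every replica does.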
That being said, live migration is a neat idea, and eventually it's going to get to the point that it's nearly instantaneous. When that happens, failovers will be next to invisible. Maybe we'll have to reevaluate our approach in that case.
Until then, I read posts from people trying to rely on it to keep their infrastructures up and I worry that their approach is flawed.
Please, build your services for reliability, not just the underlying systems.