Apr 26, 2007

There seems to be a bit of confusion about the benefits of server virtualization, with many tending to focus on cost savings. As a district that has been running a virtual infrastructure for some time, I can honestly say that virtualization is not so much about saving money (although you certainly will) as it is about better resource utilization, more reliability, and greater flexibility.

Better resource utilization

There is no question that most of our servers are doing nothing about 90 percent of the time. This becomes quite obvious with even a cursory glance at historical utilization data for any given server. The obvious solution would seem to be to simply run more applications on each one, but the reality is that the more apps you install on one OS, the more unreliable it becomes (especially if it's a Microsoft product). So, what we all do instead is buy a new machine every time we want a new app that we think is "critical," because we want to be sure it has its own sandbox to play in. Don't even tell me you are running anything but Exchange on your Exchange servers!

So, we find ourselves with racks and racks of servers consuming more and more space (at a cost), all generating heat which we must cool (at a cost), all pulling more and more power (at a cost), all requiring more and more time to manage (at a cost).

Virtualization offers a way to safely put more than one operating system (or virtual server) on one piece of hardware by isolating each operating system from any others running on the box. Essentially, you are establishing a bunch of sandboxes on one piece of hardware. If one of the virtual servers crashes, hard or soft, it will have no impact on any of the others on the box. Hardware resources are better used since, rather than having 10 independent servers running at 10 percent utilization each, you can have 2 boxes running 5 virtual servers each, for roughly 50 percent utilization per box. Better still, if designed properly (more on that later), should a virtual server require more resources, it can easily be moved to a machine that offers more, often live and transparently to its end users.
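The consolidation arithmetic above can be sketched in a few lines of Python. This is a simplified model, not a capacity-planning tool: it assumes the new hosts are identical to the old servers, that loads simply add up, and that the hypervisor adds no overhead. The function name and the round-robin placement are illustrative assumptions.

```python
def consolidate(loads, host_count):
    """Spread per-server loads (percent of one machine) across host_count
    identical hosts, round-robin, and return the total load per host."""
    if host_count < 1:
        raise ValueError("host_count must be at least 1")
    hosts = [0.0] * host_count
    for index, load in enumerate(loads):
        hosts[index % host_count] += load
    return hosts


# Ten servers at 10 percent each, consolidated onto two hosts.
per_host = consolidate([10.0] * 10, 2)
print(per_host)  # [50.0, 50.0]
```

Real workloads peak at different times and hypervisors take their own share, so treat the result as a rough starting point rather than a sizing guarantee.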

More reliability

It's important to note, before any discussion of reliability comes into play, that a virtualized operating system is, by nature, relatively hardware agnostic. This means that it (its image, that is) can easily be moved from one piece of hardware to another, even if that hardware is of completely different design, without modification and often without shutting the system down (i.e., live migration). This can dramatically reduce the time required to bring a failed system back up, as the typical 2-4 hour OS reinstall phase can be eliminated.

However, virtualization, by its very design, dramatically increases the impact of a single system failure, as a variety of services will be impacted when multiple virtual servers go down simultaneously. This is where the "designed properly" comes into play.

In the simplest sense, it can be argued that less total hardware means a lower likelihood of failure, but this argument doesn't get very far with me. Too often, people virtualize individual machines that were never designed to be virtualized, using their internal storage as a repository for their virtual machine images. Basically, they set themselves up for a huge catastrophe should a hardware failure occur. This sort of setup is actually less reliable than individual machines, and introduces more risk than one should be willing to accept in a mission-critical environment.

Properly designed, however, a virtualized infrastructure can provide far greater reliability and less downtime than an infrastructure of individual machines could ever achieve. The keys to the design are redundancy and shared storage. To realize the true reliability benefits of server virtualization, every piece of server hardware must be redundantly linked to a properly designed SAN or other shared storage device, where all virtual machine images are stored.
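As a rough illustration of those two design rules, here is a minimal Python sketch that flags hosts breaking either one. The `Host` fields, the host names, and the two-link threshold are assumptions made for the example, not part of any real inventory tool.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    storage_links: int          # independent paths to shared storage
    images_on_local_disk: bool  # True if VM images live on internal disks


def design_problems(hosts):
    """Return a list of human-readable problems with the given hosts."""
    problems = []
    for host in hosts:
        if host.images_on_local_disk:
            problems.append(f"{host.name}: VM images on internal storage")
        if host.storage_links < 2:
            problems.append(f"{host.name}: no redundant link to shared storage")
    return problems


# Hypothetical inventory: one well-connected blade, one standalone box.
inventory = [
    Host("blade1", storage_links=2, images_on_local_disk=False),
    Host("standalone", storage_links=1, images_on_local_disk=True),
]
for problem in design_problems(inventory):
    print(problem)
```

An empty list means every host keeps its images on shared storage and has at least two paths to reach it.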

Hardware agnosticism combined with reliable shared storage creates a powerful combination of resources for increased uptime. Let me give you a real world example in a story from our district:

Three or four months after we had completed the virtualization of our infrastructure onto a blade system attached to shared storage (a Fibre Channel SAN), one of the fiber links to the SAN failed on one of the blades. Since we had redundant links, the backup link picked right up and the blade continued to run, but we still had a hardware issue to deal with. Had we been in a non-virtualized environment, we would have taken the all too common path of: 1) alerting our users that we were going to be taking the system down after hours, 2) shutting down the server and, unfortunately, all its services for a time, 3) pulling the box out to effect repairs and/or reinstalls/restores, 4) reinstalling the box in the rack and booting the system, 5) letting everyone know it's back up after a potentially significant period of downtime (and that we're truly sorry for the inconvenience), and 6) not spending time with our families.

However, since we were in a properly designed, virtualized environment, we simply: 1) migrated the virtual machines (four of them) live to other blades, right in the middle of the day, 2) shut down and pulled the bad blade, 3) replaced the part, 4) reinstalled the blade and fired it up, 5) live migrated the virtual machines back to the blade, and 6) arrived at home on time that night to have dinner with our families. And best of all, we didn't have to apologize to anybody.
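The first step of that procedure, emptying a blade before pulling it, can be sketched as a small planning function in Python. This only models the decision of where each virtual machine should go; it does not talk to any hypervisor. The blade and VM names, the per-host VM limit, and the "least loaded host first" rule are all assumptions for illustration.

```python
def plan_evacuation(hosts, failing, capacity):
    """Return a list of (vm, source, target) moves that empties `failing`.

    hosts maps host name -> list of VM names; capacity is the maximum
    number of VMs any one host may run (a deliberately crude limit).
    """
    # Work on a copy so the caller's inventory is left untouched.
    placement = {name: list(vms) for name, vms in hosts.items()}
    moves = []
    for vm in list(placement[failing]):
        candidates = [h for h in placement
                      if h != failing and len(placement[h]) < capacity]
        if not candidates:
            raise RuntimeError(f"no spare capacity for {vm}")
        # Prefer the host currently running the fewest VMs.
        target = min(candidates, key=lambda h: len(placement[h]))
        placement[failing].remove(vm)
        placement[target].append(vm)
        moves.append((vm, failing, target))
    return moves


# Hypothetical blade chassis: blade1 has the failed link and four VMs.
blades = {
    "blade1": ["vm-a", "vm-b", "vm-c", "vm-d"],
    "blade2": ["vm-e"],
    "blade3": ["vm-f", "vm-g"],
}
for vm, source, target in plan_evacuation(blades, "blade1", capacity=5):
    print(f"migrate {vm}: {source} -> {target}")
```

Moving the virtual machines back after the repair is the same plan in reverse. Note that the function raises an error rather than overloading a host, which mirrors the point that spare capacity has to be designed in ahead of time.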

Such a reliable design can even make one rethink the need for fail-over clustering, which I know many of us use. In many cases, clustering may simply be unnecessary. But just think of what you can do in a clustered environment with the ability to move cluster nodes from box to box...

Greater flexibility

Finally, and perhaps most importantly, virtualization provides flexibility, or what I like to call an agile infrastructure. I've already described some of that flexibility in the reliability section - moving virtual machines live from box to box. Imagine, for example, that one of your virtual machines is consuming too many resources on the box it's on, let's say processor time. People are complaining that things are slowing down. You say, "no problem," and move the virtual machine to a box with a faster processor. Or, you take advantage of virtual SMP, and simply assign another processor to the virtual machine. Ever needed to add more RAM to a server because a process has outgrown its allocation? No problem - simply allocate more RAM to the virtual machine. No pulling the server, no extended periods of downtime. Xen virtualization technologies will even let you add CPUs and change RAM configurations live, without taking the virtual machine down.

Deployments are equally easy. Once you have one image of an OS, you know that it will work on any hardware, so you never again have to sit and watch an installer run, followed by endless online updates. Simply copy the image and fire it up - you're ready to install that new app in less than 5 minutes. How much are you paying people to do this sort of thing, when they could be working on more important things, like innovating?

Conclusion

All in all, I hope this has served to help clarify the value of virtualization. There are certainly many other benefits and issues that could be discussed, but I think I'll head home instead, and spend time with my family.

Cheers!

9 comments:

  1. Very interesting indeed, it summarizes well the benefits of virtualization, especially backed with some field experience. I can't help thinking about how much live, portable virtual machines can change things, like moving a virtual machine (and its services) from one place to the opposite side of the globe (with high-speed networks), in the case of a huge disaster or simply because the customers of the service(s) have moved geographically (being closer to the users means less network use). Hardware independence is a precondition for this, and will surely find other uses. Thanks for this article.

  2. Very nicely written and thought out. Thanks.

  3. Superb article! I will be migrating all servers and your article has helped a lot.

  4. This is the 101 I have been looking for. Great article. Goes beyond 101.

  5. I had no idea what virtualization is for, until now. Thanks for the article.

  6. Love this article, thank you for the insight.

  7. Yes, it is truly amazing to have access to such important summarized information like this on the fly. Very well laid out in an easy to understand format. Thanks again

  8. Feel free!

  9. Any objections for me to post a link to your article?
