Step 11: Design Fault Tolerance

02/25/2008

Published: November 12, 2007 | Updated: February 25, 2008

In step 5, information was collected on the requirements for all the workloads that IT will support in the virtual infrastructure. In this step, that information is used to design the fault tolerance solutions for the server virtualization host hardware.

Task 1: Apply Load Balancing

For each application identified in step 5 that requires network load balancing, determine how many additional VM instances of the application are needed, and then map them onto physical host systems in the same way as the original VMs were mapped in step 9. Ideally, load-balanced VMs should be distributed across multiple hosts so that the failure of a single host cannot take down all instances of the load-balanced application. Each load-balanced VM adds at least one more VM, and often several, to the list of VMs that need to be placed in the infrastructure.
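The distribution rule in Task 1 can be illustrated with a minimal Python sketch. It is not part of the original guidance: the function names, the VM naming scheme, and the round-robin strategy are assumptions made for illustration, and the sketch deliberately ignores host capacity, which the mapping in step 9 must still account for.

```python
from collections import defaultdict


def place_replicas(app_replicas, hosts):
    """Spread each application's VM instances across hosts round-robin.

    app_replicas: dict of application name -> number of VM instances (hypothetical input)
    hosts: list of physical host names (hypothetical input)
    Returns a dict of host name -> list of VM names placed on that host.
    """
    placement = defaultdict(list)
    offset = 0
    for app, count in app_replicas.items():
        for i in range(count):
            # Consecutive instances of one application land on consecutive hosts.
            host = hosts[(offset + i) % len(hosts)]
            placement[host].append(f"{app}-vm{i + 1}")
        # Start the next application where this one stopped to even out the load.
        offset += count
    return dict(placement)


def survives_single_host_failure(placement, app):
    """Return True if the application's instances sit on at least two hosts."""
    hosts_used = {
        host
        for host, vms in placement.items()
        if any(vm.startswith(app + "-vm") for vm in vms)
    }
    return len(hosts_used) >= 2


plan = place_replicas({"web": 3, "portal": 2}, ["hostA", "hostB", "hostC"])
for host, vms in sorted(plan.items()):
    print(host, vms)
```

The check in survives_single_host_failure is the part that corresponds to the guidance: if it returns False for a load-balanced application, every instance shares one host and the load balancing offers no protection against that host failing.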

Task 2: Plan for Host Clustering

In step 9, all the applications were mapped onto physical host servers. Count the number of physical servers that host applications requiring a Microsoft Cluster Service (MSCS) cluster or host cluster, as defined in step 5. This count is the number of active cluster nodes required. Then build an MSCS cluster plan that supports this number of active nodes. Each MSCS failover cluster requires at least one additional physical host server to act as the passive node in the cluster.

Validating with the Business

As in preceding steps related to fault tolerance, it is important to validate decisions based on input from the business. Questions should include:

Are there nontechnical availability needs for specific applications? Some users might require a way to access applications while disconnected from the network or while working at occasionally connected remote sites. These considerations can affect the host fault tolerance design.
What are the future capacity needs for the virtual infrastructure? Conferring with the business helps in planning for future expansion.
Will future application versions provide more availability options? Newer versions of applications might provide enhanced fault tolerance capabilities that can allow less expensive options to be used.

Decision Summary

Host clustering can provide a simple method of ensuring that many types of VMs remain available even after the failure of a host system. The costs of implementing host clustering, however, can be significant because of hardware and storage requirements. Additionally, unused capacity on standby nodes and servers lowers overall hardware resource utilization.
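The effect of standby capacity on utilization can be made concrete with a small calculation. The Python sketch below is a simplification under stated assumptions: it treats all hosts as identical and every active node as fully used, so the result is an upper bound rather than a measured figure. The function name is hypothetical.

```python
def max_host_utilization(active_nodes, passive_nodes):
    """Upper bound on the fraction of cluster hosts doing useful work.

    Assumes identical hosts and fully used active nodes (illustrative only).
    """
    total = active_nodes + passive_nodes
    if total == 0:
        raise ValueError("cluster has no nodes")
    return active_nodes / total


# One active node with one passive node: at most half the hardware is in use.
print(max_host_utilization(1, 1))
# Three active nodes sharing one passive node: at most three quarters is in use.
print(max_host_utilization(3, 1))
```

Under these assumptions, sharing one passive node among more active nodes raises the bound, which is one way to weigh the hardware cost against the availability benefit.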

Additional Reading

Virtual Server Host Clustering Step-by-Step Guide for Virtual Server 2005 R2 at https://www.microsoft.com/downloads/details.aspx?FamilyID=09cc042b-154f-4eba-a548-89282d6eb1b3&displaylang=en provides information about implementing host-level clustering in Virtual Server 2005.
The Microsoft Knowledge Base article, “Requirements for configuring clustering in Virtual Server 2005,” at https://support.microsoft.com/kb/840192 provides implementation details for enabling clustering.

This accelerator is part of a larger series of tools and guidance from Solution Accelerators.