Achieving Fault Tolerance by Using Clustering

Article
10/08/2009

Applies To: Windows Server 2003, Windows Server 2003 R2, Windows Server 2003 with SP1, Windows Server 2003 with SP2

Many organizations require that critical data be continuously available. Cluster technology provides a means of configuring storage to help meet that goal. Simply put, a cluster is two or more computer systems that act and are managed as one. Clients access the cluster by using a single host name or IP address, and each request is answered by one of the systems in the cluster.

The purpose of cluster technology is to eliminate single points of failure. When availability of data is your paramount consideration, clustering is ideal. A cluster protects against all of the following single points of failure:

Network card failure
Processor failure
Motherboard failure
Power failure
Cable failure
Storage adapter failure

With a cluster, you can protect against nearly any hardware failure that affects a single computer. If hardware in one system fails, another system in the cluster automatically takes over. Two types of clustering solutions that accomplish this are server clusters and Network Load Balancing clusters. Both types of clustering are available on Windows Server 2003, Enterprise Edition and Windows Server 2003, Datacenter Edition. In addition, Network Load Balancing clusters are available on Windows Server 2003, Web Edition and Windows Server 2003, Standard Edition.
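The edition support described above can be captured as a small lookup table. The following is a minimal sketch; the dictionary layout, the short keys "nlb" and "server_cluster", and the function name supports are illustrative assumptions, not part of any Windows tool or API.

```python
# Cluster types available per Windows Server 2003 edition, as stated in the text.
# The key names and this structure are hypothetical, chosen for illustration.
CLUSTER_SUPPORT = {
    "Web Edition": {"nlb"},
    "Standard Edition": {"nlb"},
    "Enterprise Edition": {"nlb", "server_cluster"},
    "Datacenter Edition": {"nlb", "server_cluster"},
}


def supports(edition, cluster_type):
    """Return True if the given edition offers the given cluster type."""
    return cluster_type in CLUSTER_SUPPORT.get(edition, set())
```

For example, supports("Standard Edition", "server_cluster") returns False, because server clusters require Enterprise Edition or Datacenter Edition.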

Server Clusters

Server clusters are often implemented to provide high availability for applications that need both read and write access to data, such as database, e-mail, and file servers. Server clusters can be configured with up to eight computers, or nodes, participating in the cluster. To share the same data source, server cluster nodes connect to external disk arrays by using either a SCSI or Fibre Channel connection. Fibre Channel is required for connecting clusters of three or more nodes to shared storage. For the 64-bit versions of Windows Server 2003, you must always use Fibre Channel hardware to connect the nodes to shared storage.
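The interconnect rules in the preceding paragraph reduce to a short decision function. This is a sketch under the assumptions stated in the comments; the function name and the constant are hypothetical and exist only to illustrate the rule.

```python
# Upper limit on server cluster nodes, taken from the text above.
MAX_SERVER_CLUSTER_NODES = 8


def allowed_interconnects(node_count, is_64_bit):
    """Return the shared-storage interconnects permitted for a server cluster.

    Encodes only the rules stated in the text: Fibre Channel is required for
    three or more nodes and for all 64-bit versions; otherwise SCSI or
    Fibre Channel may be used.
    """
    if node_count < 1 or node_count > MAX_SERVER_CLUSTER_NODES:
        raise ValueError("a server cluster supports 1 to 8 nodes")
    if is_64_bit or node_count >= 3:
        return {"Fibre Channel"}
    return {"SCSI", "Fibre Channel"}
```

A two-node 32-bit cluster can therefore use either interconnect, while a four-node cluster, or any 64-bit cluster, is limited to Fibre Channel.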

When planning to deploy server clusters in your storage solution, you must take into account the following considerations:

The boot and system disks of each cluster node must not be located on the same storage bus as the shared storage devices, unless you use a Storport driver for your HBAs.
Shared cluster disks cannot be configured as dynamic disks.
Shared cluster disks must be configured as basic disks and formatted with the NTFS file system.
For the 64-bit versions of the Windows Server 2003 family, the shared cluster disks must be partitioned as master boot record (MBR) and not as GUID partition table (GPT) disks.
You cannot use Remote Storage with shared cluster storage.
You should not enable write caching on shared cluster disks unless they are logical units on an external RAID subsystem that has proper power protection (such as multiple power supplies, multiple feeds from the power grid, or adequate battery backup).
Because cluster disks must be basic disks, you cannot use software RAID. For disk fault tolerance, you must use a hardware-based RAID solution.
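Several of the considerations above can be checked mechanically during planning. The following sketch assumes a hypothetical SharedDisk record that you fill in by hand from your own inventory; it does not query Windows, and the field names are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class SharedDisk:
    """Planning record for one shared cluster disk (hypothetical structure)."""
    name: str
    disk_type: str                 # "basic" or "dynamic"
    file_system: str               # for example "NTFS"
    partition_style: str           # "MBR" or "GPT"
    write_caching: bool            # True if write caching is enabled
    protected_external_raid: bool  # logical unit on a power-protected external RAID


def check_shared_disk(disk, is_64_bit):
    """Return a list of rule violations for a shared cluster disk."""
    problems = []
    if disk.disk_type != "basic":
        problems.append("must be a basic disk, not a dynamic disk")
    if disk.file_system != "NTFS":
        problems.append("must be formatted with NTFS")
    if is_64_bit and disk.partition_style != "MBR":
        problems.append("must be partitioned as MBR on 64-bit versions")
    if disk.write_caching and not disk.protected_external_raid:
        problems.append("write caching requires a power-protected external RAID")
    return problems
```

An empty list means the disk passes the checks encoded here. The sketch does not cover the storage bus placement of boot and system disks or the Remote Storage restriction, which still need to be verified separately.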

For more information about server clusters, see "Designing and Deploying Server Clusters" in this book.

Network Load Balancing Clusters

Each host in a Network Load Balancing cluster maintains its own local copy of data, which makes these clusters ideal for load balancing access to static data, such as Web pages. Up to 32 computers can participate in a Network Load Balancing cluster. Because each host manages its own local data, Network Load Balancing clusters are much easier to plan and implement than server clusters. By using the Network Load Balancing Manager, you can quickly configure all Network Load Balancing clusters in your enterprise from a single server.

Network Load Balancing clusters are the best choice for several data availability needs. For any server that has difficulty meeting the load demands of its clients, Network Load Balancing is an ideal solution. It is commonly used to provide fault tolerance and load balancing for:

Web Servers
FTP Servers
Streaming Media Servers
VPN Servers
Terminal Servers

For each of these, Network Load Balancing is ideal, not only because it is easy to implement, but also because a Network Load Balancing cluster scales easily as your company grows. Because these clusters are simple to scale, your initial estimate of the number of servers you will require need not be perfect. As the load on a Network Load Balancing cluster grows, you can balance the increased load by adding hosts to the cluster.

Because each Network Load Balancing cluster host maintains its own local copy of storage, storage planning with Network Load Balancing clusters is not as complex as with server clusters. Many of the disk restrictions of server clusters do not apply to Network Load Balancing clusters. The general storage considerations for Network Load Balancing cluster planning are:

Network Load Balancing cluster hosts can use any local storage space, including space on boot or system volumes.
Local storage can consist of basic and dynamic disks.
Hardware or software RAID can be used to add fault tolerance. If the Network Load Balancing cluster services a high level of traffic and you need disk fault tolerance, you must use hardware RAID.
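The RAID guidance in the last consideration can be expressed as a simple selection rule. This is a sketch only: the function name is hypothetical, and what counts as a high level of traffic is a judgment the text leaves to the planner.

```python
def nlb_raid_options(high_traffic, needs_disk_fault_tolerance):
    """Return the RAID choices open to a Network Load Balancing host.

    Encodes only the rule stated in the text: hardware or software RAID
    may be used, but a high-traffic cluster that needs disk fault
    tolerance must use hardware RAID.
    """
    if not needs_disk_fault_tolerance:
        return ["no RAID", "software RAID", "hardware RAID"]
    if high_traffic:
        return ["hardware RAID"]
    return ["software RAID", "hardware RAID"]
```

For a lightly loaded host that needs disk fault tolerance, either form of RAID remains an option; only the combination of high traffic and a fault tolerance requirement narrows the choice to hardware RAID.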

For more information about Network Load Balancing clusters, see "Designing Network Load Balancing" and "Deploying Network Load Balancing" in this book.
