Server cluster Architecture

Applies To: Windows Server 2003, Windows Server 2003 R2, Windows Server 2003 with SP1, Windows Server 2003 with SP2

This section discusses Server cluster and how to configure it to provide failover support for applications and services. It also covers resource groups, cluster storage devices, network configuration, and storage area networks.

Server cluster

Server cluster is used to provide failover support for applications and services. A Server cluster can consist of up to eight nodes, each attached to one or more cluster storage devices. Cluster storage devices allow different servers to share the same data; because each node can read this shared data, any node can take over the resources of a failed node.

Connecting Storage Devices

The preferred technique for connecting storage devices is Fibre Channel.

  • When using three or more nodes, Fibre Channel is the only technique that should be used.

  • When using two-node clustering with Advanced Server, either SCSI or Fibre Channel can be used to connect to the storage devices.

Configuring Server clusters

Server clusters can be set up in many different configurations. Servers can be either active or passive, and different servers can be configured to take over the failed resources of another server. Failover can take several minutes, depending on the configuration and the application being used, but is designed to be transparent to the end user.

Server cluster and Failover

When a node is active, it makes its resources available. Clients access these resources through dedicated virtual servers.

Server cluster uses the concept of virtual servers to specify groups of resources that fail over together. When a server fails, the group of resources configured on that server for clustering fails over to another server. The server that handles the failover should be configured with the extra capacity needed to handle the additional workload. When the failed server comes back online, Server cluster can be configured to fail back to the original server or to allow the current server to continue processing requests.


Figure 6: Multi-node clusters with all nodes active

Figure 6 above shows a configuration where all nodes in a database cluster are active and each node has a separate resource group. With a partitioned view of the database, each resource group could handle different types of requests. The types of requests handled could be based on one or more factors, such as the name of an account or geographic location. In the event of a failure, each node is configured to fail over to the next node in turn.
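The configuration in Figure 6 can be sketched in code. The following is an illustrative model, not a real cluster API: requests are routed to resource groups by a stable hash of the account name (one of the factors mentioned above), and a failed node's groups move to the next online node in turn.

```python
# Hypothetical sketch of the active/active partitioning in Figure 6.
# Node names, group names, and the routing function are invented for
# illustration; they are not part of any Windows clustering interface.

NODES = ["NodeA", "NodeB", "NodeC", "NodeD"]

def hash_account(name):
    # Deterministic hash (Python's built-in hash() is salted per process).
    return sum(name.encode())

class Cluster:
    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.online = {n: True for n in nodes}
        # One resource group per node; each group owns a data partition.
        self.group_owner = {f"Group{i}": n for i, n in enumerate(nodes)}

    def route(self, account):
        # Route a request to the node owning the account's partition.
        group = f"Group{hash_account(account) % len(self.nodes)}"
        return self.group_owner[group]

    def next_online(self, node):
        # Each node fails over to the next online node in turn.
        i = self.nodes.index(node)
        for step in range(1, len(self.nodes)):
            candidate = self.nodes[(i + step) % len(self.nodes)]
            if self.online[candidate]:
                return candidate
        raise RuntimeError("no online nodes")

    def fail(self, node):
        # Move every resource group owned by the failed node.
        self.online[node] = False
        for group, owner in self.group_owner.items():
            if owner == node:
                self.group_owner[group] = self.next_online(node)
```

After a failure, requests for the same account transparently reach the node that took over the group, which mirrors the end-user transparency described earlier.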

Resource Groups

Resources that are related or dependent on each other are associated through resource groups. Only applications that need high availability should be part of a resource group. Other applications can run on a server cluster, but don't need to be a part of a resource group. Before adding an application to a resource group, IT staff must determine if the application can work within the cluster environment.

Cluster-Aware Applications. Applications that can work within the cluster environment and support cluster events are called cluster-aware. Cluster-aware applications can register with the Server cluster to receive status and notification information.

Cluster-Unaware Applications. Applications that do not support cluster events are called cluster-unaware. Some cluster-unaware applications can be assigned to resource groups and can be failed over.

Applications that meet the following criteria can be assigned to resource groups.

  • IP-based protocols are used for cluster communications. The application must use an IP-based protocol for its network communications. Applications cannot use NetBEUI, IPX, AppleTalk, or other non-IP protocols to communicate.

  • Nodes in the cluster access application data through shared storage devices. If the application is not able to store its data in a configurable location, the application data will not be available on failover.

  • Clients can recover from a temporary loss of network connectivity. Failover briefly interrupts client connections to the cluster. If client applications cannot retry and recover from this interruption, they will cease to function normally.
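The third criterion can be sketched from the client's side. A minimal retry loop, assuming a hypothetical `connect_fn` callback and illustrative timings (not part of any real cluster API):

```python
# Sketch of the client-side behavior failover requires: connections to the
# virtual server drop briefly while the resource group moves, so a
# well-behaved client retries with a delay instead of failing outright.

import time

def call_with_retry(connect_fn, attempts=5, delay=1.0):
    """Retry a connection attempt, tolerating the brief outage of a failover."""
    last_error = None
    for attempt in range(attempts):
        try:
            return connect_fn()
        except ConnectionError as exc:
            last_error = exc
            time.sleep(delay)  # wait for the group to come online elsewhere
    raise last_error
```

A client built this way sees failover as a short delay rather than a hard failure, which is the transparency the cluster is designed to provide.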

New Features for Resources and Resource Types

Windows Server 2003 adds new features for resources and resource types. A new resource type allows applications to be made cluster-aware using VBScript and JScript. Additionally, Windows Management Instrumentation (WMI) can be used for cluster management and event notification.

Architecting Resource Groups

When architecting resource groups, IT staff should list all server-based applications and services that will run in the cluster environment, regardless of whether they will need high availability. Afterward, divide the list into three sections:

  • Those that need to be highly available

  • Those that are not part of the cluster and on which clustered resources do not depend

  • Those that run on the cluster servers but do not support failover, and on which the cluster may depend

Applications and services that need to be highly available should be placed into resource groups. Other applications should be tracked, and their interactions with clustered applications and services should be clearly understood. Failure of an application or service that is not part of a resource group should not impact the core functions of the solution being offered. If it does, the application or service may need to be clustered.

Note

In the case of dependent services that do not support clustering, IT staff may want to provide backup planning in case these services fail, or may want to attempt to make the services cluster-aware using VBScript and JScript. Remember that only Windows Server 2003 supports this feature.

Focus on selecting the right hardware to meet the needs of the service offering. A cluster model should be chosen that adequately supports resource failover and the availability requirements. Based on the model chosen, excess capacity should be added to ensure that storage, processor, and memory are available in the event a resource fails and failover substantially increases a server's workload.

With a clustered SQL Server configuration, IT staff should consider using high-end CPUs, fast hard drives and additional memory. SQL Server 2000 and standard services together use over 100 megabytes (MB) of memory as a baseline. User connections consume about 24 kilobytes (KB) each. While the minimum memory for query execution is one MB of RAM, the average query may require two to four MB of RAM. Other SQL Server processes use memory as well.
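The figures above can be combined into a rough capacity estimate. A minimal sketch, where the workload numbers (connection and query counts) are illustrative assumptions:

```python
# Back-of-the-envelope SQL Server memory estimate using the figures above:
# ~100 MB baseline, ~24 KB per user connection, 2-4 MB per average query.
# This ignores other SQL Server processes, so treat it as a floor, not a sizing.

KB = 1024
MB = 1024 * KB

def estimate_sql_memory(connections, concurrent_queries, per_query_mb=3):
    baseline = 100 * MB                                  # SQL Server 2000 + standard services
    conn_mem = connections * 24 * KB                     # ~24 KB per user connection
    query_mem = concurrent_queries * per_query_mb * MB   # 2-4 MB per average query
    return baseline + conn_mem + query_mem

# e.g. 500 connections and 50 concurrent queries:
needed = estimate_sql_memory(500, 50)
print(f"{needed / MB:.0f} MB")  # → prints "262 MB"
```

A failover node that must absorb a second instance's workload would need roughly this much additional headroom on top of its own baseline.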

Optimizing Cluster Storage Devices

Cluster storage devices should be optimized based on performance and availability needs. While the Windows Datacenter Hardware Compatibility List provides a detailed list of acceptable Redundant Array of Independent Disks (RAID) configurations for clusters, Table 2 below provides an overview of common RAID configurations. The table entries are organized from the highest RAID level to the lowest.

Table 2 RAID Configurations

RAID 5+1: Disk striping with parity + mirroring. Six or more volumes, each on a separate drive, are configured identically as a mirrored stripe set with parity error checking. Provides a very high level of fault tolerance but has a lot of overhead.

RAID 5: Disk striping with parity. Three or more volumes, each on a separate drive, are configured as a stripe set with parity error checking. In the case of failure, data can be recovered. Provides fault tolerance with less overhead than mirroring and better read performance than disk mirroring.

RAID 1: Disk mirroring. Two volumes on two drives are configured identically. Data is written to both drives. If one drive fails, there is no data loss because the other drive contains the data. (Does not include disk striping.) Provides redundancy with better write performance than disk striping with parity.

RAID 0+1: Disk striping with mirroring. Two or more volumes, each on a separate drive, are striped and mirrored. Data is written sequentially to drives that are identically configured. Provides redundancy with good read/write performance.

RAID 0: Disk striping. Two or more volumes, each on a separate drive, are configured as a stripe set. Data is broken into blocks, called stripes, and then written sequentially to all drives in the stripe set. Provides speed and performance without data protection.
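As a rough illustration of the overhead trade-offs in Table 2, the sketch below estimates usable capacity for each RAID level, assuming n identical drives. These are common rules of thumb, not vendor-specific behavior; actual overhead varies by controller.

```python
# Usable-capacity rules of thumb for the RAID levels in Table 2,
# assuming `drives` identical disks of `drive_size` each.

def usable_capacity(level, drives, drive_size):
    if level == "0":      # striping: all capacity, no protection
        return drives * drive_size
    if level == "1":      # mirroring: two drives hold one copy
        return drive_size
    if level == "5":      # striping with parity: one drive's worth of parity
        return (drives - 1) * drive_size
    if level == "0+1":    # striped mirrors: half the raw capacity
        return drives * drive_size // 2
    if level == "5+1":    # mirrored RAID 5: mirroring halves it, parity takes one more
        return (drives // 2 - 1) * drive_size
    raise ValueError(f"unknown RAID level: {level}")
```

For example, six 100 GB drives yield 600 GB at RAID 0 but only 200 GB at RAID 5+1, which is the "lot of overhead" the table notes for the highest fault-tolerance option.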

Optimizing Network Configuration

The network configuration of the cluster can also be optimized. All nodes in a cluster must be a part of the same domain and can be configured as domain controllers or member servers. Ideally, multi-node clusters will have at least two nodes that act as domain controllers and provide failover for critical domain services. If this is not the case, the availability of cluster resources may be tied to the availability of the controllers in the domain.

Private and Public Network Addresses

Typically, nodes in a cluster are configured with both private and public network addresses.

  • Private network addresses are used for node-to-node communications.

  • Public network addresses are used for client-to-cluster communications.

Some clusters may not need public network addresses and instead may be configured to use two private networks. In this case, the first private network is for node-to-node communications and the second private network is for communicating with other servers that are a part of the service offering.

Storage Area Networks

Increasingly, clustered servers and storage devices are connected over SANs. SANs use high-performance interconnections between secure servers and storage devices to deliver higher bandwidth and lower latency than comparable traditional networks. Windows 2000 Datacenter Server and Windows Server 2003 Datacenter Edition implement a feature called Winsock Direct that allows direct communication over a SAN using SAN providers.

SAN providers have user-mode access to hardware transports. When communicating directly at the hardware level, the individual transport endpoints can be mapped directly into the address space of application processes running in user mode. This allows applications to pass messaging requests directly to the SAN hardware interface, which eliminates unnecessary system calls and data copying.

SANs typically use two transfer modes. One mode is for small transfers, which primarily consist of transfer control information. For large transfers, SANs can use a bulk mode whereby data is transferred directly between the local system and the remote system by the SAN hardware interface without CPU involvement on the local or remote system. All bulk transfers are pre-arranged through an exchange of transfer control messages.
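The two-mode behavior described above can be sketched as follows. The byte threshold, channel interface, and setup handshake here are invented for illustration; real SAN providers define their own cutoffs and control messages.

```python
# Illustrative sketch of the two SAN transfer modes: small messages go
# through the normal message path, while large transfers are pre-arranged
# with a control message and then moved in bulk by the hardware interface.

BULK_THRESHOLD = 4096  # bytes; a real provider chooses its own cutoff

def send(channel, payload):
    if len(payload) <= BULK_THRESHOLD:
        # Small transfer: mostly control information, sent directly.
        channel.send_message(payload)
    else:
        # Pre-arrange the bulk transfer via a control message, then let
        # the SAN hardware move the data without CPU involvement.
        token = channel.send_message(b"SETUP")
        channel.bulk_write(token, payload)
```

Splitting traffic this way keeps latency low for control messages while moving large payloads without per-byte CPU cost on either system.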

Other SAN Benefits

In addition to improved communication modes, SANs have other benefits.

  • They allow IT staff to consolidate storage, using a few highly reliable storage devices instead of many separate ones.

  • They also allow IT staff to share storage with non-Windows operating systems, allowing for heterogeneous operating environments.