Cluster Architecture Essentials

Applies To: Windows Server 2003, Windows Server 2003 R2, Windows Server 2003 with SP1, Windows Server 2003 with SP2

This section introduces the concept of clustering and its benefits and limitations. It then goes on to discuss cluster organization, infrastructure scaling, cluster operating modes, and how clustering is used across multiple, geographically dispersed sites.

The Concept of a Cluster

The concept of a cluster involves taking two or more computers and organizing them to work together to provide higher availability, reliability and scalability than can be obtained by using a single system. When failure occurs in a cluster, resources can be redirected and the workload can be redistributed. Typically, the end user experiences a limited failure, and may only have to refresh the browser or reconnect to an application to begin working again.

Cluster Benefits and Limitations

A Server cluster provides high availability by making application software and data available on several servers linked together in a cluster configuration. If one server stops functioning, a process called failover automatically shifts the workload of the failed server to another server in the cluster. The failover process is designed to ensure continuous availability of critical applications and data.
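
As a rough illustration of the failover idea only (this is not the cluster service's implementation, and the node and group names are hypothetical), the following sketch reassigns the resource groups of a failed node to the surviving node with the lightest load:

    # Minimal failover sketch: reassign the groups of a failed node to survivors.
    # Node and group names are illustrative only.

    nodes = {
        "NODE1": {"online": True,  "groups": ["SQL-Group"]},
        "NODE2": {"online": True,  "groups": ["File-Group"]},
        "NODE3": {"online": False, "groups": ["Print-Group"]},  # failed node
    }

    def fail_over(nodes):
        survivors = [n for n, info in nodes.items() if info["online"]]
        for name, info in nodes.items():
            if not info["online"]:
                # Move each group to the survivor currently owning the fewest groups.
                for group in list(info["groups"]):
                    target = min(survivors, key=lambda n: len(nodes[n]["groups"]))
                    nodes[target]["groups"].append(group)
                    info["groups"].remove(group)

    fail_over(nodes)
    print({n: info["groups"] for n, info in nodes.items()})
    # NODE3's Print-Group now runs on whichever survivor had the lightest load.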

While clusters can be designed to handle failure, they are not fault tolerant with regard to user data. The cluster by itself does not guard against loss of a user's work. Typically, the recovery of lost work is handled by the application software; the application software must be designed to recover the user's work, or it must be designed in such a way that the user session state can be maintained in the event of failure.

Solving Three Typical Problems

Clusters can be used to solve three typical problems in a data center environment:

  • Need for High Availability. High availability refers to the ability to provide end user access to a service for a high percentage of scheduled time while attempting to reduce unscheduled outages. A solution is highly available if it meets the organization's scheduled uptime goals. Availability goals are achieved by reducing unplanned downtime and then working to improve the total hours of service operation (a brief downtime-budget calculation follows this list).

  • Need for High Reliability. High reliability refers to the ability to reduce the frequency of system failure, while attempting to provide fault tolerance in case of failure. A solution is highly reliable if it minimizes the number of single points of failure and reduces the risk that failure of a single component/system will result in the outage of the entire service offering. Reliability goals are achieved using redundant, fault tolerant hardware components, application software and systems.

  • Need for High Scalability. High scalability refers to the ability to add resources and computers while attempting to improve performance. A solution is highly scalable if it can be scaled up and out. Individual systems in a service offering can be scaled up by adding more resources (for example, CPUs, memory, disks, etc.). The service can be scaled out by adding additional computers.
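
To make the availability goal concrete, the short sketch below converts an uptime target into an annual downtime budget. The targets shown are examples only, not recommendations.

    # Convert an availability target into an allowable downtime budget.

    HOURS_PER_YEAR = 24 * 365

    def downtime_budget(availability_percent):
        """Return the hours of unplanned downtime allowed per year."""
        return HOURS_PER_YEAR * (1 - availability_percent / 100)

    for target in (99.0, 99.9, 99.99):
        print(f"{target}% availability allows about {downtime_budget(target):.2f} hours of downtime per year")
    # A 99.9% target works out to roughly 8.8 hours of downtime per year.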

A well-designed service solution uses redundant systems and components so that the failure of an individual server does not affect the availability of the entire service.

Limitations

While a well-designed solution can guard against application failure, system failure and site failure, cluster technologies do have limitations. Cluster technologies depend on compatible applications and services to operate properly. The software must respond appropriately when failure occurs. Cluster technology cannot protect against failures caused by viruses, software corruption or human error. To protect against these types of problems, organizations need solid data protection and recovery plans.

Cluster Organization

Clusters are organized in loosely coupled groups that are often referred to as farms or packs. In most cases, as shown in Figure 1 below, front-end and middle-tier services are organized as farms using clones, while back-end and critical support services such as component routing are organized as packs.

IT Staff Considerations

As IT staff architect clustered solutions, they need to look carefully at the cluster organization they plan to use. The goal should be to organize servers according to the way the servers will be used and the applications they will be running. Typically, Web servers, application servers and database servers are all organized differently.


Figure 1: Clusters are organized as farms or packs

Cluster Farm

A farm is a group of servers that run similar services, but do not typically share data. They are called a farm because they handle whatever requests are passed out to them using identical copies of data that is stored locally. Because they use identical copies of data (rather than sharing data), members of a farm operate autonomously and are also referred to as clones.

Front-end Web servers running Internet Information Services (IIS) and using NLB are an example of a farm. With a Web farm, identical data is replicated to all servers in the farm, and each server can handle any request that comes to it using local copies of the data. Because the servers are identical and the data is replicated to all the servers in the Web farm, the servers are also referred to as clones.
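
The sketch below illustrates the farm idea in miniature. It is not how NLB itself distributes traffic (NLB uses its own distributed filtering algorithm on each host); the point is simply that every clone holds the same content locally, so any clone can serve any request. Server names and content are hypothetical.

    # Farm-of-clones sketch: identical local content, any clone can serve any request.

    import itertools

    CONTENT = {"/": "<html>Home</html>", "/catalog": "<html>Catalog</html>"}

    # Ten clones, each with its own local copy of the same content.
    clones = [{"name": f"WEB{i:02d}", "content": dict(CONTENT)} for i in range(1, 11)]

    round_robin = itertools.cycle(clones)

    def handle_request(path):
        clone = next(round_robin)          # pick the next clone in rotation
        return clone["name"], clone["content"].get(path, "404 Not Found")

    for path in ("/", "/catalog", "/"):
        print(handle_request(path))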

Example - A Load Balanced Web Farm

In a load balanced Web farm with ten servers, you could have:

  • Clone 1: Web server using local data

  • Clone 2: Web server using local data

  • Clone 3: Web server using local data

  • Clone 4: Web server using local data

  • Clone 5: Web server using local data

  • Clone 6: Web server using local data

  • Clone 7: Web server using local data

  • Clone 8: Web server using local data

  • Clone 9: Web server using local data

  • Clone 10: Web server using local data

Cluster Pack

A pack is a group of servers that operate together and share partitioned data. They are called a pack because they work together to manage and maintain services. Because members of a pack share access to partitioned data, they have unique operating modes and usually access the shared data on disk drives to which all members of the pack are connected.

Example - A 4-node SQL Server Cluster Pack

An example of a pack is a database Server cluster running SQL Server 2000 and a server cluster with partitioned database views. Members of the pack share access to the data and have a unique chunk of data or logic that they handle, rather than handling all data requests.

In a 4-node SQL Server cluster, the partitioning might work as follows (a minimal routing sketch appears after the list):

  • Database Server 1 may handle accounts that begin with A-F.

  • Database Server 2 may handle accounts that begin with G-M.

  • Database Server 3 may handle accounts that begin with N-S.

  • Database Server 4 may handle accounts that begin with T-Z.
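
A minimal sketch of this kind of partition routing, assuming the alphabetical ranges above (the server names are hypothetical), might look like the following:

    # Route account requests to the pack member that owns the matching partition.
    # Partition ranges follow the A-F / G-M / N-S / T-Z example above.

    PARTITIONS = [
        ("A", "F", "DBSERVER1"),
        ("G", "M", "DBSERVER2"),
        ("N", "S", "DBSERVER3"),
        ("T", "Z", "DBSERVER4"),
    ]

    def owner_for(account_name):
        """Return the pack member responsible for this account's partition."""
        first = account_name[0].upper()
        for low, high, server in PARTITIONS:
            if low <= first <= high:
                return server
        raise ValueError(f"No partition owns accounts starting with {first!r}")

    print(owner_for("Fabrikam"))   # DBSERVER1 (A-F)
    print(owner_for("Northwind"))  # DBSERVER3 (N-S)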

Combining Techniques - A Large-Scale E-Commerce Site

Servers in a tier can be organized using a combination of the above techniques as well. An example of this combination is a large-scale e-commerce site that has middle-tier application servers running Application Center 2000 and CLB.

To configure CLB, two clusters are recommended:

  • The Component Routing Cluster handles the message routing between the front-end Web servers and the application servers.

  • The Application Server cluster activates and runs the components installed on the application servers.

While the Component Routing Cluster could be configured on the Web tier without needing additional servers, a large e-commerce site may want the high availability benefits of a separate cluster. In this case, the routing would take place on separate servers that are clustered using Server cluster. The application servers would then be clustered using CLB.
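
As a rough sketch of this two-cluster arrangement (the server names and the "least busy" selection rule are illustrative; CLB's own routing logic differs), a component activation request might flow like this:

    # Sketch of the two-cluster CLB arrangement: a routing tier picks an
    # application server on which to activate a component.

    component_routing_cluster = ["ROUTER1", "ROUTER2"]       # Server cluster nodes
    application_server_cluster = {                           # CLB cluster members
        "APP1": 3,   # current number of activated components
        "APP2": 1,
        "APP3": 2,
    }

    def activate_component(component_name):
        router = component_routing_cluster[0]                # active routing node
        target = min(application_server_cluster, key=application_server_cluster.get)
        application_server_cluster[target] += 1
        return f"{router} activated {component_name} on {target}"

    print(activate_component("Order.Processor"))   # routed to APP2 (least busy)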

Infrastructure Scaling

With proper architecture, the servers in a particular tier can be scaled out or up as necessary to meet growing performance and throughput needs. Figure 2 below provides an overview of the scalability of Windows clustering technologies.

IT Staff Considerations

As IT staff look at scalability requirements, they must always address the real business needs of the organization. The goal should be to select the right edition of the Windows operating system to meet the current and future needs of the project.

The number of servers needed depends on the anticipated server load, and the size and types of requests the servers will handle. Processors and memory should be sized appropriately for the applications and services the servers will be running, as well as the number of simultaneous user connections.


Figure 2: Windows cluster technologies can be scaled to meet business requirements

Scaling by Adding Servers

When looking to scale out by adding servers to the cluster, the clustering technology and the server operating system used are both important. As Table 1 below shows, the NLB and CLB limits are the same across editions; the key difference in scale-out capability between editions is the number of nodes that can be used with Server cluster.

  • Under Windows 2000, the maximum number of Server cluster nodes is four.

  • Under Windows Server 2003, the maximum number of Server cluster nodes is eight.

Table 1: Cluster Nodes Supported by Operating System and Technology

Operating System Edition                   Network Load Balancing   Component Load Balancing   Server cluster
Windows 2000 Advanced Server               32                       8                          2
Windows 2000 Datacenter Server             32                       8                          4
Windows Server 2003 Enterprise Edition     32                       8                          8
Windows Server 2003 Datacenter Edition     32                       8                          8
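
Table 1 can also be expressed as a simple lookup, which is handy for sanity-checking a planned node count. The figures below mirror the table above; this is a convenience sketch, not an official sizing tool.

    # Maximum supported nodes per clustering technology, mirroring Table 1.

    MAX_NODES = {
        ("Windows 2000", "Advanced Server"):            {"NLB": 32, "CLB": 8, "Server cluster": 2},
        ("Windows 2000", "Datacenter Server"):          {"NLB": 32, "CLB": 8, "Server cluster": 4},
        ("Windows Server 2003", "Enterprise Edition"):  {"NLB": 32, "CLB": 8, "Server cluster": 8},
        ("Windows Server 2003", "Datacenter Edition"):  {"NLB": 32, "CLB": 8, "Server cluster": 8},
    }

    def supports(os_name, edition, technology, planned_nodes):
        return planned_nodes <= MAX_NODES[(os_name, edition)][technology]

    print(supports("Windows 2000", "Advanced Server", "Server cluster", 4))            # False
    print(supports("Windows Server 2003", "Enterprise Edition", "Server cluster", 8))  # True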

Scaling by Adding CPUs and RAM

When looking to scale up by adding CPUs and RAM, the edition of the server operating system used is extremely important.

In terms of both processor and memory capacity, Datacenter Server is much more expandable.

  • Windows 2000 Advanced Server supports up to eight processors and eight gigabytes (GB) of RAM.

  • Windows 2000 Datacenter Server supports up to 32 processors and 64 GB of RAM.

  • Windows Server 2003 Enterprise Edition supports up to eight processors and 32 GB of RAM.

  • Windows Server 2003 Datacenter Edition supports up to 32 processors and 64 GB of RAM.

Thus, organizations typically scale up from Advanced Server or Enterprise Edition to Datacenter Server or Datacenter Edition as their needs change over time.

Cluster Operating Modes

With NLB and CLB, cluster nodes usually are identical clones of each other. Because of this, all members of the cluster can actively handle requests, and they can do so independent of each other. When members of a cluster share access to data, however, they have unique operating requirements, as is the case with Server cluster.

IT Staff Considerations

As IT staff considers the impact of operating modes in their cluster architecture, they need to look carefully at the business requirements and the expected server loads.

With NLB and CLB, all servers are active and the architecture is scaled out by adding additional servers, which typically are configured identically to the existing NLB or CLB nodes.

With Server cluster, nodes can be either active or passive, and the configuration of nodes depends on the operating mode (active or passive), as well as how failover is configured. A server that is designated to handle failover must be sized to handle the workload of the failed server in addition to its own current workload (if any). Additionally, both average and peak workloads must be considered; servers need additional capacity to handle peak loads.

Server cluster Nodes

Server cluster nodes can be either active or passive.

  • Active Node. When a node is active, it is actively handling requests.

  • Passive Node. When a node is passive, it is idle, on standby waiting for another node to fail.

Multi-node clusters can be configured using different combinations of active and passive nodes.

Architecting Multi-node Clusters

When architecting multi-node clusters, the decision as to whether nodes are configured as active or passive is extremely important. To understand why, consider the following:

  • If an active node fails and there is a passive node available, applications and services running on the failed node can be transferred to the passive node. Since the passive node has no current workload, it should be able to assume the workload of the failed server without any problems (provided that all servers have the same hardware configuration).

  • If all servers in a cluster are active and a node fails, the applications and services running on the failed node can be transferred to another active node. Since that server is already active, it will have to handle the processing load of both systems. The server must be sized to handle multiple workloads or it may fail as well.

In a multi-node configuration where there is one passive node for each active node, the servers could be configured so that under average workload they use about 50% of CPU and memory resources.

In the 4-node configuration depicted in Figure 3 below, where failover goes from one active node to a specific passive node, this could mean two active nodes (A1 and A2) and two passive nodes (P1 and P2), each with four processors and 4 GB of RAM. In this example, node A1 fails over to node P1 and node A2 fails over to node P2, with the extra capacity used to handle peak workloads.


Figure 3: Examples of Active/Passive and Active/Active configurations

In a multi-node configuration where there are more active nodes than passive nodes, the servers can be configured so that under average workload they use a proportional percentage of CPU and memory resources.

In the 4-node configuration illustrated in Figure 3 above, where nodes A, B, C, and D are configured as active and failover could occur between nodes A and B or between nodes C and D, this could mean configuring the servers so that they use about 25% of CPU and memory resources under average workload. In this example, node A could fail over to node B (and vice versa), or node C could fail over to node D (and vice versa).

Because the servers in this example would need to handle two workloads in the case of a node failure, the CPU and memory configuration would need to be at least doubled: instead of four processors and 4 GB of RAM, each server might use eight processors and 8 GB of RAM.
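
A rough way to reason about this sizing is sketched below. It is a back-of-the-envelope check only; the 50% threshold reflects the headroom-for-peaks guidance in the examples above, not a fixed rule.

    # Back-of-the-envelope check: after a failover, a surviving node carries its
    # own average load plus that of the failed node.

    def post_failover_utilization(avg_utilization, workloads_after_failover):
        return avg_utilization * workloads_after_failover

    def leaves_peak_headroom(avg_utilization, workloads_after_failover, threshold=0.5):
        return post_failover_utilization(avg_utilization, workloads_after_failover) <= threshold

    # Active/passive pair: the passive node picks up one workload sized at 50% average.
    print(leaves_peak_headroom(0.50, 1))   # True  -> 50% after failover
    # Active/active pair sized at 25% average: 50% after failover, headroom remains.
    print(leaves_peak_headroom(0.25, 2))   # True
    # Active/active pair sized at 40% average: 80% after failover, little peak headroom.
    print(leaves_peak_headroom(0.40, 2))   # False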

Shared-Nothing Database Configuration

When Server cluster has multiple active nodes, data must be shared between applications running on the clustered servers. In most cases, this is handled with a shared-nothing database configuration.

In a shared-nothing database configuration, the application is partitioned to access private database sections. This means that a particular node is configured with a specific view into the database that allows it to handle specific types of requests, such as account names that start with the letters A-F, and that it is the only node that can update the related section of the database. This eliminates the possibility of corruption from simultaneous writes by multiple nodes.

Note

Both Microsoft Exchange 2000 and Microsoft SQL Server 2000 support multiple active nodes and shared-nothing database configurations.

Multiple Sites and Geographically Dispersed Clusters

Most organizations build disaster recovery and increased availability into their infrastructure using multiple physical sites. Multi-site architecture can be designed in many ways. In most cases, the architecture has a primary site and one or more remote sites. Figure 4 below shows an example of a primary site and a remote site for an e-commerce operation.

The architecture at the remote site mirrors that of the primary site. The level of integration for multiple sites, and the extent to which components are mirrored between sites, depend on the service-level agreement and the business requirements.

Full Implementation Design

With a full implementation, the complete infrastructure of the primary site could be recreated at remote sites. This allows a remote site to operate independently, or to handle the full load of the primary site if necessary. In this case, the design should incorporate real-time replication and synchronization for databases and applications.

Real-time replication ensures a consistent state for data and application services between sites. If real-time updates are not possible, databases and applications should be replicated and synchronized as rapidly as possible.


Figure 4: Multiple site architecture

Partial Implementation Design

With a partial implementation, only essential components are installed at remote sites to:

  • Handle overflow in peak periods.

  • Maintain uptime on a limited basis in case the primary site fails.

  • Provide limited services as needed.

One partial implementation technique is to replicate static content on Web sites and read-only data from databases. This allows remote sites to handle requests for static content and other data that changes infrequently. Users could browse sites and access account information, product catalogs, and other services. If they needed to access dynamic content or modify information (add, change, delete, and so on), the site's geographic load balancers could redirect them to the primary site.

Another technique is to implement all layers of the infrastructure but with fewer redundancies in the architecture, or to implement only core components, relying on the primary site to provide the full array of features. With either of these partial implementation techniques, the design may need to incorporate near real-time replication and synchronization for databases and applications. This ensures a consistent state for data and application services.

Geographically Dispersed Clusters

A full or partial design could also use geographically dispersed clusters running Server cluster. Geographically dispersed clusters use virtual LANs to connect storage area networks (SANs) over long distances.

  • A VLAN connection with latency of 500 milliseconds or less ensures that cluster consistency can be maintained.

  • If the VLAN latency is over 500 milliseconds, the cluster consistency cannot be easily maintained.

Geographically dispersed clusters are also referred to as stretched clusters and are available in Windows 2000 and Windows Server 2003.

Majority Node Clustering

Windows Server 2003 offers many improvements in the area of geographically dispersed clusters, including a new type of quorum resource called a majority node set. Majority node clustering changes the way the cluster quorum resource is used. This allows cluster servers to be geographically separated while maintaining consistency in the event of node failure.

With a standard cluster configuration, as illustrated in Figure 5 below, the quorum resource writes information about all cluster database changes to the recovery logs; this ensures that the cluster configuration and state data can be recovered. The quorum resource resides on the shared disk drives and can be used to verify whether other nodes in the cluster are functioning.


Figure 5: Comparing local and geographically dispersed clusters

With a majority node cluster configuration in Windows Server 2003, the quorum resource is configured as a majority node set resource. This new type of quorum resource allows the quorum data, which includes cluster configuration changes and state information, to be stored on the system disk of each node in the cluster. Because the data is localized, even though the cluster itself is geographically dispersed, the cluster can be maintained in a consistent state.

As the name implies, the majority of nodes must be available for this cluster configuration to operate normally. Should the cluster state become inconsistent, IT staff can force the quorum to restore a consistent state. An algorithm also runs on the cluster nodes to help maintain a consistent cluster state.
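
The "majority" rule itself is simple arithmetic. The sketch below illustrates the rule of thumb only; it is not the cluster service's actual quorum algorithm.

    # Majority node set rule of thumb: the cluster keeps running only while more
    # than half of the configured nodes can communicate.

    def has_majority(total_nodes, reachable_nodes):
        return reachable_nodes > total_nodes // 2

    print(has_majority(5, 3))   # True  -> a 5-node cluster survives losing 2 nodes
    print(has_majority(4, 2))   # False -> 2 of 4 is not a majority; the cluster stops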