Example, Clustered Instance of a Service or Application

Updated: October 24, 2008

Applies To: Windows Server 2008

In this example, the fictitious company A. Datum needs to make a database application available to thousands of employees doing critical work for the company. This application needs to be available 99.99% of the time, that is, with less than 1 hour (roughly 53 minutes) of downtime per year. A. Datum creates a failover cluster and configures the database application to run in the cluster, using the network name “Database1.” This means that either of two physical servers (two nodes in the failover cluster) can make Database1 available to clients at any given time. The two nodes in the failover cluster use very similar hardware, run the same version of Windows Server 2008, and have exactly the same software updates (patches).
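The downtime figure follows directly from the availability target. The short Python sketch below works through the arithmetic; the function name is illustrative and not part of any Microsoft tool, and the calculation assumes a 365-day year.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * HOURS_PER_YEAR * 60


budget = allowed_downtime_minutes(99.99)
print(f"{budget:.1f} minutes per year")  # 52.6 minutes per year
```

A 99.99% target therefore leaves a little under 53 minutes per year for all interruptions combined, including any time spent on failover.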

This topic illustrates how the cluster operates during normal conditions, during failover, and during scheduled downtime.

For examples that illustrate other designs, see Evaluating Failover Cluster Design Examples.

When the failover cluster begins providing service, the clustered database server, using a network name of Database1, is owned by Node 1. This is shown in the following diagram.

Clustered database server during normal operation

Node 1 uses a shared bus or iSCSI connection to the cluster storage and has ownership of the disk (or LUN) used by the database. Node 1 also uses a network to send regular signals, called “heartbeat” signals, to Node 2, and receives heartbeat signals from Node 2. In this way, each node can determine whether the other node is functioning and whether it is able to communicate through the network. In addition, both nodes are on at least one network that connects them to clients and to the administrator of the cluster.
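The heartbeat idea can be sketched in a few lines of Python. This is a simplified model, not the cluster service's actual implementation: the class name, the one-second interval, and the threshold of five missed heartbeats are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Tracks when a heartbeat was last received from the peer node."""

    interval_s: float = 1.0    # assumed time between heartbeats
    missed_threshold: int = 5  # assumed number of missed heartbeats tolerated
    last_seen_s: float = 0.0   # timestamp of the most recent heartbeat

    def record_heartbeat(self, now_s: float) -> None:
        self.last_seen_s = now_s

    def peer_is_reachable(self, now_s: float) -> bool:
        # The peer counts as reachable until the silence exceeds the tolerance.
        return (now_s - self.last_seen_s) < self.interval_s * self.missed_threshold


monitor = HeartbeatMonitor()
monitor.record_heartbeat(now_s=10.0)
print(monitor.peer_is_reachable(now_s=12.0))  # True: only 2 seconds of silence
print(monitor.peer_is_reachable(now_s=16.0))  # False: 6 seconds exceeds 5
```

Each node would run a monitor like this for the other node, so that either one can notice when its partner goes silent.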

Not shown in the diagram is the information stored in the cluster configuration that tells the cluster how to start and stop the database services in an orderly way. This information ensures that a database service on which another database service depends is started earlier than the other service. For example, if the database application has a core service on which all other database services depend, the core service will be started before any of the other database services.
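Dependency-ordered startup of this kind amounts to a topological sort. The following Python sketch uses the standard library's `graphlib` module (Python 3.9 or later) to illustrate the idea; the service names are hypothetical, and stopping in the reverse of the start order is an assumption about what "orderly" means rather than something stated in the cluster configuration.

```python
from graphlib import TopologicalSorter

# Hypothetical services; each key maps to the set of services it depends on.
dependencies = {
    "core": set(),
    "query": {"core"},
    "reporting": {"core", "query"},
}

# static_order() yields every service after all of its dependencies.
start_order = list(TopologicalSorter(dependencies).static_order())

# Assumption: an orderly stop is the reverse of the start order.
stop_order = list(reversed(start_order))

print(start_order)  # ['core', 'query', 'reporting']
print(stop_order)   # ['reporting', 'query', 'core']
```

The core service comes first in the start order because every other service depends on it, directly or indirectly.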

At some point, Node 1 develops difficulties and almost stops functioning. Node 1 loses the ability to send regular heartbeat signals across the network to Node 2. The following diagram shows the brief time just after Node 1 stops sending the signals.

Clustered database server at the start of failover

Shortly after heartbeat signals stop arriving from Node 1, Node 2 begins an orderly process of making Database1 available to clients. It brings online an appropriate IP address and the network name “Database1” (so clients can communicate as expected with the database) and obtains access to the appropriate disk (or LUN) in the cluster storage. Then it makes sure that the appropriate files and folders are available and starts database services in an orderly way. The result is that the database application becomes available to clients after a relatively short interruption of service. The following diagram shows failover occurring.

Clustered database server completing failover
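The failover sequence that Node 2 performs can be modeled as an ordered list of steps, each of which must succeed before the next begins. The Python sketch below is only an illustration of that ordering: the `run_failover` function is hypothetical, and each step is a placeholder that always succeeds rather than a real cluster operation.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_failover(steps: List[Step]) -> List[str]:
    """Run steps in order, stopping at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, action in steps:
        if not action():
            break
        completed.append(name)
    return completed


# Placeholder actions; a real cluster would perform the actual work here.
steps: List[Step] = [
    ("bring IP address online", lambda: True),
    ("bring network name Database1 online", lambda: True),
    ("obtain access to the disk (or LUN)", lambda: True),
    ("verify files and folders", lambda: True),
    ("start database services", lambda: True),
]

for name in run_failover(steps):
    print("done:", name)
```

Stopping at the first failed step keeps later steps, such as starting database services, from running before the disk they need is available.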

A similar process can be initiated by a system administrator for scheduled downtime. For example, if Node 1 is running correctly and is the current owner of Database1, but software updates need to be applied to Node 1, the administrator can use the Failover Cluster Management snap-in to deliberately move Database1 to Node 2 so that the software updates can be applied. Of course, when applying software updates to a cluster node, it is important to apply the same updates to other cluster nodes as soon as possible. This ensures that all cluster nodes will consistently respond in the same way.
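The update procedure can be sketched as a loop over the nodes: move the clustered service off a node if that node currently owns it, apply the update, and continue until every node has the same set of updates. The Python model below is a minimal illustration under those assumptions; the function, node names, and update names are hypothetical, and it does not call the Failover Cluster Management snap-in or any other real tool.

```python
from typing import Dict, List, Set, Tuple


def rolling_update(
    owner: str, nodes: List[str], patches: Dict[str, Set[str]], new_patch: str
) -> Tuple[str, Dict[str, Set[str]]]:
    """Apply new_patch to every node, moving the service off each node first."""
    updated = {node: set(installed) for node, installed in patches.items()}
    for node in nodes:
        if owner == node:
            # Planned move: hand the service to another node before updating.
            owner = next(n for n in nodes if n != node)
        updated[node].add(new_patch)
    return owner, updated


nodes = ["Node1", "Node2"]
patches = {"Node1": {"update-A"}, "Node2": {"update-A"}}
owner, patches = rolling_update("Node1", nodes, patches, "update-B")

print(owner)                                # Node1
print(patches["Node1"] == patches["Node2"])  # True
```

In this two-node model the service moves to Node2 while Node1 is updated and back to Node1 while Node2 is updated, so both nodes finish with identical update sets.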
