The most basic type of cluster is a two-node cluster with a single quorum device. For a definition of a single quorum device, see “What Is a Server Cluster?” The following figure illustrates the basic elements of a server cluster, including nodes, resource groups, and the single quorum device (that is, the cluster storage).
Applications and services are configured as resources on the cluster and are grouped into resource groups. Resources in a resource group work together and fail over together when failover is necessary. When you configure a resource group to include not only the elements needed for the application or service but also the associated network name and IP address, that collection of resources runs as if it were a separate server on the network. When a resource group is configured this way, clients can consistently reach the application by using the same network name, regardless of which node the application is running on.
The preceding figure showed one resource group per node. However, each node can have multiple resource groups. Within each resource group, resources can have specific dependencies. Dependencies are relationships between resources that indicate which resources need to come online before another resource can come online. When dependencies are configured, the Cluster service can bring resources online or take them offline in the correct order during failover.
The following figure shows two nodes with several resource groups in which some typical dependencies have been configured between resources. The figure shows that resource groups (not resources) are the unit of failover.
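The dependency ordering described above can be illustrated with a topological sort. The following is a minimal sketch, not the Cluster service's actual algorithm: the resource names and the `online_order` and `offline_order` helpers are hypothetical, and the sketch assumes that taking resources offline simply reverses the online order.

```python
from graphlib import TopologicalSorter

# Hypothetical resource group: each resource maps to the set of
# resources that must be online before it can come online.
dependencies = {
    "Physical Disk": set(),
    "IP Address": set(),
    "Network Name": {"IP Address"},
    "File Share": {"Physical Disk", "Network Name"},
}

def online_order(deps):
    """Return an order in which every resource follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())

def offline_order(deps):
    """Assumption: resources go offline in the reverse of the online order."""
    return list(reversed(online_order(deps)))

order = online_order(dependencies)
print(order)
```

Resources with no dependency between them (here, Physical Disk and IP Address) can appear in either order. Only the relative order of dependent resources is constrained.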
The Cluster service runs on each node of a server cluster and controls all aspects of server cluster operation. The Cluster service includes multiple software components that work together. These components perform monitoring, maintain consistency, and smoothly transfer resources from one node to another.
Diagrams and descriptions of the following components are grouped together because the components work so closely together:
- Database Manager (for the cluster configuration database)
- Node Manager (working with Membership Manager)
- Failover Manager
- Global Update Manager
Separate diagrams and descriptions are provided of the following components, which are used in specific situations or for specific types of applications:
- Checkpoint Manager
- Log Manager (quorum logging)
- Event Log Replication Manager
- Backup and Restore capabilities in Failover Manager
Diagrams of Database Manager, Node Manager, Failover Manager, Global Update Manager, and Resource Monitors
The following figure focuses on the information that is communicated between Database Manager, Node Manager, and Failover Manager. The figure also shows Global Update Manager, which supports the other three managers by coordinating updates on other nodes in the cluster. These four components work together to make sure that all nodes maintain a consistent view of the cluster (with each node of the cluster maintaining the same view of the state of the member nodes as the others) and that resource groups can be failed over smoothly when needed.
Basic Cluster Components: Database Manager, Node Manager, and Failover Manager
The following figure shows a Resource Monitor and resource dynamic-link library (DLL) working with Database Manager, Node Manager, and Failover Manager. Resource Monitors and resource DLLs support applications that are cluster-aware, that is, applications designed to work in a coordinated way with cluster components. The resource DLL for each such application is responsible for monitoring and controlling that application. For example, the resource DLL saves and retrieves application properties in the cluster database, brings the resource online and takes it offline, and checks the health of the resource. When failover is necessary, the resource DLL works with a Resource Monitor and Failover Manager to ensure that the failover happens smoothly.
Resource Monitor and Resource DLL with a Cluster-Aware Application
Descriptions of Database Manager, Node Manager, Failover Manager, Global Update Manager, and Resource Monitors
The following descriptions provide details about the components shown in the preceding diagrams.
Database Manager
Database Manager runs on each node and maintains a local copy of the cluster configuration database, which contains information about all of the physical and logical items in a cluster. These items include the cluster itself, cluster node membership, resource groups, resource types, and descriptions of specific resources, such as disks and IP addresses. Database Manager uses the Global Update Manager to replicate all changes to the other nodes in the cluster. In this way, consistent configuration information is maintained across the cluster even when conditions change, for example, when a node fails and the administrator changes the cluster configuration before that node returns to service.
Database Manager also provides an interface through which other Cluster service components, such as Failover Manager and Node Manager, can store changes in the cluster configuration database. The interface for making such changes is similar to the interface for making changes to the registry through the Windows application programming interface (API). The key difference is that changes received by Database Manager are replicated through Global Update Manager to all nodes in the cluster.
Database Manager functions used by other components
Some Database Manager functions are exposed through the cluster API. The primary purpose for exposing Database Manager functions is to allow custom resource DLLs to save private properties to the cluster database when this is useful for a particular clustered application. (A private property for a resource is a property that applies to that resource type but not to other resource types; for example, the SubnetMask property applies to an IP Address resource but not to other resource types.) Database Manager functions are also used to query the cluster database.
Node Manager
Node Manager runs on each node and maintains a local list of nodes, networks, and network interfaces in the cluster. Through regular communication between nodes, Node Manager ensures that all nodes in the cluster have the same list of functional nodes.
Node Manager uses the information in the cluster configuration database to determine which nodes have been added to the cluster or evicted from the cluster. Each instance of Node Manager also monitors the other nodes to detect node failure. It does this by exchanging messages, called heartbeats, with each node on every available network. If one node detects a communication failure with another node, it broadcasts a message to the entire cluster, causing all nodes that receive the message to verify their list of functional nodes in the cluster. This is called a regroup event.
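Heartbeat-based failure detection of this kind can be sketched as follows. This is an illustration only: the `HeartbeatTracker` class, the timeout value, and the rule that a node is suspected only when it is silent on every network are assumptions, not details taken from the Cluster service.

```python
class HeartbeatTracker:
    """Tracks the last heartbeat seen from each peer node on each network."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds   # assumed value, not from the source
        self.last_seen = {}              # (node, network) -> timestamp

    def record(self, node, network, now):
        """Note that a heartbeat from `node` arrived on `network` at time `now`."""
        self.last_seen[(node, network)] = now

    def suspected_down(self, now):
        """Assumption: a node is suspected only if it is silent on every network."""
        latest = {}
        for (node, _network), seen in self.last_seen.items():
            latest[node] = max(latest.get(node, seen), seen)
        return sorted(n for n, seen in latest.items() if now - seen > self.timeout)

tracker = HeartbeatTracker(timeout_seconds=5.0)
tracker.record("NodeB", "private", now=0.0)
tracker.record("NodeB", "public", now=4.0)
tracker.record("NodeC", "private", now=1.0)
suspects = tracker.suspected_down(now=8.0)
print(suspects)  # ['NodeC']
```

In this sketch, a non-empty result is the point at which a node would broadcast its message and trigger a regroup event.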
Node Manager also contributes to the process of a node joining a cluster. At that time, on the node that is joining, Node Manager establishes authenticated communication (authenticated RPC bindings) between itself and the Node Manager component on each of the currently active nodes.
Note
- A down node is different from a node that has been evicted from the cluster. When you evict a node from the cluster, it is removed from Node Manager’s list of potential cluster nodes. A down node remains on the list of potential cluster nodes even while it is down; when the node and the network it requires are functioning again, the node joins the cluster. An evicted node, however, can become part of the cluster only after you use Cluster Administrator or Cluster.exe to add the node back to the cluster.
Membership Manager
Membership Manager (also called the Regroup Engine) causes a regroup event whenever another node’s heartbeat is interrupted (indicating a possible node failure). During a node failure and regroup event, Membership Manager and Node Manager work together to ensure that all functioning nodes agree on which nodes are functioning and which are not.
Cluster Network Driver
Node Manager and other components make use of the Cluster Network Driver, which supports specific types of network communication needed in a cluster. The Cluster Network Driver runs in kernel mode and provides support for a variety of functions, especially heartbeats and fault-tolerant communication between nodes.
Failover Manager and Resource Monitors
Failover Manager manages resources and resource groups. For example, Failover Manager stops and starts resources, manages resource dependencies, and initiates failover of resource groups. To perform these actions, it receives resource and system state information from cluster components on the node and from Resource Monitors. Resource Monitors provide the execution environment for resource DLLs and support communication between resource DLLs and Failover Manager.
Failover Manager determines which node in the cluster should own each resource group. If it is necessary to fail over a resource group, the instances of Failover Manager on each node in the cluster work together to reassign ownership of the resource group.
Depending on how the resource group is configured, Failover Manager can restart a failing resource locally or can take the failing resource offline along with its dependent resources, and then initiate failover.
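The choice between a local restart and a failover can be sketched as a small decision function. This is a hedged illustration: `plan_recovery`, the restart-count threshold, and the `dependents_of` mapping are hypothetical names introduced here, and the real configuration settings that drive this decision are not described in this section.

```python
def plan_recovery(resource, restarts_so_far, restart_threshold, dependents_of):
    """Return the list of (action, target) steps for a failed resource."""
    if restarts_so_far < restart_threshold:
        return [("restart", resource)]          # try again on the same node

    steps, visited = [], set()

    def take_offline(name):
        if name in visited:
            return
        visited.add(name)
        for dependent in dependents_of.get(name, []):
            take_offline(dependent)             # dependents go offline first
        steps.append(("offline", name))

    take_offline(resource)
    steps.append(("fail over", "resource group"))
    return steps

# Hypothetical mapping: resource -> resources that depend on it.
dependents_of = {"IP Address": ["Network Name"], "Network Name": ["File Share"]}

first_attempt = plan_recovery("IP Address", 0, 3, dependents_of)
after_threshold = plan_recovery("IP Address", 3, 3, dependents_of)
print(first_attempt)
print(after_threshold)
```

Once the assumed threshold is reached, the failing resource and everything that depends on it are taken offline before the group moves to another node.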
Global Update Manager
Global Update Manager makes sure that when changes are copied to each of the nodes, the following takes place:
- Changes are made atomically, that is, either all healthy nodes are updated, or none are updated.
- Changes are made in the order they occurred, regardless of the origin of the change. The process of making changes is coordinated between nodes so that even if two different changes are made at the same time on different nodes, when the changes are replicated they are put in a particular order and made in that order on all nodes.
Global Update Manager is used by internal cluster components, such as Failover Manager, Node Manager, or Database Manager, to carry out the replication of changes to each node. Global updates are typically initiated as a result of a Cluster API call. When an update is initiated by a node, another node is designated to monitor the update and make sure that it happens on all nodes. If the monitoring node cannot make the update locally, it notifies the node that tried to initiate the update, and changes are not made anywhere (unless the operation is attempted again). If the monitoring node can make the update locally, but another node cannot be updated, the node that cannot be updated is removed from the list of functional nodes, and the change is made on the available nodes. If this happens, quorum logging is enabled at the same time, which ensures that the failed node receives all necessary configuration information when it is functioning again, even if the original set of nodes is down at that time.
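The update flow described above can be sketched in a few lines. This is a simplified model under stated assumptions, not the Cluster service implementation: the `Node` class, the `global_update` function, and its return values are hypothetical, and the sketch runs in a single process rather than across a network.

```python
class Node:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.log = []                      # updates applied, in order

    def apply(self, update):
        if not self.healthy:
            raise RuntimeError(f"{self.name} cannot apply the update")
        self.log.append(update)

def global_update(monitor, members, update):
    """Apply `update` on the monitoring node first, then on the other members.

    Returns (applied, functional_nodes, quorum_logging_enabled).
    """
    try:
        monitor.apply(update)
    except RuntimeError:
        return False, list(members), False   # nothing is changed anywhere
    functional = [monitor]
    quorum_logging = False
    for node in members:
        if node is monitor:
            continue
        try:
            node.apply(update)
            functional.append(node)
        except RuntimeError:
            quorum_logging = True            # failed node leaves the functional list
    return True, functional, quorum_logging

a, b, c = Node("A"), Node("B"), Node("C", healthy=False)
applied, functional, quorum_logging = global_update(a, [a, b, c], "set X=1")
print(applied, [n.name for n in functional], quorum_logging)  # True ['A', 'B'] True
```

Because every update in this sketch passes through the monitoring node first, updates that originate on different nodes end up applied in one shared order.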
Diagram and Description of Checkpoint Manager
Some applications store configuration information locally instead of or in addition to storing information in the cluster configuration database. Applications might store information locally in two ways. One way is to store configuration information in the registry on the local server; another way is to use cryptographic keys on the local server. If an application requires that locally-stored information be available on failover, Checkpoint Manager provides support by maintaining a current copy of the local information on the quorum resource.
The following figure shows the Checkpoint Manager process.
Checkpoint Manager
Checkpoint Manager handles application-specific configuration data that is stored in the registry on the local server somewhat differently from configuration data stored using cryptographic keys on the local server. The difference is as follows:
- For applications that store configuration data in the registry on the local server, Checkpoint Manager monitors the data while the application is online. When changes occur, Checkpoint Manager updates the quorum resource with the current configuration data.
- For applications that use cryptographic keys on the local server, Checkpoint Manager copies the cryptographic container to the quorum resource only once, when you configure the checkpoint. If changes are made to the cryptographic container, the checkpoint must be removed and re-associated with the resource.
Before a resource configured to use checkpointing is brought online (for example, for failover), Checkpoint Manager brings the locally-stored application data up-to-date from the quorum resource. This helps make sure that the Cluster service can recreate the appropriate application environment before bringing the application online on any node.
Note
- When configuring a Generic Application resource or Generic Service resource, you specify the application-specific configuration data that Checkpoint Manager monitors and copies. When determining which configuration information must be marked for checkpointing, focus on the information that must be available when the application starts.
Checkpoint Manager also supports resources that have application-specific registry trees (not just individual keys) that exist on the cluster node where the resource comes online. Checkpoint Manager watches for changes made to these registry trees when the resource is online (not when it is offline). When the resource is online and Checkpoint Manager detects that changes have been made, it creates a copy of the registry tree on the owner node of the resource and then sends a message to the owner node of the quorum resource, telling it to copy the file to the quorum resource. Checkpoint Manager performs this function in batches so that frequent changes to registry trees do not place too heavy a load on the Cluster service.
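The batching behavior can be illustrated with a small coalescing queue. This is a sketch under assumptions: the `CheckpointBatcher` class, the `copy_to_quorum` callback, and the registry paths are hypothetical, and how often the Cluster service actually flushes a batch is not stated in this section.

```python
class CheckpointBatcher:
    """Coalesces repeated changes so each registry tree is copied once per batch."""

    def __init__(self, copy_to_quorum):
        self.copy_to_quorum = copy_to_quorum   # hypothetical callback
        self.dirty = set()                     # trees changed since the last flush

    def on_change(self, registry_tree):
        self.dirty.add(registry_tree)

    def flush(self):
        """Copy every changed tree once, then start a new batch."""
        flushed = sorted(self.dirty)
        for tree in flushed:
            self.copy_to_quorum(tree)
        self.dirty.clear()
        return flushed

copies = []
batcher = CheckpointBatcher(copies.append)
for _ in range(3):
    batcher.on_change(r"SOFTWARE\ExampleApp\Settings")   # hypothetical key
batcher.on_change(r"SOFTWARE\ExampleApp\Licenses")       # hypothetical key
batcher.flush()
print(len(copies))  # 2
```

Four change notifications result in two copies, which is the load reduction that batching is meant to provide.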
Diagram and Description of Log Manager (for Quorum Logging)
The following figure shows how Log Manager works with other components when quorum logging is enabled (when a node is down).
Log Manager and Other Components Supporting Quorum Logging
When a node is down, quorum logging is enabled, which means Log Manager receives configuration changes collected by other components (such as Database Manager) and logs the changes to the quorum resource. The configuration changes logged on the quorum resource are then available if the entire cluster goes down and must be formed again. On the first node coming online after the entire cluster goes down, Log Manager works with Database Manager to make sure that the local copy of the configuration database is updated with information from the quorum resource. This is also true in a cluster forming for the first time — on the first node, Log Manager works with Database Manager to make sure that the local copy of the configuration database is the same as the information from the quorum resource.
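One way to picture how logged changes bring a configuration copy up to date is to replay them, in order, over the last saved copy. The following sketch assumes a simple key/value representation and a `None` marker for deletions; neither reflects the actual format of the quorum log, and `rebuild_configuration` is a hypothetical helper.

```python
def rebuild_configuration(checkpoint, quorum_log):
    """Start from a saved copy and replay the logged changes in order."""
    config = dict(checkpoint)
    for key, value in quorum_log:
        if value is None:
            config.pop(key, None)      # assumed convention for a logged deletion
        else:
            config[key] = value
    return config

# Hypothetical configuration entries and logged changes.
checkpoint = {"Groups/Web/Owner": "NodeA", "Resources/OldShare": "present"}
quorum_log = [
    ("Groups/Web/Owner", "NodeB"),
    ("Resources/OldShare", None),
    ("Resources/NewShare", "present"),
]
config = rebuild_configuration(checkpoint, quorum_log)
print(config)  # {'Groups/Web/Owner': 'NodeB', 'Resources/NewShare': 'present'}
```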
Diagram and Description of Event Log Replication Manager
Event Log Replication Manager, part of the Cluster service, works with the operating system’s Event Log service to copy event log entries to all cluster nodes. These events are marked to show which node the event occurred on.
The following figure shows how Event Log Replication Manager copies event log entries to other cluster nodes.
How Event Log Entries Are Copied from One Node to Another
The following interfaces and protocols are used together to queue, send, and receive events at the nodes:
- The Cluster API
- Local remote procedure calls (LRPC)
- Remote procedure calls (RPC)
- A private API in the Event Log service
Events that are logged on one node are queued, consolidated, and sent through Event Log Replication Manager, which broadcasts them to the other active nodes. If few events are logged over a period of time, each event might be broadcast individually, but if many are logged in a short period of time, they are batched together before broadcast. Events are labeled to show which node they occurred on. Each of the other nodes receives the events and records them in the local log. Event Log Replication Manager does not guarantee replication of events: if a problem prevents an event from being copied, Event Log Replication Manager is not notified of the problem and does not copy the event again.
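The queue, label, batch, and best-effort broadcast steps can be sketched as follows. The `EventReplicator` class, the `send` callback, and the batch size are assumptions made for illustration; the actual interfaces are the ones listed above.

```python
from collections import deque

class EventReplicator:
    def __init__(self, node_name, send, max_batch=3):
        self.node_name = node_name
        self.send = send              # hypothetical broadcast callback
        self.max_batch = max_batch    # assumed batch size
        self.queue = deque()

    def log_event(self, message):
        # Label each event with the node it occurred on.
        self.queue.append({"node": self.node_name, "message": message})

    def broadcast_pending(self):
        """Send queued events in batches; a failed batch is dropped, not retried."""
        batches_sent = 0
        while self.queue:
            size = min(self.max_batch, len(self.queue))
            batch = [self.queue.popleft() for _ in range(size)]
            try:
                self.send(batch)
                batches_sent += 1
            except Exception:
                pass                  # best effort: no notification, no retry
        return batches_sent

received = []
replicator = EventReplicator("NodeA", received.append)
for i in range(4):
    replicator.log_event(f"event {i}")
sent = replicator.broadcast_pending()
print(sent, [len(batch) for batch in received])  # 2 [3, 1]
```

Four events logged in quick succession go out as one batch of three and one single event, each still labeled with its originating node.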
Diagram and Description of Backup and Restore Capabilities in Failover Manager
The Backup and Restore capabilities in Failover Manager coordinate with other Cluster service components when a cluster node is backed up or restored, so that cluster configuration information from the quorum resource, and not just information from the local node, is included in the backup. The following figure shows how the Backup and Restore capabilities in Failover Manager work to ensure that important cluster configuration information is captured during a backup.
Backup Request on a Node That Does Not Own the Quorum Resource
The following table lists files that are in the cluster directory (systemroot\cluster, where systemroot is the root directory of the server’s operating system).
Cluster Service Files in Systemroot\Cluster

| File | Description |
|---|---|
| Cladmwiz.dll | Cluster Administrator Wizard |
| Clcfgsrv.dll | DLL file for Add Nodes Wizard and New Server Cluster Wizard |
| Clcfgsrv.inf | Setup information file for Add Nodes Wizard and New Server Cluster Wizard |
| Clnetres.dll | Resource DLL for the DHCP and WINS services |
| Clnetrex.dll | Extension DLL for the DHCP and WINS services |
| Cluadmex.dll | Extension DLL for core resource types |
| Cluadmin.exe | Cluster Administrator |
| Cluadmmc.dll | Cluster Administrator MMC extension |
| Clusres.dll | Cluster resource DLL for core resource types |
| Clussvc.exe | Cluster service |
| Debugex.dll | Cluster Administrator debug extension |
| Mqclus.dll | Resource DLL for Message Queuing |
| Resrcmon.exe | Cluster Resource Monitor |
| Vsstask.dll | Resource DLL for Volume Shadow Copy Service Task |
| Vsstskex.dll | Extension DLL for Volume Shadow Copy Service Task |
| Wshclus.dll | Winsock helper for the Cluster Network Driver |

The following table lists log files for server clusters.
Log Files for Server Clusters

| Log File | Folder Location | Description |
|---|---|---|
| cluster.log (default name) | systemroot\Cluster | Records the activity of the Cluster service, Resource Monitor, and resource DLLs on that node. The default name of this log can be changed by changing the System environment variable called ClusterLog. |
| cluster.oml | systemroot\Cluster | Records the creation and deletion of cluster objects and other activities of the Object Manager of the cluster; useful for a developer writing a tool for analyzing the translation of GUIDs to friendly names in the cluster. |
| clcfgsrv.log | systemroot\system32\LogFiles\Cluster | Records activity of Cluster configuration wizards; useful for troubleshooting problems during cluster setup. |
| clusocm.log | systemroot\system32\LogFiles\Cluster | Records cluster-related activity that occurs during an operating system upgrade. |
| cluscomp.log | systemroot\system32\LogFiles\Cluster | Records the activity that occurs during the compatibility check at the start of an operating system upgrade on a cluster node. |

The following table lists files that are in systemroot\system32, systemroot\inf, or subfolders in systemroot\system32.
Additional Cluster Service Files

| File | Folder | Description |
|---|---|---|
| clusapi.dll | systemroot\system32 | Server Cluster API |
| clusocm.dll | systemroot\system32\Setup | Cluster extension for the Optional Component Manager |
| clusocm.inf | systemroot\inf | Cluster INF file for the Optional Component Manager |
| clussprt.dll | systemroot\system32 | A DLL that enables the Cluster service on one node to send notice of local cluster events to the Event Log service on other nodes |
| cluster.exe | systemroot\system32 | Cluster command-line interface |
| msclus.dll | systemroot\system32 | Cluster Automation Server |
| Resutils.dll | systemroot\system32 | Utility routines used by resource DLLs |
| Clusnet.sys | systemroot\system32\drivers | Cluster Network Driver |
| Clusdisk.sys | systemroot\system32\drivers | Cluster Disk Driver |

The following table lists files related to the quorum resource. In a single quorum device cluster (the most common type of cluster), these files are usually in the directory q:\mscs, where q is the quorum disk drive letter and mscs is the name of the directory.
Files Related to the Quorum Resource

| File | Description |
|---|---|
| Quolog.log | The quorum log, which contains records of cluster actions that involve changes to the cluster configuration database. |
| Chk*.tmp | Copies of the cluster configuration database (also known as checkpoints). Only the latest one is needed. |
| {GUID} | Directory for each resource that requires checkpointing; the resource GUID is the name of the directory. |
| {GUID}\*.cpt | Resource registry subkey checkpoint files. |
| {GUID}\*.cpr | Resource cryptographic key checkpoint files. |