Exchange Clustering Concepts
Topic Last Modified: 2006-02-09
By Nino Bilic
This article covers some basic concepts about clustering and how clustering relates to Microsoft® Exchange Server. The article's main purpose is to improve your understanding of clustering.
The Cluster service is a Microsoft Windows® service that can be used on certain versions of the Windows operating systems. Clustering is available in Microsoft Windows Server™ 2003, Enterprise Edition, Windows Server 2003, Datacenter Edition, Windows 2000 Advanced Server, and Windows 2000 Datacenter Server operating systems. Windows NT® Server 4.0, Enterprise Edition had clustering support starting with Service Pack 3 (SP3), but this article does not provide details about Windows NT Server 4.0 clustering.
Troubleshooting Exchange Server issues on the cluster server follows the same principles, in general, as troubleshooting non-clustered servers running Exchange Server.
The following sections provide a high-level overview of clustering concepts.
Suppose you have a two-node cluster. The same principles apply to clusters of more than two nodes, but complexity increases with the number of nodes. So, in this example, start with Node A and Node B.
In the Microsoft implementation of clustering, Node A and Node B both have to be connected to some sort of shared storage. This shared storage must be on a SCSI bus, either as direct-attached storage or a Storage Area Network (SAN). In a properly working cluster, only one node can have full access to any one shared disk at a time. So, if Node A owns the shared storage, Node B will not be able to see the same disk. This mode, where you do not share resources between nodes, is called a Shared Nothing clustering model.
A shared disk is essentially a resource for the Cluster service. That resource belongs to one of the resource groups.
Resource: A resource is the single unit that can be administered or managed on the cluster. Cluster resources include physical hardware devices such as disk drives and network cards, and logical items such as Internet Protocol (IP) addresses, applications, and application databases. Each node in the cluster will have its own local resources, like a stand-alone server. However, the cluster also has common resources, such as a common data storage array and private cluster network. These common resources are accessible by each node in the cluster. A resource can be either online or offline. A resource is online when it is available and providing its service to the cluster.
Resources are physical or logical entities that have the following characteristics:
Can be brought online and taken offline.
Can be managed in a server cluster.
Can be owned by only one node at a time.
Resources can (and sometimes must) be dependent on each other. For example, the Microsoft Exchange Information Store (MSExchangeIS) resource is dependent on the Microsoft Exchange System Attendant (MSExchangeSA) resource. If the MSExchangeSA resource goes offline, the MSExchangeIS resource will go offline too, because MSExchangeIS cannot run without MSExchangeSA. Note that the same dependency exists on a non-clustered server. Note also that resources cannot be dependent on resources that are created in a different resource group.
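The dependency rules above can be sketched in a few lines of Python. This is a minimal model, not the Cluster service API: the Resource class and the bring_online and take_offline helpers are hypothetical names introduced only for illustration, and the group names are examples.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Resource:
    """Hypothetical stand-in for a cluster resource (not a Cluster API type)."""
    name: str
    group: str
    depends_on: list["Resource"] = field(default_factory=list)
    online: bool = False

    def add_dependency(self, other: "Resource") -> None:
        # Dependencies may not cross resource-group boundaries.
        if other.group != self.group:
            raise ValueError(f"{self.name} and {other.name} are in different groups")
        self.depends_on.append(other)


def bring_online(resource: Resource) -> None:
    # A resource comes online only after everything it depends on is online.
    for dep in resource.depends_on:
        bring_online(dep)
    resource.online = True


def take_offline(resource: Resource, all_resources: list[Resource]) -> None:
    # Dependents are taken offline first, then the resource itself.
    for other in all_resources:
        if resource in other.depends_on and other.online:
            take_offline(other, all_resources)
    resource.online = False


sa = Resource("MSExchangeSA", "Exchange Group")
store = Resource("MSExchangeIS", "Exchange Group")
store.add_dependency(sa)

bring_online(store)                # brings MSExchangeSA online first
take_offline(sa, [sa, store])      # MSExchangeIS goes offline with it
print(sa.online, store.online)
```

In this model, trying to make MSExchangeIS depend on a resource in another group (for example, a SQL Server group) raises an error, which mirrors the rule that dependencies stay inside one resource group.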
Resource group: A resource group is a collection of resources managed by the Cluster service as a single logical unit. Application resources and cluster entities can be easily managed by grouping logically related resources into a resource group. When a Cluster service operation is performed on a resource group, the operation affects all individual resources contained within the group. Typically, a resource group is created to contain all the elements needed by a specific application server and client for successful use of the application.
For example, on cluster servers that run Exchange Server, you will have a Microsoft Exchange resource group that will contain Exchange resources like MSExchangeIS, MSExchangeSA, network name, IP address, and disk. Everything that a stand-alone server would usually have would be in this resource group. Note that the name of the group is not hard-coded, so your Exchange group might be called something else, depending on what you name it when it is created. The main point is that the group holds all Exchange resources.
Additionally, note that the group is the smallest unit that can be moved between cluster nodes or can fail over to another cluster node. By default, the failure of one resource in a group may affect the whole group: if a resource fails a specific number of times within a set period, the whole group is moved to another node. The default is three times in 15 minutes.
This is an important point, because moving the resource group to a different cluster node takes time, and clients will have an interruption in service during that time. This is why it is crucial that the resources for different applications have separate resource groups, because you would not want the failure of one resource to affect another resource. For example, if you have a Microsoft SQL Server™ group and an Exchange Server group, the failure of one would not affect the other.
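The "fails a specific number of times within a period" rule can be pictured as a sliding-window counter. The sketch below is an assumption-laden simplification: the RestartPolicy class and its return values are hypothetical, it does not reproduce the Cluster service's actual restart logic, and the threshold and period in the usage lines are illustrative values passed in by the caller.

```python
from collections import deque


class RestartPolicy:
    """Hypothetical sliding-window failure counter; names are illustrative."""

    def __init__(self, threshold: int, period_seconds: float):
        self.threshold = threshold
        self.period_seconds = period_seconds
        self._failures = deque()

    def record_failure(self, now: float) -> str:
        # Forget failures that fall outside the sliding window.
        while self._failures and now - self._failures[0] >= self.period_seconds:
            self._failures.popleft()
        self._failures.append(now)
        # Reaching the threshold inside the window moves the whole group.
        if len(self._failures) >= self.threshold:
            self._failures.clear()
            return "failover_group"
        return "restart_resource"


# Illustrative values: three failures inside a 900-second window.
policy = RestartPolicy(threshold=3, period_seconds=900)
actions = [policy.record_failure(t) for t in (0, 300, 600)]
print(actions)
```

Because a group failover interrupts every resource in the group, a counter like this is one more reason to keep unrelated applications in separate groups.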
The cluster is administered using a tool called Cluster Administrator (Cluadmin.exe). The following is an example of that interface.
One main function of the Cluster service is to ensure that all nodes in the active cluster membership have a consistent view of the configuration database. Because nodes are actual physical computers, each has its own installation of the Windows operating system, and therefore its own registry. The Cluster service is responsible for making sure that the registries on all nodes of the cluster are synchronized properly, and that all changes are also recorded to the quorum.
On a clustered server, there will be a registry hive called Cluster, located under HKEY_LOCAL_MACHINE.
This is where the Cluster service keeps the cluster configuration information. This specific part of the registry is what is replicated between cluster nodes through a process called Global Update. When a change is made to the cluster configuration, for example when a new resource is created, the Cluster service makes sure that the change is replicated to all or none of the nodes in the cluster. That way, all nodes either know about the change, or the change is rolled back if there are problems with updating one of the nodes.
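The all-or-none outcome described above can be illustrated with a small Python model. This is only a sketch of the outcome, not of the actual Global Update protocol: the Node class, the global_update function, and the configuration key are hypothetical, and an unreachable node stands in for any problem with updating a node.

```python
import copy


class Node:
    """Hypothetical node holding its own copy of the cluster configuration."""

    def __init__(self, name: str, reachable: bool = True):
        self.name = name
        self.reachable = reachable
        self.config = {}

    def apply(self, key: str, value: str) -> None:
        if not self.reachable:
            raise ConnectionError(f"{self.name} did not accept the update")
        self.config[key] = value


def global_update(nodes, key: str, value: str) -> bool:
    """Apply a change to every node, or roll back so that no node keeps it."""
    snapshots = [(node, copy.deepcopy(node.config)) for node in nodes]
    try:
        for node in nodes:
            node.apply(key, value)
    except ConnectionError:
        # Restore every node to its state from before the update.
        for node, saved in snapshots:
            node.config = saved
        return False
    return True


node_a, node_b = Node("NodeA"), Node("NodeB")
print(global_update([node_a, node_b], "Resources/Example", "created"))   # True

node_b.reachable = False
print(global_update([node_a, node_b], "Resources/Another", "created"))   # False
print("Resources/Another" in node_a.config)                              # False
```

The second call fails on Node B, so the change already applied to Node A is undone and both nodes still agree on the configuration.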
Each cluster also has a special resource called the quorum disk resource.
The quorum disk resource is a resource, usually a physical disk resource, that has been configured to maintain the quorum log and the cluster database checkpoints, which together comprise the configuration data necessary for recovery of the cluster.
When a change in cluster configuration is made, the Cluster service makes sure to let all other nodes in the cluster know about the change, but it also makes sure that the current owner of the quorum disk resource updates the quorum log on the quorum drive. This log is a set of transactions that shows different changes in cluster configuration. This is essential for the existence of the cluster. If the quorum drive cannot be accessed or is damaged, the Cluster service will not start.
The following graphic shows how this would look in a two-node cluster. There is one quorum disk resource per cluster, and only one node has access to it at a time. That node, the quorum disk resource owner at that time, is the one responsible for updating the quorum log on the quorum drive.
Updates to cluster configuration get replicated between nodes through the process called Global Update. But what if you have an application that does not write its configuration into the cluster registry? For example, Exchange Server and SQL Server write their information elsewhere in the registry, in locations such as HKLM\System\CurrentControlSet\Services. There has to be a mechanism that will replicate any changes made under those keys, for example, the level of diagnostics logging on Exchange components. That is where checkpoints are used. When the resource comes online on another node, for example Node B, it must have the same registry information to work with as it did on the previous node. The Checkpoint Manager component of the Cluster service makes this happen.
Checkpoints are written on the quorum drive in the MSCS folder. There will be a folder named by the resource GUID. Also, because every resource has an entry in the cluster registry, you will be able to see exactly what registry paths are being replicated. For example, the following shows the registry keys that will be replicated for a demonstration MSExchangeSA resource. Note that registry paths are listed in the right pane of the Registry Editor window.
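The checkpoint idea can be sketched with plain dictionaries standing in for each node's registry and for the quorum drive. Everything named here is hypothetical: the ExampleService path, the LoggingLevel value, the GUID string, and the save_checkpoint and restore_checkpoint helpers are illustrations, not the Checkpoint Manager's real interface or on-disk format.

```python
import copy

# Hypothetical in-memory stand-ins for this sketch; none of these names
# come from the Cluster service itself.
EXAMPLE_PATH = r"SYSTEM\CurrentControlSet\Services\ExampleService"


def save_checkpoint(quorum: dict, resource_guid: str, registry: dict, paths: list) -> None:
    """Copy the watched registry paths to the quorum store, keyed by resource GUID."""
    quorum[resource_guid] = {p: copy.deepcopy(registry.get(p, {})) for p in paths}


def restore_checkpoint(quorum: dict, resource_guid: str, registry: dict) -> None:
    """Before the resource comes online on a node, replay its checkpoint there."""
    for path, values in quorum.get(resource_guid, {}).items():
        registry[path] = copy.deepcopy(values)


quorum_drive = {}
node_a = {EXAMPLE_PATH: {"LoggingLevel": "Maximum"}}   # setting changed on Node A
node_b = {EXAMPLE_PATH: {"LoggingLevel": "None"}}      # stale copy on Node B

save_checkpoint(quorum_drive, "example-guid", node_a, [EXAMPLE_PATH])
restore_checkpoint(quorum_drive, "example-guid", node_b)
print(node_b[EXAMPLE_PATH]["LoggingLevel"])
```

After the restore, Node B sees the same setting that was made on Node A, which is the behavior the resource needs when it moves between nodes.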
When Exchange Server is installed on the cluster server, Setup will by default only copy the Exchange Server binaries to the hard drive and make some modifications in the cluster registry. Even after you install Exchange Server to all nodes in the cluster, you will still not see any changes in Exchange System Manager that you would see on a non-clustered server. Non-clustered Setup creates the Exchange server object in the Active Directory® directory service, but clustered Setup does not.
When Exchange Server is installed on all nodes, the MSExchangeSA resource must be created manually in the Exchange resource group. When the MSExchangeSA resource is created, it will automatically create all other Exchange resources for you, such as MSExchangeIS, message transfer agent (MTA), and HTTP virtual server. That is when the server object is added to the configuration partition of Active Directory, so you will be able to see the server object from Exchange System Manager after this point.
During the MSExchangeSA resource setup, one of the steps is to provide the location for Exchange databases. It is vital to provide the path to the shared disk resource in this step.
All those Exchange resources that now have been created are usually called the Exchange Virtual Server (EVS). The following example shows how that will look in Cluster Administrator.
Microsoft Office Outlook® 2003 clients should now connect to the name of the EVS, rather than the name of an individual cluster node. The EVS name can be found by checking the Properties of the Network Name resource in the Exchange Group. If clients connect to the EVS name, the name that clients access is always the same, no matter which cluster node currently owns the Exchange Group. That is the point of clustering in Exchange Server. In the case of a cluster node failure, one of the other cluster nodes will take ownership of the EVS, and therefore clients will be able to connect to the same Exchange server. If clients were connecting to Node A instead of the EVS, for example, and Node A then failed, Exchange services for those clients would be unavailable.
The following illustration shows the concept of virtual server and how it relates to actual cluster nodes. Users open their Outlook clients and connect over the network to the Exchange virtual server that is currently owned by Node A. Node A is the node that has control over the shared disk that contains Exchange 2000 Server databases.
For example, if Node A has a hardware failure, Node B realizes that Node A has failed, and then takes ownership of the Exchange virtual server and all associated resources. Node B takes the ownership of network name, IP address, disk, system attendant, and all other Exchange resources. Users that are logging on with their Outlook clients do not know about the failure, because they are connecting to EVS and not a particular physical server. They do not care which node currently owns the virtual server.
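The difference between connecting to the EVS name and connecting to a node name can be shown with a toy model. The Cluster class, the EVS1 name, and the choice of the first surviving node as the new owner are all assumptions made for this sketch; real failover involves moving every resource in the group, which is not modeled here.

```python
class Cluster:
    """Hypothetical two-node cluster that tracks which node owns the EVS."""

    def __init__(self, nodes, evs_name: str, owner: str):
        self.alive = {node: True for node in nodes}
        self.evs_name = evs_name
        self.owner = owner

    def fail_node(self, node: str) -> None:
        self.alive[node] = False
        if self.owner == node:
            survivors = [n for n, up in self.alive.items() if up]
            if not survivors:
                raise RuntimeError("no node left to own the virtual server")
            # A surviving node takes ownership of the virtual server.
            self.owner = survivors[0]

    def connect(self, name: str) -> str:
        # Clients that use the EVS name reach whichever node owns it now.
        if name == self.evs_name:
            return self.owner
        # Clients that use a node name depend on that one node being up.
        if self.alive.get(name):
            return name
        raise ConnectionError(f"{name} is unavailable")


cluster = Cluster(["NodeA", "NodeB"], evs_name="EVS1", owner="NodeA")
before = cluster.connect("EVS1")
cluster.fail_node("NodeA")
after = cluster.connect("EVS1")
print(before, after)
```

A client that had been configured with the name NodeA would get a connection error after the failure, while the client using the EVS name is served by Node B.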
Consider the following questions and answers:
Question: What is Active/Active and Active/Passive?
Answer: Active/Active means that there are two or more Exchange virtual servers running on a two-node cluster. Both nodes can run one or more virtual servers at the same time. Active/Passive means that there is only one Exchange virtual server and one of the nodes will not run Exchange when the other node does. Note that Active/Active clustering has a number of issues. As a result, we do not recommend it as a scalable solution. For more information, see the For More Information section.
Question: How do you start or stop Exchange Server components on the cluster?
Answer: When administering clustered Exchange Server, always use the Cluster Administrator rather than the Services program, unless specifically instructed to do otherwise by a specific Microsoft Knowledge Base article. Using the Services program instead of Cluster Administrator can cause unpredictable results.
Troubleshooting Exchange Server on a cluster is in most cases the same as troubleshooting Exchange Server on a non-clustered server. Any issues that do not directly involve manipulating Exchange resources can be resolved the same way on the cluster as on the non-cluster.
As a general rule, if Exchange Server resources are having a problem, for example, MSExchangeSA or MSExchangeIS are not coming online, the issue most likely belongs to Exchange Server. If one of the core cluster resources is not coming online, for example, a network name or disk, the problem should be dealt with as a Windows platform cluster issue. Failover problems usually follow the same guideline.
Using the cluster log helps to determine where the problem is. For more information, see the For More Information section.
For more information, see the following Exchange resources and Microsoft Knowledge Base article: