MS Exchange 2000 Clustering

Updated: June 14, 2001

Mark Wistrom

Microsoft Corporation

May 2000

Summary: Presents an overview of the Microsoft Exchange 2000 Cluster service and helps administrators and developers understand how Exchange 2000 implements Active/Active clustering. (12 printed pages)

Contents

Introduction

Overview
Exchange 2000 Cluster Awareness
Setting Up Windows 2000 Clustering
Exchange Virtual Servers
Exres.dll
Exchange Resources
Architecture
Performance
Conclusion

Introduction

E-mail has become a mission-critical application. Businesses rely on e-mail for round-the-clock communications with both internal and external customers. If e-mail is unavailable, money is lost.

Maintaining high availability in any e-mail system can only be managed at the enterprise level. While there is no single feature that will provide 100 percent uptime, the robust and stable Microsoft® Exchange 2000 platform provides new clustering functionality that will reduce both planned and unplanned downtime.

Clustering is one of the tools that experienced administrators can use to attain round-the-clock availability. However, while it makes data easily available, it does not protect the data itself. This document provides information that will help administrators and developers understand how Exchange 2000 implements Active/Active clustering and how it works, as well as how it affects their systems. This paper is intended as an overview and no performance data or recommended configurations are given.

Overview

It is important to have a firm understanding of the Microsoft Windows® 2000 Cluster service because it works directly with Exchange 2000. There are many documents available for review in the MSDN Library that discuss the Windows 2000 Cluster service. The rest of this paper assumes that you are familiar with Windows 2000 Clustering. Listed below are a few of the topics addressed.

  • Exchange Cluster Awareness

  • Windows Clustering Setup

  • Exchange Virtual Servers

  • Exchange Resources

  • Architecture

  • Performance

The main features of the Windows 2000 Cluster service, with respect to Exchange 2000, are:

  • Shared-nothing architecture. All nodes in the cluster can reach the shared data, but only one node controls it at a time. Transferring control of the shared data to another node takes seconds, and no one can access the data during that time. Because the Windows 2000 Cluster service does not allow dynamic load balancing, neither does Exchange 2000.

  • Resource DLL. Windows 2000 communicates with resources in a cluster through a resource DLL. Exchange 2000 provides its own custom resource DLL, named exres.dll. Communication between the Windows 2000 Cluster service and Exchange 2000 has been enhanced and customized to provide all Cluster service functionality.

  • Resources. Exchange 2000 provides its own resources for use in clustering. It also uses some resources provided by Windows 2000, such as the disk, the IP address, and the Network Name resource. Each resource has properties that the Windows 2000 Cluster service uses to manage it. These properties affect the Exchange 2000 resources as well, but Exchange 2000 does not own them.

  • Cluster groups. Exchange 2000 uses the cluster groups as Exchange Virtual Servers (EVS). Understanding how Windows 2000 uses cluster groups and their properties will help the user understand how Exchange 2000 works in a clustered environment.

All the restrictions and recommendations that apply to Windows 2000 clusters also apply to Exchange 2000 when running on a Windows 2000 cluster. These restrictions and recommendations include hardware compatibility and deployment guides.

Exchange 2000 Cluster Awareness

Exchange 2000 is fully cluster-aware. Many changes were made to Exchange 2000 so that multiple EVSs could run on the same node (Active/Active). The major changes are:

  • Exchange 2000 now allows multiple storage groups and protocol virtual servers to exist on a single server. Exchange 2000 clustering uses these features to implement Active/Active clustering as discussed later.

  • Exchange 2000 has its own resource DLL, exres.dll.

  • All Exchange components must be aware that the server network name and the node network name are different, because the machine name that runs the EVS will change as the EVS is failed over to other nodes.

The first two changes are discussed throughout the rest of the paper. The third change is a general requirement of every component that runs on a cluster and is not addressed here. Nevertheless, Exchange 2000 on a cluster is very similar to Exchange 2000 as a stand-alone server. After you understand how Exchange 2000 works in a cluster, many of the concepts that apply to a stand-alone server can easily be understood in a cluster.

Setting Up Windows 2000 Clustering

The Windows 2000 Cluster service must be set up and functional before Exchange 2000 is installed on the cluster. The account with which the cluster was created must have administrator rights on all nodes and be an account in the domain. This account must also be a member of the delegated Exchange administrators group. The Exchange 2000 Installation Wizard generates a message to inform the user that it is being installed on a cluster.

Figure 1: Exchange 2000 Installation Wizard dialog box

Setup then places the Exchange 2000 binaries on the local drive of the node but does not create shared data directories. Setup also creates the Exchange-specific cluster resource types. The node must be rebooted after setup is complete.

The Exchange Virtual Servers, discussed in the next section, are created in the cluster administrator after rebooting. In the cluster administrator, select a cluster group and create the IP address, Network Name resource, and Disk resource (for the shared data) if they are not already there. Next, create the System Attendant resource, which is dependent on the Disk and Network Name resources. After the System Attendant resource is created, Exchange creates the rest of the required resources. The cluster group is now an Exchange Virtual Server.
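The creation order described above follows from the resource dependencies. As a rough illustration only (this is not Cluster service code), the following Python sketch models the dependencies named in this section and derives one order in which the resources could be brought online. The dictionary keys are informal labels rather than real resource identifiers, and the dependency of the Network Name on the IP address is the usual Windows 2000 arrangement, not something this paper states.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each resource maps to the set of resources it depends on.
# The labels are illustrative, not real cluster resource identifiers.
DEPENDENCIES = {
    "IP Address": set(),
    "Disk": set(),
    "Network Name": {"IP Address"},  # assumption: usual Windows arrangement
    "System Attendant": {"Disk", "Network Name"},  # stated in the text
}


def online_order(dependencies):
    """Return one valid order in which the resources can be brought online."""
    return list(TopologicalSorter(dependencies).static_order())


order = online_order(DEPENDENCIES)
print(order)
```

Whatever order the sort picks for the independent resources, the System Attendant always comes last, which matches the instruction to create it only after the Disk and Network Name resources exist.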

Exchange Virtual Servers

Exchange Virtual Servers (EVS) are an important concept in an Exchange 2000 cluster. An EVS acts as a stand-alone server. Clients connect to the EVS just as they do to a stand-alone server. An EVS is a cluster group that has the following four elements.

  • Disk resources on the shared storage

  • A static IP address for the EVS

  • A Network Name resource for the EVS

  • Exchange 2000 resources

    Figure 2: Cluster Administrator

User data such as the private and public databases and log files, Simple Mail Transfer Protocol (SMTP) queues, content indexing databases, and message tracking logs go on the disk resource for that EVS. If the storage groups for the EVS are configured so that the logs are on one set of drives and the databases are on another, then all drives used must be in the EVS. This data must go on the shared storage so that if an EVS is moved to another node, the EVS can still access the data. The static IP address and the Network Name resource are Windows 2000 resources and are used by the clients to connect to the EVS. The Network Name resource is the name of the EVS to which the clients connect.

Because an EVS is a cluster group, all properties such as preferred owner, failover, and failback policies are set on the EVS. The EVS is the basic unit of failover. If a resource fails in the EVS, then the Cluster service tries to restart the resource. If the resource fails multiple times, then the Cluster service moves the entire EVS to another node. The same is true for a planned failover; the finest granularity of failover is the entire EVS.
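The restart-then-failover policy described above can be sketched as a small decision function. This is a simplified model, not the Cluster service algorithm: the names ResourceState and handle_failure and the threshold of three restarts are hypothetical, and on a real cluster the limits come from the retry properties of the resource and the failover policy of the group.

```python
from dataclasses import dataclass


@dataclass
class ResourceState:
    """Tracks how often one resource in an EVS has failed (hypothetical model)."""
    name: str
    failures: int = 0


def handle_failure(resource, restart_threshold=3):
    """Decide what happens after a resource in an EVS fails.

    restart_threshold is an illustrative value, not a documented default.
    """
    resource.failures += 1
    if resource.failures <= restart_threshold:
        return "restart resource"
    # The unit of failover is the whole EVS, never a single resource.
    return "move entire EVS to another node"


smtp = ResourceState("SMTP")
actions = [handle_failure(smtp) for _ in range(4)]
print(actions)
```

In this model the first three failures lead to a restart and the fourth moves the whole EVS, which reflects the point that a single failing resource eventually takes every resource in its group with it.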

Exres.dll

Exres.dll is the Exchange-specific resource DLL. The Cluster service communicates through a resource monitor to exres.dll, which in turn communicates with the proper Exchange components. Exres.dll performs actions such as bringing resources online and offline, checking resources with IsAlive calls, and reporting failures.

In a cluster, the Cluster service is responsible for starting and stopping services through exres.dll. The administrator should not stop a service from the command line, because in that case the IsAlive call fails and the Cluster service attempts to bring the service back online.

Excluadm.dll provides cluster-specific wizards and user interface (UI) associated with Exchange.

Figure 3: The Cluster service communicating with exres.dll

Exchange Resources

This section covers in detail the Exchange-specific resources. An Exchange Virtual Server (EVS) is a collection of Exchange resources. Each resource has all of the properties that a Windows 2000 resource has, such as dependencies, possible owners, and retry properties. Each resource in the EVS represents a different component of Exchange. The possible Exchange-specific resources are:

  • System Attendant

  • Information Store

  • Message Transfer Agent

  • Protocols

    • SMTP

    • HTTP

    • IMAP

    • POP3

  • Routing

  • MSSearch

For each resource, the IsAlive and LooksAlive calls that the Cluster service makes into that resource are identical.

The resources have dependencies in the EVS as shown in the following figure.

Figure 4: Exchange resources dependencies

System Attendant

The default dependencies shown in the preceding figure are created when the System Attendant is created. The System Attendant is the fundamental resource that controls the creation and deletion of all the resources in the EVS. To create the correct resources, create the System Attendant as described in "Setting Up Windows 2000 Clustering" earlier in this document. To delete the server and its object from Active Directory™, delete the System Attendant. The IsAlive call to the System Attendant checks the Service Control Manager to see if the System Attendant is running.

Information Store

When the Information Store resource enters the online pending state, the Information Store service starts and begins to mount the storage groups. When all the storage groups are mounted and the store has replayed any outstanding transaction logs, the resource is online. The IsAlive call to the Information Store checks the Service Control Manager to see if the Information Store is running.

Message Transfer Agent

The Message Transfer Agent (MTA) resource is Active/Passive. There can be only one MTA per cluster. The MTA is created in the first Exchange Virtual Server (EVS). If the EVS with the MTA is not the last EVS in the cluster and it is deleted, the MTA will be moved to another EVS in the cluster. Although the MTA is Active/Passive, it will serve all EVSs in the cluster as long as it is online. The IsAlive call to the MTA checks the Service Control Manager to see if the MTA is running.

Protocols (SMTP, POP3, IMAP, HTTP)

The IsAlive call acts in the same way for all of the protocols. Exres.dll makes a call to the protocol and looks for the response banner. If the banner does not return within the time-out period, the Cluster service assumes that the protocol virtual server is unavailable and the IsAlive call fails. No protocol can be set to reject all connections from all servers, or the protocol virtual server will reject its own IsAlive calls. Each protocol virtual server must accept connections from its own IP address.
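The banner check can be approximated from any client with a plain TCP connection. The following Python sketch is an illustration of the idea and not how exres.dll is implemented; the function name banner_responds and the default time-out are assumptions. It suits protocols that greet the client on connect (SMTP, POP3, IMAP). HTTP sends nothing until it receives a request, so an HTTP probe would have to send one first, which this sketch does not do.

```python
import socket


def banner_responds(host, port, timeout=5.0):
    """Return True if the service sends any banner bytes within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            # The time-out passed above also applies to this recv call.
            return len(conn.recv(256)) > 0
    except OSError:
        # Covers refused connections and time-outs (socket.timeout is an OSError).
        return False
```

A call such as banner_responds("192.0.2.10", 25), using a placeholder address, would probe an SMTP virtual server. If that virtual server were configured to reject connections from its own IP address, the equivalent check made from the node itself would fail even though the service is healthy, which is the situation the paragraph above warns against.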

Figure 5: Protocol virtual server

POP3, IMAP, and SMTP use the default protocol virtual servers that are installed by Windows 2000. The HTTP protocol leaves the default protocol virtual server and creates a second protocol virtual server.

When any EVS is brought offline (as in a planned failover), all instances of the SMTP protocol virtual server on the node are brought offline and quickly restarted. The SMTP resource will not restart automatically if the "do not restart" option is selected on the properties page.

Routing

The IsAlive call to the Routing resource checks the Service Control Manager to see if the Routing service is running.

Content Indexing

The MSSearch resource provides content indexing for the EVSs. The IsAlive call to MSSearch returns a pointer to the data structure for the database that it is indexing. If the pointer is valid, then the resource is working correctly. To re-create the MSSearch resource after it has been deleted, you must delete and re-create the Information Store resource for that EVS.

Architecture

This section describes in more detail how Active/Active clustering is implemented. As stated earlier, multiple storage groups and protocol virtual servers are used to make Exchange 2000 Active/Active cluster-aware. Consider the cluster node in the following figure. There is one EVS on the node, and this EVS has one storage group associated with it. The protocols are responsible for the IP address and Network Name of the EVS, not of the node itself. If another EVS in the cluster fails over to this node, then the store.exe process simply mounts the storage groups associated with the new EVS (in this case two storage groups) and creates more protocol virtual servers to respond to the IP address and Network Name resource of the second EVS.

Figure 6: Cluster nodes

A stand-alone server cannot have more than four storage groups mounted and active at one time. The same limit of four storage groups applies to a single node of the cluster, so you must keep track of the number of storage groups in the cluster. No matter how many EVSs are failed over to a single node, store.exe can mount no more than four storage groups. The following table lists all of the possible combinations of the number of EVSs in the cluster and the number of storage groups per EVS.

Table 1: Possible combinations of number of EVSs in a cluster and number of storage groups per EVS

Number of EVSs     Possible number of storage groups for each EVS
in the cluster     EVS1    EVS2    EVS3    EVS4

1                  1
                   2
                   3
                   4
2                  1       1
                   2       1
                   3       1
                   2       2
3                  1       1       1
                   2       1       1
4                  1       1       1       1
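The rows of Table 1 follow directly from the four-storage-group limit: every EVS needs at least one storage group, and the total across all EVSs must not exceed four so that a single node can host them all after failover. The Python sketch below enumerates the combinations under that reading. The function names are illustrative and are not part of any Exchange tool.

```python
MAX_STORAGE_GROUPS_PER_NODE = 4  # limit stated in the text


def storage_group_layouts(limit=MAX_STORAGE_GROUPS_PER_NODE):
    """Yield (evs_count, layout) pairs.

    layout lists the storage groups per EVS, largest first, and its total
    never exceeds `limit`.
    """
    def partitions(total, parts, largest):
        # Split `total` into exactly `parts` positive integers, non-increasing.
        if parts == 0:
            if total == 0:
                yield ()
            return
        for first in range(min(total - (parts - 1), largest), 0, -1):
            for rest in partitions(total - first, parts - 1, first):
                yield (first,) + rest

    for evs_count in range(1, limit + 1):
        for total in range(evs_count, limit + 1):
            for layout in partitions(total, evs_count, total):
                yield evs_count, layout


def fits_on_one_node(layout, limit=MAX_STORAGE_GROUPS_PER_NODE):
    """True if every EVS in `layout` could fail over to a single node."""
    return sum(layout) <= limit


for evs_count, layout in storage_group_layouts():
    print(evs_count, layout)
```

The loop prints the same 11 combinations as the table. A layout such as (3, 2) is absent because fits_on_one_node((3, 2)) is False: if both EVSs landed on one node, store.exe would have to mount five storage groups.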

Performance

A clustered Exchange 2000 server acts very much like a stand-alone server with the same number of storage groups and protocol virtual servers. There are a few small differences. First, there are periodic IsAlive calls into the different components to check their status. These calls cause very little overhead. Second, each EVS acts as if it were a stand-alone server, which causes a slight difference in the way that messages are routed between EVSs. All messages routed from a user on one EVS to a user on another EVS are transported by SMTP.

The failover time for an EVS is important. To maintain high availability, the time must be short. There are two different scenarios, planned and unplanned failover.

In the planned case, failover proceeds as follows:

  • The information store unmounts the storage groups and stops the protocol virtual servers.

  • The resources are failed over.

  • The information store on the other node mounts the storage groups, and Exchange brings up the protocol virtual servers that respond to the IP address of the EVS that was moved.

In the unplanned case, failover proceeds as follows:

  • The Cluster service decides that the other node is not available.

  • The surviving node mounts the databases that failed over and plays through the transaction log files to synchronize the databases.

  • Exchange brings up the protocol virtual servers that respond to the IP address of the EVS that failed over.

Conclusion

This article discusses the Exchange 2000 Cluster service and gives an in-depth overview that will help administrators and developers understand how Exchange 2000 implements Active/Active clustering, and how it affects their systems. Administrators and developers can use this information to help maintain a robust and stable Exchange 2000 platform that reduces both planned and unplanned downtime.