Planning Considerations for Clustering

 

The following considerations are important when planning for Exchange 2003 clusters. These considerations apply to Exchange 2003 clusters on Windows Server 2003, Enterprise Edition; Windows Server 2003, Datacenter Edition; Windows 2000 Advanced Server; and Windows 2000 Datacenter Server:

  • Dedicating computers to Exchange

  • Cluster storage solutions

  • Performance and scalability considerations

  • Cluster hardware compatibility

  • Geographically dispersed clustering

  • Disaster recovery strategies for clusters

The following sections discuss these considerations in detail.

Dedicating Computers to Exchange

In addition to Exchange 2003, your server clusters can run other applications. However, if you run multiple applications on the same node, the performance of your Exchange Virtual Servers (EVSs) can be affected. When deciding whether to dedicate computers to only Exchange, consider the following:

  • If you use a cluster for more than one application, consider dedicating a node for each application and make sure that enough passive nodes are available.

  • If you use clusters to provide Exchange services to your users, it is recommended that you run only Exchange 2003 on your clusters and run other applications on separate hardware.

  • For best results, an EVS should not fail over to an active node that runs another application.

  • Exchange 2003 cluster nodes must be member servers of a domain. Exchange 2003 clusters do not support cluster nodes as domain controllers or global catalog servers.

For more information about the performance of Exchange 2003 clusters, see "Managing Exchange Clusters" in the Exchange Server 2003 Administration Guide.

Cluster Storage Solutions

A detailed discussion about selecting a cluster storage solution is beyond the scope of this guide. However, this section provides general recommendations and strategies for implementing a cluster storage solution.

Most of the best practices that apply to stand-alone (non-clustered) servers apply to clustered servers as well (for example, RAID solutions and SAN solutions). For detailed information about Exchange storage solutions, see Planning a Reliable Back-End Storage Solution.

For detailed information about selecting a cluster storage method in Windows Server 2003, see Choosing a Cluster Storage Method.

Separate Hard Disks for Log Files

If the storage groups for an EVS are configured so that the log files are on one set of physical drives and the databases on another, all of the drives must be configured as disk resources within the same EVS. Specifically, all of the data must be on a shared disk, and all of the physical disk resources must be part of the Exchange cluster group. This allows the log files and the storage group databases to fail over to another node if the EVS goes offline.

Note

The System Attendant should be made dependent on all physical disk resources (drives and volume mount points) that contain Exchange data. This ensures that the System Attendant resource can properly access Exchange data on the physical disk resources of the EVS. If the System Attendant is not dependent on these resources, Exchange resources may start before they have access to read data on the physical disk resources. This can cause the following Exchange database error: -1022 Jet_errDiskIO. For information about the -1022 Exchange database error, see Microsoft Knowledge Base article 314917, "Understanding and analyzing -1018, -1019, and -1022 Exchange database errors."
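The dependency check described in this note amounts to comparing two lists. The following Python sketch models it with hypothetical resource names; it does not query a real cluster, and the disk and dependency sets are assumptions you would replace with the values shown in Cluster Administrator:

    # Minimal sketch (hypothetical data, no real cluster API): confirm that the
    # System Attendant resource depends on every disk resource that holds Exchange data.
    exchange_data_disks = {
        "Disk F: (storage group 1 databases)",
        "Disk G: (storage group 1 logs)",
        "Mount point F:\\SG2\\Database",
    }
    system_attendant_dependencies = {
        "Disk F: (storage group 1 databases)",
        "Disk G: (storage group 1 logs)",
        "Network Name (EVS1)",
    }

    missing = exchange_data_disks - system_attendant_dependencies
    if missing:
        # These disks could be accessed before they are online (-1022 risk).
        print("Add these disks as System Attendant dependencies:", sorted(missing))
    else:
        print("System Attendant depends on all Exchange data disk resources.")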

Storage Group Limitations

Exchange 2003 is limited to four storage groups per server. This is a physical limitation and applies to each node of a cluster as well. This limitation may create problems with active/active configurations but does not affect active/passive configurations.

Note

Active/passive clustering is the strongly recommended configuration for Exchange 2003. For information about why active/passive clustering is recommended, see "Cluster Configurations" in Understanding Exchange Server 2003 Clustering.

To help explain why this storage group limitation affects only active/active clusters, consider a two-node active/active cluster, where one node contains two storage groups and the other node contains three storage groups.

Two-node active/active cluster configuration (five storage groups)

Exchange Virtual Server    State     Storage group names
Node 1: EVS1               Active    storage group 1, storage group 2, storage group 3
Node 2: EVS2               Active    storage group 1, storage group 2

In this table, the Exchange cluster includes a total of five storage groups. If EVS2 on Node 2 fails over to Node 1, Node 1 cannot mount the two storage groups of EVS2, because doing so would exceed the four-storage-group limit. As a result, EVS2 does not come online on Node 1. If Node 2 is still available, EVS2 fails back to Node 2.
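A minimal sketch of this failover arithmetic follows. The storage group counts are the ones from the table above, and the check is purely an illustration rather than anything the Cluster service exposes:

    # Exchange 2003 allows at most four storage groups per server (per cluster node).
    STORAGE_GROUP_LIMIT = 4

    node1_storage_groups = 3  # EVS1, already running on Node 1
    evs2_storage_groups = 2   # EVS2, attempting to fail over from Node 2

    if node1_storage_groups + evs2_storage_groups > STORAGE_GROUP_LIMIT:
        print("EVS2 cannot come online on Node 1: the node would exceed "
              f"{STORAGE_GROUP_LIMIT} storage groups.")
    else:
        print("EVS2 can fail over to Node 1.")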

Note

For backup and recovery purposes, Exchange 2003 does support an additional storage group, called the recovery storage group. However, the recovery storage group cannot be used for cluster node failover purposes. For more information about recovery storage groups, see "New Recovery Features for Exchange 2003" in the Exchange Server 2003 Disaster Recovery Planning Guide.

Drive Letter Limitations

Before you deploy your Exchange 2003 cluster, make sure you have considered the Windows limitation of 26 drive letters per server. If you plan to configure the majority of the server disks as shared cluster resources, the 26-drive-letter limitation applies to the entire cluster, not just to each individual node. Regardless of the number of cluster nodes, the maximum number of shared disks is typically 22, because four drive letters are typically reserved on each node for local drives: the system disk, the floppy disk drives (A and B), and the CD or DVD drive.

Note

If your cluster nodes are running Windows Server 2003, Enterprise Edition or Windows Server 2003, Datacenter Edition, you can use volume mount points to avoid the 26 drive letter limitation. For more information, see "Windows Server 2003 Volume Mount Points" later in this topic.

It is recommended that you use one drive letter for the databases and one for the log files of each storage group. In a four-node cluster with three EVSs, you can have up to 12 storage groups. Therefore, more than 22 drive letters may be needed for a four-node cluster.
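The drive letter arithmetic is easy to check during planning. The following sketch assumes one SMTP/MTA disk per EVS, two drive letters per storage group (databases and logs), and one quorum disk; adjust the assumptions to match your own design:

    SHARED_DISK_LETTERS_AVAILABLE = 22   # 26 letters minus those reserved for local drives

    evs_count = 3                        # four-node cluster, 3 active/1 passive
    storage_groups_per_evs = 4

    letters_needed = evs_count * (1 + 2 * storage_groups_per_evs) + 1  # +1 for the quorum disk
    print(f"Shared disk letters needed: {letters_needed} of {SHARED_DISK_LETTERS_AVAILABLE}")
    if letters_needed > SHARED_DISK_LETTERS_AVAILABLE:
        print("Use volume mount points or combine databases on shared volumes.")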

The following sections provide information about planning your cluster storage solution, depending on whether your operating system is Windows Server 2003 or Windows 2000.

Understanding Windows 2000 Drive Letter Limitations

For certain four-node cluster configurations running Windows 2000 Datacenter Server, you may need to disable one or more drives to make room for more shared disks in the cluster. For example, you may want to disable the CD-ROM or DVD-ROM drives on your servers. Keep in mind that maximizing the number of shared disks in this way reduces the number of drive letters available for mapping network shares.

Note

Because Windows 2000 does not support the use of volume mount points (a form of logical disk) on shared disks, you cannot use volume mount points for your Exchange shared disks with Windows 2000. However, you can use volume mount points for local drives (for example, CD-ROM or DVD drives).

This drive letter limitation is also a limiting factor in how you design the storage group and database architecture for an Exchange cluster. The following sections provide examples of how you can maximize data reliability on your cluster when your shared disks must use drive letters.

Disk Configuration with Three Storage Groups

The configuration shown in the following table is reliable—each storage group (storage group 1, storage group 2, and storage group 3) has a dedicated drive for its databases and a dedicated drive for its log files. An additional disk is used for the EVS SMTP queue directory. However, with this design, you are limited to three storage groups per EVS.

3-active/1-passive cluster architecture with three EVSs, each with three storage groups

Node 1 (EVS1 active)
  Disk 1: SMTP/MTA
  Disk 2: storage group 1 databases
  Disk 3: storage group 1 logs
  Disk 4: storage group 2 databases
  Disk 5: storage group 2 logs
  Disk 6: storage group 3 databases
  Disk 7: storage group 3 logs

Node 2 (EVS2 active)
  Disk 8: SMTP
  Disk 9: storage group 1 databases
  Disk 10: storage group 1 logs
  Disk 11: storage group 2 databases
  Disk 12: storage group 2 logs
  Disk 13: storage group 3 databases
  Disk 14: storage group 3 logs

Node 3 (EVS3 active)
  Disk 15: SMTP
  Disk 16: storage group 1 databases
  Disk 17: storage group 1 logs
  Disk 18: storage group 2 databases
  Disk 19: storage group 2 logs
  Disk 20: storage group 3 databases
  Disk 21: storage group 3 logs

Node 4 (passive)
  Disk 22: Quorum

Disk Configuration with Four Storage Groups

The configuration shown in the following table adds a fourth storage group to each EVS. However, to stay within the 22-disk limit, the databases of the four storage groups per EVS (storage group 1, storage group 2, storage group 3, and storage group 4) are combined across two disks: the database files (.edb and .stm) of storage group 1 and storage group 2 share a common disk volume, and the database files of storage group 3 and storage group 4 share a common disk volume. The benefit of this configuration is that you can use all four storage groups in a four-node cluster. The disadvantages are that the volumes housing the shared storage group databases may need to be large, and that if a database disk fails, two storage groups are affected instead of one.

3-active/1-passive cluster architecture with three EVSs, each with four storage groups

Node 1 (EVS1 active)
  Disk 1: SMTP/MTA
  Disk 2: storage group 1 and storage group 2 databases
  Disk 3: storage group 1 logs
  Disk 4: storage group 2 logs
  Disk 5: storage group 3 and storage group 4 databases
  Disk 6: storage group 3 logs
  Disk 7: storage group 4 logs

Node 2 (EVS2 active)
  Disk 8: SMTP
  Disk 9: storage group 1 and storage group 2 databases
  Disk 10: storage group 1 logs
  Disk 11: storage group 2 logs
  Disk 12: storage group 3 and storage group 4 databases
  Disk 13: storage group 3 logs
  Disk 14: storage group 4 logs

Node 3 (EVS3 active)
  Disk 15: SMTP
  Disk 16: storage group 1 and storage group 2 databases
  Disk 17: storage group 1 logs
  Disk 18: storage group 2 logs
  Disk 19: storage group 3 and storage group 4 databases
  Disk 20: storage group 3 logs
  Disk 21: storage group 4 logs

Node 4 (passive)
  Disk 22: Quorum

Windows Server 2003 Volume Mount Points

Volume mount points are now supported on shared disks when your cluster nodes (four nodes or more) are running Windows Server 2003, Enterprise Edition or Windows Server 2003, Datacenter Edition. Volume mount points (also known as NTFS junction points or mounted drives) are directories that point to specified disk volumes in a persistent manner. (For example, you can configure C:\Data to point to a disk volume.) Volume mount points eliminate the need to associate each disk volume with a drive letter, thereby overcoming the 26-drive-letter limitation.

Mount points are useful for large Exchange clusters (for example, four-node or eight-node clusters) that cannot provide a sufficient number of drive letters to deliver the best performance and reliability. For information about how mount points can be used to reduce the number of drive letters, see Using Clustering with Exchange 2003: An Example.

When installing volume mount points in clusters, consider the following:

  • Make sure that you create unique volume mount points so that they do not conflict with existing local drives on any cluster node.

  • Do not create volume mount points between disks on the cluster storage device (cluster disks) and local disks.

  • Do not create volume mount points from the cluster disk that contains the quorum disk resource. You can, however, create a volume mount point from the quorum disk resource to a clustered disk.

  • Volume mount points from one cluster disk to another must be in the same cluster resource group and must be dependent on the root disk. Specifically, the volume mount point disk will not come online unless the root disk is first online. Setting this dependency prevents time-outs and failures when starting.

It is recommended that you use volume mount points with Exchange 2003 clusters that have four or more nodes. Use one root disk per storage group: place the logs on the root disk and place the databases on the mounted drives. If there are not enough drive letters available (for example, in an eight-node cluster), you can instead use a single root disk per EVS; in that case, to minimize the risk of data loss if a disk fails, do not store data on the root disk itself. You need one root disk for each EVS.
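The following sketch prints one possible layout for the single-root-disk-per-EVS variant described above: the root disk keeps its drive letter, and each storage group's database and log volumes are mounted under folders on that root. The drive letter, folder names, and storage group count are illustrative assumptions, not a prescribed configuration:

    # Illustrative only: one lettered root disk per EVS, with each storage group's
    # database and log volumes mounted under folders on that root (no data stored
    # on the root disk itself).
    def mount_point_layout(evs_name, root_letter, storage_group_count):
        layout = {}
        for n in range(1, storage_group_count + 1):
            layout[f"{root_letter}:\\{evs_name}\\SG{n}\\Database"] = "mounted volume"
            layout[f"{root_letter}:\\{evs_name}\\SG{n}\\Logs"] = "mounted volume"
        return layout

    for path, volume_type in sorted(mount_point_layout("EVS1", "F", 4).items()):
        print(f"{path:<30} -> {volume_type}")

    # Drive letters consumed by this EVS: 1 (the root disk) instead of up to 9.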

For more information about support for mount points, see Microsoft Knowledge Base article 318458, "Volume Mount Point Support for an Exchange Server 2003 Cluster on a Windows Server 2003-based System."


Performance and Scalability Considerations

This section discusses the following performance and scalability aspects of server clustering:

  • Sizing active/passive clusters

  • Sizing active/active clusters

  • Scaling up or scaling out

  • Testing clustered server components

Important

Just as virtual memory fragmentation affects stand-alone (non-clustered) servers, it also affects Exchange cluster nodes (especially active/active cluster nodes). For tuning and monitoring information that can help you manage virtual memory fragmentation in a cluster, see "Managing Exchange Clusters" in the Exchange Server 2003 Administration Guide.

For more information about performance and scalability in Exchange 2003, see the Exchange Server 2003 Performance and Scalability Guide.

Sizing Active/Passive Clusters

Just as you would for stand-alone servers, you need to size your active/passive clusters.

Note

Before deploying your clustered servers, it is recommended that you test your sizing metrics in a laboratory environment. To perform these tests, you can use Exchange tools such as Exchange Server Load Simulator 2003 (LoadSim) and Jetstress. For information about the importance of laboratory testing and pilot deployments, see "Laboratory Testing and Pilot Deployments" in System-Level Fault Tolerant Measures.

Sizing Active/Active Clusters

Active/active clusters are not a recommended configuration for Exchange clusters. However, if you decide to implement active/active clustering, remember that Exchange supports only two-node active/active clusters. Also, with active/active clusters, there are two important constraints to consider:

  • The number of concurrent user connections per node cannot exceed 1,900. If you have more than one EVS per node, make sure that the sum of all concurrent MAPI user connections is less than 1,900.

  • The average CPU load per server cannot exceed 40 percent.

If these requirements are not met, your users may notice a significant decrease in performance after a failover. In addition, there is the risk that a single node of the active/active cluster may not be able to bring the second EVS online.
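A minimal sketch of these two checks follows. The per-node measurements are illustrative assumptions; in practice they would come from the monitoring counters described later in this section:

    MAX_CONCURRENT_MAPI_CONNECTIONS = 1900   # per node, summed across all EVSs on the node
    MAX_AVERAGE_CPU_PERCENT = 40             # per node

    nodes = {
        "Node 1": {"concurrent_mapi_connections": 1750, "average_cpu_percent": 38},
        "Node 2": {"concurrent_mapi_connections": 2050, "average_cpu_percent": 45},
    }

    for name, stats in nodes.items():
        problems = []
        if stats["concurrent_mapi_connections"] > MAX_CONCURRENT_MAPI_CONNECTIONS:
            problems.append("too many concurrent connections")
        if stats["average_cpu_percent"] > MAX_AVERAGE_CPU_PERCENT:
            problems.append("average CPU load above 40 percent")
        print(f"{name}: {', '.join(problems) if problems else 'within active/active limits'}")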

Note

Before deploying your clustered servers, it is recommended that you test your sizing metrics in a laboratory environment. To perform these tests, you can use Exchange tools such as Exchange Server Load Simulator 2003 (LoadSim) and Jetstress. For information about the importance of laboratory testing and pilot deployments, see "Laboratory Testing and Pilot Deployments" in System-Level Fault Tolerant Measures.

Monitoring Considerations for Active/Active Clusters

After you deploy your active/active cluster, you must do the following:

  • Monitor the CPU load for each cluster node.

  • Monitor the number of concurrent connections (users) per node.

Note

Consider monitoring these values during peak e-mail usage intervals. That way, if a failover is required during a peak e-mail period, you will know if the single node can run both EVSs. Also, you can monitor a counter manually in real time, or you can use it to compile a report during a specified period (for example, during a two-hour peak e-mail interval).

Monitoring CPU Loads for Each Cluster Node

If the CPU load generated by users exceeds 40 percent for more than 10 minutes, move mailboxes off the server. This figure does not include administrative load increases (for example, the load generated by moving mailboxes).

To monitor the CPU load for each node in the active/active cluster, use the following Performance Monitor (Perfmon) counter:

Processor(_Total)\% Processor Time

Note

Do not be concerned with brief spikes in CPU usage. It is normal for a server's CPU load to spike beyond 80 or even 90 percent occasionally.

Monitoring Concurrent Connections (Users) per Node

If the number of concurrent users per node exceeds 1,900 for more than 10 minutes, move mailboxes off the EVS. Although you can meet this requirement by placing no more than 1,900 mailboxes on each EVS in your active/active cluster, it is generally recommended that you monitor the number of concurrent MAPI users per server, because some users may make multiple connections to their mailboxes.

To monitor the number of concurrent users per node, use one or both of the following Perfmon counters:

  • MSExchangeIS\Active Connection Count

  • MSExchangeIS Mailbox(_Total)\Active Client Logons

Note

These counters provide somewhat different results, and they count Outlook Web Access connections differently from Outlook connections. To understand how the server is being used, monitor the changes in these counters during a typical work day.
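The "for more than 10 minutes" rule in the two preceding subsections is easy to apply to sampled counter values. The following sketch assumes one sample per minute exported from a Performance Monitor counter log; the sample values themselves are illustrative:

    SAMPLE_INTERVAL_MINUTES = 1

    def sustained_breach(samples, threshold, minutes=10):
        """Return True if consecutive samples stay above threshold for more than `minutes`."""
        needed = minutes // SAMPLE_INTERVAL_MINUTES
        consecutive = 0
        for value in samples:
            consecutive = consecutive + 1 if value > threshold else 0
            if consecutive > needed:
                return True
        return False

    cpu_samples = [35, 42, 55, 61, 58, 47, 44, 43, 52, 49, 46, 45, 41]      # % Processor Time
    connection_samples = [1820, 1905, 1950, 1980, 2010, 1995, 1960, 1940,
                          1930, 1925, 1915, 1910]                           # Active Connection Count

    print("Move mailboxes (CPU load):", sustained_breach(cpu_samples, 40))
    print("Move mailboxes (connections):", sustained_breach(connection_samples, 1900))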

Scaling Up or Scaling Out

When considering how you can accommodate more users (or perhaps more messages per user) in your clustered environment, one option is to scale up. Scaling up refers to the process of using more powerful server components on your cluster nodes to meet increased performance demands. However, keep in mind that as you scale up the hardware on your cluster nodes (for example, so that you can host more users on each node), the availability of each node becomes significantly more important, because a single node failure then affects more users.

An alternative to scaling up is to scale out. Scaling out refers to the process of adding nodes to a cluster.

To explain these two options, consider an organization that hosts 3,000 users on a four-node cluster. The cluster has three active nodes (1,000 users per node) and one passive node. If the organization needs to accommodate an additional 1,000 users, it has two options:

  • Option 1: Scale up   Specifically, upgrade the RAM and CPUs on each of the cluster nodes and then distribute the additional 1,000 users evenly among the active nodes.

  • Option 2: Scale out   Specifically, add an additional node to the cluster. This changes the cluster configuration to a five-node cluster with four active nodes, each active node hosting 1,000 mailboxes.

In this example, if a disaster causes one of the servers to fail, implementing Option 2 would affect fewer users. Therefore, when deploying Exchange in a cluster, consider scaling out as part of your scalability plan.
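The arithmetic behind this comparison is shown below; the user counts are the ones used in the example above:

    total_users = 4000  # 3,000 existing users plus 1,000 new users

    users_per_node_scale_up = total_users / 3    # Option 1: three active nodes
    users_per_node_scale_out = total_users / 4   # Option 2: four active nodes

    print(f"Scale up:  a single node failure affects about {users_per_node_scale_up:.0f} users")
    print(f"Scale out: a single node failure affects {users_per_node_scale_out:.0f} users")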

Scaling out can also increase the fault tolerance of your Exchange cluster. For example, a four-node, 2-active/2-passive cluster can handle more simultaneous failures than a four-node, 3-active/1-passive cluster. For more information about active/passive clustering, see "Active/Passive Clustering" in Understanding Exchange Server 2003 Clustering.

Testing Clustered Server Components

Before you deploy your clustered servers in a production environment, it is important that you test their capacity. The tools you use to test your cluster deployment are the same ones you use to test non-clustered servers (for example, LoadSim and Jetstress). For information about the importance of laboratory testing and pilot deployments, see "Laboratory Testing and Pilot Deployments" in System-Level Fault Tolerant Measures.

The following lists provide testing considerations specific to server clustering.

Test the following hardware components:

  • Individual computer components such as hard disks, controllers, processors, and RAM

  • External components such as routers, bridges, switches, cabling, and connectors

Set up the following stress tests:

  • Test cluster performance under heavy network loads

  • Test cluster performance under heavy disk input/output (I/O) to the same disk

  • Test cluster performance under heavy Exchange services load

  • Test cluster performance under a large number of simultaneous logon attempts

  • Fail over each EVS at least once to each of the nodes. Do this under heavy Exchange services load

Use the output from these tests to:

  • Calculate the client response time for the server configuration under client load.

  • Estimate the number of users per server.

  • Identify bottlenecks on the server.

Cluster Hardware Compatibility

For Windows Server 2003, Enterprise Edition and Windows Server 2003, Datacenter Edition, Microsoft supports only complete server cluster systems selected from the Windows Server Catalog.

The support for third-party system components is limited based on the requirements of the third-party solutions. For more information, see Microsoft Knowledge Base article 814607, "Microsoft Support for Server Clusters with 3rd Party System Components."

In general, it is recommended that you use identical hardware (for example, identical processors, identical NICs, and the same amount of RAM) for each cluster node. For more information about why this is recommended, and when you may want to consider using asymmetrical hardware in your cluster nodes, see "Cluster Configurations" in Understanding Exchange Server 2003 Clustering.

Note

For a geographically dispersed cluster, both the hardware and software configuration must be certified and listed in the Windows Server Catalog. For information about hardware compatibility for geographically dispersed clusters, see "Qualified Configurations for Geographically Dispersed Clusters" later in this topic.

For more information about cluster hardware, see Microsoft Knowledge Base article 309395, "The Microsoft support policy for server clusters, the Hardware Compatibility List, and the Windows Server Catalog."

Geographically Dispersed Clustering

The main goal of a geographically dispersed cluster is to ensure that loss of one site does not cause a loss of the complete application. Geographically dispersed clustering provides enhanced availability and recoverability for Exchange e-mail services. (However, the cluster nodes in the alternate recovery site do not provide Exchange services unless a site failure occurs.) Moreover, in the event of a site disaster, geographically dispersed clusters provide fault tolerance and failover for specific applications. Many hardware and software solutions exist for Exchange geographically dispersed clustering that provide business continuity at both the site and cluster level.

As you plan your geographically dispersed cluster solution, be sure that you have answers to the following questions:

  • What main issues must my geographically dispersed cluster address?

  • What are the qualified configurations for a geographically dispersed cluster?

  • What Cluster service requirements must be met by a geographically dispersed clustering solution?

The remainder of this section provides information about each of these questions.

For general information about how geographically dispersed clustering helps provide fault tolerance for your Exchange 2003 organization, see "Using Multiple Physical Sites" in System-Level Fault Tolerant Measures.

Issues that a Geographically Dispersed Cluster Must Address

A geographically dispersed cluster must address the following issues:

  • How can you make sure that multiple sites have independent copies of the same data? How are data changes replicated across sites? If data is changed at one site, and that site fails, how are those changes transmitted to the remaining sites?

  • If one site fails, how can an application such as Exchange 2003 continue to provide Exchange services?

  • How can you make sure your geographically dispersed clusters are protected from natural disasters?

Solving the first issue does not present much of a problem for read-only data: you can easily copy read-only data and host an instance of it at each site. For data that changes, you can implement software- or hardware-based mirroring or synchronous replication. These replication techniques enable you to keep a current mirror of the data at each physical site.

To solve the second issue, you must implement a failover clustering solution. For this solution to work, the cluster nodes in separate physical sites must appear to the Cluster service as being on the same network. You can accomplish this by using virtual local area networks (VLANs). VLANs allow you to connect nodes in separate physical locations over long distances.

To solve the third issue, make sure your sites are spaced far enough apart so that a natural disaster would not impact more than one site. Each site should have completely different power sources and different communications infrastructure providers.

The following figure illustrates a basic geographically dispersed cluster with these solutions in place.

Basic geographically dispersed cluster topology


Qualified Configurations for Geographically Dispersed Clusters

A geographically dispersed cluster is a combination of hardware and software components supplied by original equipment manufacturers (OEMs) and software vendors. Exchange 2003 geographically dispersed cluster configurations can be complex, and the clusters must use only components supported by Microsoft. Geographically dispersed clusters should be deployed only in conjunction with vendors who provide qualified configurations.

In general, the restrictions that apply to Windows Server 2003 geographically dispersed clusters also apply to Exchange 2003. For detailed information about geographically dispersed clusters in Windows Server 2003, see Geographically Dispersed Clusters in Windows Server 2003.

The hardware in a geographically dispersed cluster must be qualified and must appear on the Microsoft Hardware Compatibility List. A separate Hardware Compatibility List exists for geographically dispersed clusters; see the Windows Server Catalog.

Note

You can create geographically dispersed clusters by adding data-replication software and extended LAN hardware to existing certified configurations. However, these solutions radically change the nature of a pre-certified configuration, particularly with respect to timing and latency. To be supported by Microsoft, the hardware and software configuration of a geographically dispersed cluster must be certified and listed on the cluster Hardware Compatibility List.

For additional information about the Hardware Compatibility List and Windows Clustering, see Microsoft Knowledge Base article 309395, "The Microsoft support policy for server clusters, the Hardware Compatibility List, and the Windows Server Catalog."

Cluster Service Technology Requirements

Windows Clustering software is unaware of the extended nature of geographically dispersed clusters. Specifically, the Cluster service does not include features that are unique to geographically dispersed cluster configurations. Therefore, the network and storage architecture of geographically dispersed clusters must meet the following requirements:

  • The private and public network connections must be in the same subnet (non-routed LAN). To implement this, use VLANs to ensure that all cluster nodes are on the same IP subnets.

  • The network connections must be able to provide a guaranteed round-trip latency between nodes of no more than 500 milliseconds. The cluster uses heartbeats to detect whether a node is alive or not responding. These heartbeats are sent on a periodic basis (every 1.2 seconds). If a node takes too long to respond to heartbeat packets, the Cluster service starts a heavyweight protocol to determine which nodes are still functional and which are unavailable. This is known as a cluster regroup. (A rough way to check the latency requirement is shown in the sketch after this list.)

  • If you are using a standard quorum (also known as a single quorum device), the cluster must have a single shared disk (known as the quorum disk).

    Note

    If you are running Exchange 2003 on Windows Server 2003, you can avoid this requirement by using the majority node set quorum. For more information about quorum types, see "Quorum Disk Resource" in Understanding Exchange Server 2003 Clustering.

    To make a set of disks in two separate sites appear to the Cluster service as a single disk, the storage infrastructure can provide mirroring across the sites. However, it must preserve the fundamental semantics that are required by the physical disk resource:

    • The Cluster service uses Small Computer System Interface (SCSI) reserve commands and bus reset to arbitrate for and protect the shared disks. The semantics of these commands must be preserved across the sites, even if communication between the sites fails. If a node on Site A reserves a disk, nodes on Site B should not be able to access the contents of the disk. To avoid data corruption of cluster and application data, these semantics are essential.

    • The quorum disk must be replicated in real-time, synchronous mode across all sites. The different members of a mirrored quorum disk must contain the same data.
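The following sketch is a rough way to check the 500-millisecond requirement from an administrative workstation or a cluster node. It measures TCP connection setup time to a remote node, which is only an approximation of what the cluster heartbeat experiences; the host name and port are placeholders for a service that is reachable at the remote site:

    import socket
    import time

    def worst_tcp_round_trip_ms(host, port, attempts=5, timeout=2.0):
        """Return the slowest TCP connect time (in milliseconds) over several attempts."""
        samples = []
        for _ in range(attempts):
            start = time.perf_counter()
            with socket.create_connection((host, port), timeout=timeout):
                pass
            samples.append((time.perf_counter() - start) * 1000)
        return max(samples)  # the guarantee is about the worst case, not the average

    worst_ms = worst_tcp_round_trip_ms("node2.remote-site.example.com", 445)
    print(f"Worst observed round trip: {worst_ms:.1f} ms")
    if worst_ms > 500:
        print("Exceeds the 500 ms limit for geographically dispersed clusters.")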

Disaster Recovery Strategies for Clusters

For information about disaster recovery strategies specific to Exchange 2003 clusters, see "Backing up Exchange 2003 Clusters" and "Restoring Exchange 2003 Clusters" in the Exchange Server 2003 Disaster Recovery Operations Guide.