How OTG Configures the Exchange Server Cluster

Applies To: Windows Server 2003, Windows Server 2003 R2, Windows Server 2003 with SP1, Windows Server 2003 with SP2

This section describes the hardware, software, and server cluster configuration that the Operations and Technology Group (OTG) uses to manage messaging within Microsoft. Because the purpose of this document is to focus on the details of the clustering implementations, only limited information is provided regarding general network hardware configuration, Exchange 2003 configurations, and other configuration settings that are not specifically related to server clusters. However, each server cluster setting is discussed in depth.

  • For information about hardware configuration, see Hardware in the Server Cluster.

  • For information about the software configuration, see Software in the Server Cluster.

  • For information about the server cluster configuration, see Server Cluster Settings.

Hardware in the Server Cluster

This section describes the hardware configuration for each of the following components of the server cluster:

  • Active and Primary Passive Nodes

  • Alternate Passive Nodes

  • Network Adapters

  • Storage Components

Active and Primary Passive Nodes

The four active nodes and the primary passive node in the server cluster all use the same hardware configuration, which is based on the HP ProLiant DL580 G2 server with the following specification. Each active node supports 4,000 mailboxes; because there are four active nodes, the server cluster supports approximately 16,000 mailboxes.

Processors

Quad 1.90-GHz processors (HP DL580 G2), hyperthreading enabled

Memory

4 GB of error-correcting code (ECC) RAM

Controller

Integrated Smart Array 5i array controller

Hard disk drives

2 x 36-GB drives (RAID-1):

C$ Windows Server 2003, Enterprise Edition

D$ Exchange application

Explanation

OTG chose this hardware to provide optimized performance during peak usage periods, which typically occur on Monday mornings. OTG enabled hyperthreading on these nodes, which has increased node performance by 20 to 25 percent. (Hyperthreading is a technology developed by Intel that enables multithreaded software applications to execute threads in parallel on a single physical processor instead of processing threads in a linear fashion.) This increased performance also allows each node to manage more mailboxes.

The hardware for the primary passive node was chosen for the same reasons as the active node hardware. In the event of a failover, the passive node needs to perform exactly like an active node in order to maintain dependable messaging.

Note

Users who are implementing Exchange 2003 or Windows Server 2003 should refer to the Windows Server Catalog (https://go.microsoft.com/fwlink/?LinkId=4303) before buying hardware to ensure compatibility. The Windows Server Catalog is the online source for customers and partners to find hardware and software that is compatible with the Windows Server 2003 family.

Alternate Passive Nodes

The two alternate passive nodes in the server cluster have a hardware configuration that is based on the HP ProLiant DL380 G2 dual-processor server with the following specification.

Processors

Dual 2.4-GHz processors (HP DL380 G2)

Memory

2 GB of RAM

Controller

Integrated Smart Array 5i array controller

Hard disk drives

2 x 36-GB drives (RAID-1):

C$ Windows Server 2003, Enterprise Edition

D$ Exchange application

Explanation

As discussed earlier in this paper, OTG chose hardware for the alternate nodes to increase resource efficiency. With this hardware, the alternate passive nodes can be used to facilitate the backup process and to apply updates to the system.

For more information about why OTG uses alternate passive nodes, see Reasons for Using Alternate Passive Nodes.

Note

Users who are implementing Exchange 2003 or Windows Server 2003 should refer to the Windows Server Catalog (https://go.microsoft.com/fwlink/?LinkId=4303) before buying hardware. The Windows Server Catalog is the online source for customers and partners to find hardware and software that is compatible with the Windows Server 2003 family.

Network Adapters

Each node in the server cluster has two network adapters installed. OTG uses NC7770 Gigabit Server Adapters, which are connected to 100-Mbps switches.

The following settings appear on the General tab that is available when you configure or view the properties of the adapters in Cluster Administrator. You can use Cluster Administrator to configure a cluster and its nodes, resource groups, and resources.

Adapter 1

This network adapter enables node-to-node communication. It is connected to the private network that is used exclusively in the cluster. Adapter 2 carries all other communication.

Configuration setting Explanation

Name:

Internal

OTG has changed the default name for this network connection to clearly identify that it is used for internal communication within the network.

Enabled for cluster use is checked.

This adapter is available for the cluster to use.

Adapter 2

This network adapter enables both node-to-node (internal) and client-to-cluster (public) communication. Because this network adapter carries both kinds of communication, it can serve as a backup for Adapter 1.

Configuration setting Explanation

Name:

Internal + Public

OTG has changed the default name for this network connection to clearly identify that it is used for all communication within the network.

Enabled for cluster use is checked.

This adapter is available for the cluster to use.

Note

It is important to configure the private network as the highest priority for internal cluster communication in Cluster Administrator. It is also important to make sure that the public network is first in the TCP/IP binding order.

Explanation

OTG uses two networks to ensure network availability for internal cluster communications. Both adapters are configured with static IP addresses. The use of dynamic IP addresses is not recommended in server clusters.
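These roles can also be inspected or set from the command line with Cluster.exe. The following is a minimal sketch, assuming the network names shown above; Role=1 restricts a network to internal cluster communication, and Role=3 allows all (mixed) communication. Note that the priority order of the networks for internal communication is still set on the cluster's Network Priority tab.

    cluster network "Internal" /prop Role=1
    cluster network "Internal + Public" /prop Role=3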

Note

The nodes of a cluster must be on the same subnet, but you can use virtual LAN (VLAN) switches on the interconnects between two nodes. (An interconnect is a private network that connects nodes in a cluster). If you use a VLAN, the point-to-point, round-trip latency must be less than 1/2 second and the link between two nodes must appear as a single point-to-point connection from the perspective of the Windows operating system running on the nodes. To avoid single points of failure, use independent VLAN hardware for the different paths between the nodes.

Storage Components

This section discusses the storage design for the Exchange server cluster and the reasons for this configuration.

The following terminology is used in this section and throughout the document.

  • Disk group. A group of physical disks. Virtual disks are created across the disk groups.

  • Storage group. An Exchange 2003 component that supports up to five databases (mail stores) using either a single data device or multiple data devices with a common log device.

  • Virtual server. A collection of services that appear to clients as a physical Windows-based server, but are not associated with a specific server. All virtual servers must include a Network Name resource and an IP Address resource.

  • Storage area network (SAN). A high-speed network that provides a direct connection between servers and storage, including shared storage, clusters, and disaster-recovery devices.

All of the nodes in the cluster are connected to two SANs, which provide communication between the nodes and the two shared storage components.

The shared storage components are two redundant Enterprise Virtual Arrays (HP StorageWorks EVA 5000s). Diagram 1 illustrates the relationship of the nodes and the storage components. Diagram 3 describes the storage configuration for each Exchange virtual server. OTG configured the two EVAs to support four Exchange virtual servers and a total of 16,000 200-MB mailboxes.

The following table summarizes the storage configuration for each EVA. Each EVA has five disk groups that are characterized by different numbers of hard disks, data types, and Vraid types.

EVA disk group    Number of 72-GB disks    Data type     Vraid type
DG1               48                       Data/SMTP     1
DG2               48                       Data/SMTP     1
DG3               14                       Log/Quorum    1
DG4               28                       Backup        5
DG5               28                       Backup        5

Explanation

OTG designed this cluster for optimum performance to support production requirements at peak loads. They also strived to ensure maximum availability by eliminating as many single points of failure within the environment as possible. OTG achieved optimum performance by scaling the servers to support a realistic number of mailboxes and by optimizing the SAN design for peak disk transfers (4-KB random read/write activity). OTG analyzed the I/O requirements for each 200-MB mailbox and determined that each mailbox would generate between 1.0 and 1.2 disk transfers per second at peak load (this figure is specific to the user environment at Microsoft).

With a plan to support 4000 mailboxes for each Exchange virtual server, OTG determined that each host would need to sustain 4800 disk transfers per second and also have the ability to support any unexpected increase in demand.
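To make the arithmetic explicit: 4,000 mailboxes × 1.2 disk transfers per second per mailbox = 4,800 disk transfers per second for each Exchange virtual server at peak load.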

OTG needed to optimize the SAN design in order to anticipate and account for peak disk transfers while maintaining optimum read and write latencies. They also needed to ensure that Exchange would not encounter any bottlenecks.

After evaluating the performance of the HP StorageWorks EVA 5000, OTG found that the EVA supported 12,000 disk transfers per second in an optimized configuration with acceptable latencies, even under an aggressive 4-KB random I/O load. As a result, OTG implemented two EVA 5000s per cluster. Together, these EVAs support four Exchange virtual servers and a total of 16,000 200-MB mailboxes.

OTG uses HP StorageWorks Secure Path, a multipathing application, to manage and maintain continuous data access to the storage system. Specifically, they use Secure Path to help optimize disk I/O activity by ensuring disk distribution across the Fibre Channel adapters (FCAs) and SAN controllers on each node. (Fibre Channel adapters are expansion cards that connect the nodes to SANs.) Disk distribution has proven beneficial in maintaining an optimized environment: it has greatly reduced peak read/write disk latency and has improved online backup to disk.

OTG also uses Secure Path to help eliminate single points of failure between the nodes and the storage. Secure Path helps maintain I/O even if a path fails. In the event of a path failure, Secure Path detects the failure and moves disks from the failed path to an available path. This process is called failover. Failover keeps the resources available with no downtime. Once the failed components are replaced, the disks can be failed back using HP's Secure Path Manager to restore optimized I/O.


Diagram 3 Storage Configuration for Each Exchange Virtual Server

Diagram 3 depicts the storage configuration for one of the Exchange virtual servers. OTG configured the two EVAs to support four Exchange virtual servers and a total of 16,000 200-MB mailboxes. Each Exchange virtual server hosts four storage groups with five databases each. Each database is configured with a 200-MB limit per mailbox. This design specifies a maximum of 200 mailboxes per database, so each Exchange virtual server contains 4,000 mailboxes.
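To make the arithmetic explicit: 4 storage groups × 5 databases = 20 databases per Exchange virtual server, and 20 databases × 200 mailboxes per database = 4,000 mailboxes per Exchange virtual server, so the four Exchange virtual servers together hold 16,000 mailboxes.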

For additional information about the storage configuration for each virtual server, see Appendix A. For more information about Enterprise Virtual Arrays, see Microsoft Exchange 2003 Storage Design for the EVA5000 and other related white papers in the Storage Array Systems White Papers section of the Hewlett-Packard Web site (https://go.microsoft.com/fwlink/?LinkId=19716).

Software in the Server Cluster

The nodes in the server cluster have the following software installed.

Node Type Software

Active node and primary passive node

  • Windows Server 2003, Enterprise Edition

  • Exchange 2003

  • Microsoft Operations Manager (MOM) Agent

  • HP StorageWorks Secure Path

Alternate passive node

  • Windows Server 2003, Enterprise Edition

  • Exchange 2003

  • Microsoft Operations Manager (MOM) Agent

  • HP StorageWorks Secure Path

  • Veritas Backup Exec

OTG has made the following operating system modifications on each node:

  • /3GB switch set in the Boot.ini file (see the sample Boot.ini entry after this list).

  • /USERVA=3030 parameter set in the Boot.ini file.

  • In the registry, SystemPages is set to 0.

  • Cluster event log replication is disabled (set to 0). Because OTG uses Microsoft Operations Manager to manage their clusters, event replication across all nodes in the cluster is unnecessary.

  • Mount points are used to support the log, SMTP, and backup drives in order to reduce the number of drive letters used.
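As an illustration, a Boot.ini [operating systems] entry with both switches might look like the following; the ARC path and description are placeholders, not OTG's actual values. The SystemPages value mentioned above is located under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management in the registry.

    [operating systems]
    multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Enterprise" /fastdetect /3GB /USERVA=3030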

Server Cluster Settings

This section describes how OTG has configured the settings for the Exchange server cluster and why OTG chose those settings. It also explains why some settings are specific to OTG and are not recommended for typical customer deployments. This section covers the settings that are configured for the server cluster as a whole, the settings that are specific to each resource group in the cluster, and each of the resource configurations.

The server cluster contains four Exchange virtual servers, four backup virtual servers, and one cluster virtual server. A virtual server is a collection of services that appear to clients as a physical Windows-based server, but are not associated with a specific server. All virtual servers must include a Network Name resource and an IP Address resource.

Each of the virtual servers contains the following specified resources. For server clusters, a resource is a physical or logical entity that is capable of being managed by a cluster, brought online and taken offline, and moved between nodes. A resource can be owned only by a single node at any point in time.

Virtual Server Resources

Exchange virtual servers

  • Nine Physical Disk resources

  • One Network Name resource

  • One IP Address resource

  • The following Exchange 2003 resources:

    • Exchange Message Transfer Agent instance (this resource is only in the first Exchange instance in the cluster)

    • Exchange Routing Service instance

    • SMTP virtual server instance

    • Exchange HTTP virtual server instance

    • Exchange IMAP4 virtual server instance

    • Exchange POP3 virtual server instance

    • Search Service instance

    • Exchange System Attendant

    • Exchange Information Store

Backup virtual servers

  • Four Physical Disk resources

  • Four File Share resources

  • One IP Address resource

  • One Network Name resource

Cluster virtual server

  • One Physical Disk resource

  • One IP Address resource

  • One Network Name resource

  • One Distributed Transaction Coordinator resource

  • One Majority Node Set resource

The following sections describe how OTG has configured the settings in Cluster Administrator. You can use Cluster Administrator to configure a cluster and its nodes, resource groups, and resources.

  • Cluster Properties

  • Exchange Virtual Server Properties

  • Backup Virtual Server Properties

  • Cluster Virtual Server Properties

  • Resource Settings

Cluster Properties

This section describes how OTG has configured the server cluster as a whole.

The following settings appear on the Quorum tab that is available when you configure or view the cluster properties.

Configuration setting Explanation

Quorum resource:

Majority Node Set

In a majority node set server cluster, each node maintains its own copy of the cluster configuration data. The Majority Node Set resource, acting as the Quorum Resource, ensures that the cluster configuration data is kept consistent across the different nodes. The majority node set cluster is only meant for use in targeted scenarios like a geographically dispersed cluster, which is a cluster that spans multiple sites.

OTG has used a majority node set server cluster primarily for validation and testing purposes. Majority node set clusters also reduce the likelihood that a single point of failure in certain hardware devices (such as host bus adapters or Fibre Channel switches) will cause the entire cluster to fail.

Important
Do not configure your cluster as a majority node set cluster unless you have a specific need to do so and it is part of a cluster solution offered by your Original Equipment Manufacturer (OEM), Independent Software Vendor (ISV), or Independent Hardware Vendor (IHV). The single quorum device server cluster will still cover the vast majority of your cluster deployment needs.
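For reference, a majority node set cluster remains operational only while a majority of its nodes are running: a cluster of n nodes requires floor(n/2) + 1 running nodes. In this seven-node cluster (four active nodes, one primary passive node, and two alternate passive nodes), at least four nodes must stay up, so the cluster can survive the loss of up to three nodes.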

Reset quorum log at:

4096 KB

The maximum size of the quorum log is 4096 KB. This is enough space to hold the cluster configuration information, such as which servers are part of the cluster, what resources are installed in the cluster, and what state those resources are in (for example, online or offline).

The following settings appear on the Network Priority tab that is available when you configure or view the cluster's properties.

Configuration setting Explanation

Networks used for internal cluster communications:

  • Internal

  • Internal + Public

The Internal network has the highest priority. The Internal + Public network will only be used if the Internal network becomes unavailable.

The following settings appear on the Security tab that is available when you configure or view the cluster's properties.

Configuration setting Explanation

Group or user names:

  • Administrators: Full Control

  • Network Service: Full Control

  • System: Full Control

These groups have permission to alter the configuration of this cluster.

Exchange Virtual Server Properties

This section describes how OTG has configured the Exchange virtual servers in the server cluster.

OTG has configured four Exchange virtual servers, which each host four storage groups. These storage groups contain a total of 20 databases. Each database is configured with a 200-MB limit per mailbox. This design specifies a maximum of 200 mailboxes per database, so each Exchange virtual server can contain a maximum of 4,000 mailboxes.

The following settings appear on the General tab that is available when you configure or view properties for the Exchange virtual servers.

Configuration setting Explanation

Preferred owners:

  • Active node 1

  • Primary passive node

  • Alternate passive node 1

  • Alternate passive node 2

The Exchange virtual server will fail over to the indicated nodes in the specified order. If the active node fails, the Exchange virtual server will fail over to the primary passive node. If the primary passive node is unavailable, the Exchange virtual server will fail over to the alternate passive nodes. If none of the nodes are available, this resource group will fail over to a randomly selected node (this also depends on the possible owner settings for its resources).
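For reference, an ordered preferred-owners list such as this one can also be set with Cluster.exe. The group and node names below are hypothetical placeholders, not OTG's actual names:

    cluster group "EVS1" /setowners:ActiveNode1,PrimaryPassive,AltPassive1,AltPassive2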

The following settings appear on the Failover tab that is available when you configure or view properties for the Exchange virtual servers.

Configuration setting Explanation

Threshold set to 10

Period set to 6

The Exchange virtual server is allowed to fail 10 times within 6 hours before it is left in a failed state. The 11th time a resource in the resource group fails, the Cluster service fails all other resources in the group and leaves the entire group offline instead of failing over the group.

The following setting appears on the Failback tab that is available when you configure or view properties for the Exchange virtual servers.

Configuration setting Explanation

Prevent Failback is checked.

The Exchange virtual server will not fail back when a failed node returns to operation. Instead, the Exchange virtual server will continue to run on the alternate node after the failed node comes back online. This requires administrator intervention to move the resource group back to the original, preferred node. OTG prefers this method because the administrator has complete control over when the failback occurs and can coordinate this to give users the least amount of downtime.
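These failover and failback settings correspond to the FailoverThreshold, FailoverPeriod, and AutoFailbackType common properties of the resource group, and they can be scripted with Cluster.exe. The following is a minimal sketch using a hypothetical group name; FailoverPeriod is expressed in hours, and AutoFailbackType=0 prevents failback:

    cluster group "EVS1" /prop FailoverThreshold=10
    cluster group "EVS1" /prop FailoverPeriod=6
    cluster group "EVS1" /prop AutoFailbackType=0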

Backup Virtual Server Properties

This section describes how OTG has configured the four backup virtual servers in the server cluster. For more information about how the backup process works, see How OTG Backs Up the Server Cluster later in this document.

The following setting appears on the General tab that is available when you configure or view the properties for the backup virtual servers.

Configuration setting Explanation

There are no preferred owners for this resource group.

OTG has created various scheduled tasks that control the movement of the backup virtual servers between nodes. The scheduled tasks force the backup virtual servers to move between the passive and active nodes, depending on the stage of the backup process that is being completed. The movement of the backup virtual servers between nodes is also determined by the possible owner settings of their resources.

The following settings appear on the Failover tab that is available when you configure or view the properties for the backup virtual servers.

Configuration setting Explanation

Threshold set to 10

Period set to 6

This resource group is allowed to fail 10 times within 6 hours before it is left in a failed state. The 11th time a resource in the group fails, the Cluster service fails all other resources in the group and leaves the entire group offline instead of failing over the group.

The following setting appears on the Failback tab that is available when you configure or view the properties for the backup virtual servers.

Configuration setting Explanation

Prevent Failback is checked.

This resource group will not fail back when a failed node returns to operation. Instead, it will continue to run on the alternate node after the failed node comes back online. This requires administrator intervention to move the group back to the original, preferred node. OTG prefers this method because the administrator has complete control over when the failback occurs and can coordinate this to make sure that users experience the least amount of downtime.

Cluster Virtual Server Properties

This section describes how OTG has configured the cluster virtual server in the server cluster. The cluster virtual server is the owner of the cluster quorum resource. The quorum resource maintains the configuration data that is necessary for recovery of the cluster. This data contains details of all of the changes that have been applied to the cluster database.

The following setting appears on the General tab that is available when you configure or view properties for the cluster virtual server.

Configuration setting Explanation

Preferred owners:

  • Primary passive node

This resource group will first attempt to fail over to the primary passive node because it is best equipped to host the group. If this node is not available, this group will fail over to a randomly selected node (this also depends on the possible owner settings of its resources).

The following settings appear on the Failover tab that is available when you configure or view the properties for the cluster virtual server.

Configuration setting Explanation

Threshold set to 10

Period set to 6

This resource group is allowed to fail 10 times within 6 hours before it is left in a failed state. The 11th time a resource in the group fails, the Cluster service fails all other resources in the group and leaves the entire group offline instead of failing over the group.

The following settings appear on the Failback tab that is available when you configure or view the properties for the cluster virtual server.

Configuration setting Explanation

Prevent Failback is checked.

This resource group will not fail back when a failed node returns to operation. Instead, it will continue to run on the alternate node after the failed node comes back online. This requires administrator intervention to move the group back to the original, preferred node. OTG prefers this method because the administrator has complete control over when the failback occurs and can coordinate this to make sure that users experience the least amount of downtime.

Resource Settings

This section describes how each of the resources used in the server cluster is configured.

All resource settings in the server cluster are the default values with the following exceptions. These exceptions appear on the Advanced tab that is available when you configure or view the properties for the resource:

  • Restart is not enabled for the Exchange Information Store and the Exchange System Attendant resources.

  • Restart is enabled for all other resources (default) but the Affect the Group option is disabled.

For more information regarding why these changes were made, see each specific setting below.

For more information about cluster resources, see the information about standard resource types at the Microsoft Web site (https://go.microsoft.com/fwlink/?LinkId=19502).

For more information about Exchange resources, see Product Documentation for Exchange at the Microsoft Web site (https://go.microsoft.com/fwlink/?LinkId=116208).

Note

Each resource of a specific type is configured identically unless otherwise noted.

IP Address Resource

The following settings appear on the General tab that is available when you configure or view the IP Address resource properties.

Configuration setting Explanation

All nodes are possible owners.

This resource can run on any of the available nodes, which allows failover to occur. During failover, this resource will fail over to a node based on the preferred owner settings of the resource group it belongs to.

Run as a separate Resource Monitor is not checked.

This resource does not run on a separate memory space.

The following setting appears on the Dependencies tab that is available when you configure or view the IP Address resource properties.

Configuration setting Explanation

This resource has no Resource Dependencies.

The Cluster service uses the dependencies list when bringing resources online and offline. This resource does not require another resource to operate.

The following settings appear on the Advanced tab that is available when you configure or view the IP Address resource properties.

Configuration setting Explanation

Restart is enabled, but the Affect the group check box is not selected (with a threshold of 3 and a period of 900 seconds).

This resource restarts after a resource failure, but the entire resource group will not fail over when the resource fails. The Cluster service will attempt to restart the resource 3 times within 900 seconds before it assigns the resource to Offline or Failed status.

Important
Resources that restart after a resource failure are not configured to Affect the group. This is not a default setting. It allows OTG to track and fix any problems within the cluster; if a resource failure occurs, OTG administrators can notify the Microsoft developers of possible bugs in the code.

This setting also prevents a cluster resource group from alternating between nodes when a resource fails, which would result in significant instability. That problem can occur when a failed disk has default properties that cause a series of bus resets; the resource group then attempts to acquire a reservation on a disk that does not exist.

This configuration is only suitable for users with strong monitoring systems who want to control the location of their resource groups.
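These restart settings map to the RestartAction, RestartThreshold, and RestartPeriod common properties of a resource and can be scripted with Cluster.exe. The following is a minimal sketch using a hypothetical resource name; RestartAction=1 means restart without affecting the group, and RestartPeriod is expressed in milliseconds (900 seconds = 900,000 milliseconds):

    cluster resource "EVS1 IP Address" /prop RestartAction=1
    cluster resource "EVS1 IP Address" /prop RestartThreshold=3
    cluster resource "EVS1 IP Address" /prop RestartPeriod=900000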

Looks Alive poll interval:

  • resource type value

The Cluster service performs a cursory check at the default interval for this resource type to determine whether the resource appears to be online.

Is Alive poll interval:

  • resource type value

The Cluster service performs an in-depth check at the default interval for this resource type to determine whether the resource is online.

Pending timeout is set at 180 seconds.

This resource has 180 seconds when in a pending state (Online Pending or Offline Pending) to resolve its status, or at least report to the Resource Monitor that it is making progress, before the Cluster service will put the resource in Failed status.

The following settings appear on the Parameters tab that is available when you configure or view the IP Address resource properties.

Configuration setting Explanation

Address:

XXX.XX.XX.XXX

This parameter specifies the unique IP address. This address is a static IP address that is not already in use on the network; it was obtained from the network administrator.

Subnet mask:

XXX.XXX.XXX.XXX

This parameter denotes the subnet mask for the specified IP address. It is the same subnet mask as the one for the associated cluster network; it was obtained from the network administrator.

Network:

Internal + Public

The mixed communication adapter is being used. For more information, see Network Adapters.

Enable NetBIOS for this address is checked.

NetBIOS is enabled for the specified IP address. This allows a dependent network name to be published using NetBIOS.
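The values on the Parameters tab correspond to the private properties of the IP Address resource and can be set with Cluster.exe. The following sketch uses placeholder values (192.0.2.10 is a documentation-range address, not OTG's actual address):

    cluster resource "EVS1 IP Address" /priv Address=192.0.2.10
    cluster resource "EVS1 IP Address" /priv SubnetMask=255.255.255.0
    cluster resource "EVS1 IP Address" /priv Network="Internal + Public"
    cluster resource "EVS1 IP Address" /priv EnableNetBIOS=1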

Network Name Resource

The following settings appear on the General tab that is available when you configure or view the Network Name resource properties.

Configuration setting Explanation

All nodes are possible owners.

This resource can run on any of the available nodes, which allows failover to occur. During failover, this resource will fail over to a node based on the preferred owner settings of the resource group it belongs to.

Run as a separate Resource Monitor is not checked.

This resource does not run on a separate memory space.

The following setting appears on the Dependencies tab that is available when you configure or view the Network Name resource properties.

Configuration setting Explanation

Resource Dependencies:

  • IP Address Resource

The Cluster service uses the dependencies list when bringing resources online and offline. A dependent resource is one that requires another resource to operate. The Network Name resource is dependent on the IP Address resource that corresponds to the name.
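A dependency such as this one can also be added from the command line. The resource names below are hypothetical placeholders:

    cluster resource "EVS1 Network Name" /adddep:"EVS1 IP Address"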

The following settings appear on the Advanced tab that is available when you configure or view the Network Name resource properties.

Configuration setting Explanation

Restart is enabled, but the Affect the group check box is not selected (with a threshold of 3 and a period of 900 seconds).

This resource restarts after a resource failure, but the entire resource group will not fail over when the resource fails. The Cluster service will attempt to restart the resource 3 times within 900 seconds before it assigns the resource to Offline or Failed status.

Important
Resources that restart after a resource failure are not configured to Affect the group. This is not a default setting. It allows OTG to track and fix any problems within the cluster; if a resource failure occurs, OTG administrators can notify the Microsoft developers of possible bugs in the code.

This setting also prevents a cluster resource group from alternating between nodes when a resource fails, which would result in significant instability. That problem can occur when a failed disk has default properties that cause a series of bus resets; the resource group then attempts to acquire a reservation on a disk that does not exist.

This configuration is only suitable for users with strong monitoring systems who want to control the location of their resource groups.

Looks Alive poll interval:

  • resource type value

The Cluster service performs a cursory check at the default interval for this resource type to determine whether the resource appears to be online.

Is Alive poll interval:

  • resource type value

The Cluster service performs an in-depth check at the default interval for this resource type to determine whether the resource is online.

Pending timeout is set at 180 seconds.

This resource has 180 seconds when in a pending state (Online Pending or Offline Pending) to resolve its status, or at least report to the Resource Monitor that it is making progress, before the Cluster service will put the resource in Failed status.

The following settings appear on the Parameters tab that is available when you configure or view the Network Name resource properties.

Configuration setting Explanation  

DNS Registration must succeed.

Exchange virtual server:

Is not checked.

Domain Name System registration failures are tolerated for this Network Name resource.

 

Backup virtual server:

Is not checked.

Domain Name System registration failures are tolerated for this Network Name resource.

 

Cluster virtual server:

Is not checked.

Domain Name System registration failures are tolerated for this Network Name resource.

Enable Kerberos Authentication

Exchange virtual server:

Is checked.

Clients connecting to the network name represented by this resource can use Kerberos authentication if they choose to. The System Attendant will not come online unless Kerberos is enabled on the Network Name resource for the Exchange virtual server.

 

Backup virtual server:

Is not checked.

Clients connecting to the network name represented by this resource can only use NTLM authentication. This is the default configuration.

 

Cluster virtual server:

Is not checked.

Clients connecting to the network name represented by this resource can only use NTLM authentication. This is the default configuration.
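Both check boxes correspond to private properties of the Network Name resource: RequireDNS (DNS registration must succeed) and RequireKerberos (enable Kerberos authentication). The following sketch shows how the Exchange virtual server's network name might be configured, using a hypothetical resource name:

    cluster resource "EVS1 Network Name" /priv RequireDNS=0
    cluster resource "EVS1 Network Name" /priv RequireKerberos=1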

Physical Disk Resource

The following settings appear on the General tab that is available when you configure or view the Physical Disk resource properties.

Configuration setting Explanation

All nodes are possible owners.

This resource can run on any of the available nodes, which allows failover to occur. During failover, this resource will fail over to a node based on the preferred owner settings of the resource group it belongs to.

Run as a separate Resource Monitor is not checked.

This resource does not run on a separate memory space.

The following settings appear on the Dependencies tab that is available when you configure or view the Physical Disk resource properties.

Configuration setting Explanation  

Resource Dependencies

Exchange virtual server:

Dependencies vary.

The Cluster service uses the dependencies list when bringing resources online and offline. This resource can have varying dependencies based on the following:

  • Mount point disks are dependent on their parent drive.

  • Log drives are dependent on the data drives.

  • Data drives have no dependencies.

 

Backup virtual server:

Dependencies vary.

The Cluster service uses the dependencies list when bringing resources online and offline. This resource can have varying dependencies, which enable the transfer of data during backup.

 

Cluster virtual server:

Has no dependencies.

The Cluster service uses the dependencies list when bringing resources online and offline. This resource does not require another resource to operate.

The following settings appear on the Advanced tab that is available when you configure or view the Physical Disk resource properties.

Configuration setting Explanation

Restart is enabled, but the Affect the group check box is not selected (with a threshold of 3 and a period of 900 seconds).

This resource restarts after a resource failure, but the entire resource group will not fail over when the resource fails. The Cluster service will attempt to restart the resource 3 times within 900 seconds before it assigns the resource to Offline or Failed status.

Important
Resources that restart after a resource failure are not configured to Affect the group. This is not a default setting. It allows OTG to track and fix any problems within the cluster; if a resource failure occurs, OTG administrators can notify the Microsoft developers of possible bugs in the code.

This setting also prevents a cluster resource group from alternating between nodes when a resource fails, which would result in significant instability. That problem can occur when a failed disk has default properties that cause a series of bus resets; the resource group then attempts to acquire a reservation on a disk that does not exist.

This configuration is only suitable for users with strong monitoring systems who want to control the location of their resource groups.

Looks Alive poll interval:

  • resource type value

The Cluster service performs a cursory check at the default interval for this resource type to determine whether the resource appears to be online.

Is Alive poll interval:

  • resource type value

The Cluster service performs an in-depth check at the default interval for this resource type to determine whether the resource is online.

Pending timeout is set at 180 seconds.

This resource has 180 seconds when in a pending state (Online Pending or Offline Pending) to resolve its status, or at least report to the Resource Monitor that it is making progress, before the Cluster service will put the resource in Failed status.

File Share Resource

The following settings appear on the General tab that is available when you configure or view the File Share resource properties.

Configuration setting Explanation

All nodes are possible owners.

This resource can run on any of the available nodes, which allows failover to occur. During failover, this resource will fail over to a node based on the preferred owner settings of the resource group it belongs to.

Run as a separate Resource Monitor is not checked.

This resource does not run on a separate memory space.

The following setting appears on the Dependencies tab that is available when you configure or view the File Share resource properties.

Configuration setting Explanation

Resource Dependencies vary.

The Cluster service uses the dependencies list when bringing resources online and offline. This resource can have varying dependencies which enable the transfer of data during backup.

The following settings appear on the Advanced tab that is available when you configure or view the File Share resource properties.

Configuration setting Explanation

Restart is enabled, but the Affect the group check box is not selected (with a threshold of 3 and a period of 900 seconds).

This resource restarts after a resource failure, but the entire resource group will not fail over when the resource fails. The Cluster service will attempt to restart the resource 3 times within 900 seconds before it assigns the resource to Offline or Failed status.

Important
Resources that restart after a resource failure are not configured to Affect the group. This is not a default setting. It allows OTG to track and fix any problems within the cluster; if a resource failure occurs, OTG administrators can notify the Microsoft developers of possible bugs in the code.

This setting also prevents a cluster resource group from alternating between nodes when a resource fails, which would result in significant instability. That problem can occur when a failed disk has default properties that cause a series of bus resets; the resource group then attempts to acquire a reservation on a disk that does not exist.

This configuration is only suitable for users with strong monitoring systems who want to control the location of their resource groups.

Looks Alive poll interval:

  • resource type value

The Cluster service performs a cursory check at the default interval for this resource type to determine whether the resource appears to be online.

Is Alive poll interval:

  • resource type value

The Cluster service performs an in-depth check at the default interval for this resource type to determine whether the resource is online.

Pending timeout is set at 180 seconds.

This resource has 180 seconds when in a pending state (Online Pending or Offline Pending) to resolve its status, or at least report to the Resource Monitor that it is making progress, before the Cluster service will put the resource in Failed status.

The following setting appears on the Parameters tab that is available when you configure or view the File Share resource properties.

Configuration setting Explanation

The maximum User Limit is allowed.

OTG has not limited the number of simultaneous users.

Majority Node Set Resource

The following settings appear on the General tab that is available when you configure or view the Majority Node Set resource properties.

Configuration setting Explanation

All nodes are possible owners.

This resource can run on any of the available nodes, which allows failover to occur. During failover, this resource will fail over to a node based on the preferred owner settings of the resource group it belongs to.

Run as a separate Resource Monitor is not checked.

This resource does not run on a separate memory space.

The following setting appears on the Dependencies tab that is available when you configure or view the Majority Node Set resource properties.

Configuration setting Explanation

This resource has no Resource Dependencies.

The Cluster service uses the dependencies list when bringing resources online and offline. This resource does not require another resource to operate.

The following settings appear on the Advanced tab that is available when you configure or view the Majority Node Set resource properties.

Configuration setting Explanation

Restart is enabled, but the Affect the group check box is not selected (with a threshold of 3 and a period of 900 seconds).

This resource restarts after a resource failure, but the entire resource group will not fail over when the resource fails. The Cluster service will attempt to restart the resource 3 times within 900 seconds before it assigns the resource to Offline or Failed status.

Important
Resources that restart after a resource failure are not configured to Affect the group. This is not a default setting. It allows OTG to track and fix any problems within the cluster; if a resource failure occurs, OTG administrators can notify the Microsoft developers of possible bugs in the code.

This setting also prevents a cluster resource group from alternating between nodes when a resource fails, which would result in significant instability. That problem can occur when a failed disk has default properties that cause a series of bus resets; the resource group then attempts to acquire a reservation on a disk that does not exist.

This configuration is only suitable for users with strong monitoring systems who want to control the location of their resource groups.

Looks Alive poll interval:

  • resource type value

The Cluster service performs a cursory check at the default interval for this resource type to determine whether the resource appears to be online.

Is Alive poll interval:

  • resource type value

The Cluster service performs an in-depth check at the default interval for this resource type to determine whether the resource is online.

Pending timeout is set at 180 seconds.

This resource has 180 seconds when in a pending state (Online Pending or Offline Pending) to resolve its status, or at least report to the Resource Monitor that it is making progress, before the Cluster service will put the resource in Failed status.

Distributed Transaction Coordinator Resource

The following settings appear on the General tab that is available when you configure or view the Distributed Transaction Coordinator resource properties.

Note

OTG has implemented a Distributed Transaction Coordinator resource because Exchange 2003 requires this resource in order to operate.

Configuration setting Explanation

All nodes are possible owners.

This resource can run on any of the available nodes, which allows failover to occur. During failover, this resource will fail over to a node based on the preferred owner settings of the resource group it belongs to.

Run as a separate Resource Monitor is not checked.

This resource does not run on a separate memory space.

The following setting appears on the Dependencies tab that is available when you configure or view the Distributed Transaction Coordinator resource properties.

Configuration setting Explanation

Resource Dependencies:

  • Network Name

  • Quorum Physical Disk

The Cluster service uses the dependencies list when bringing resources online and offline. This resource requires the Network Name resource and the Quorum physical disk resource in order to operate.

The following settings appear on the Advanced tab that is available when you configure or view the Distributed Transaction Coordinator resource properties.

Configuration setting Explanation

Restart is enabled, but the Affect the group check box is not selected (with a threshold of 3 and a period of 900 seconds).

This resource restarts after a resource failure, but the entire resource group will not fail over when the resource fails. The Cluster service will attempt to restart the resource 3 times within 900 seconds before it assigns the resource to Offline or Failed status.

Important
Resources that restart after a resource failure are not configured to Affect the group. This is not a default setting. It allows OTG to track and fix any problems within the cluster; if a resource failure occurs, OTG administrators can notify the Microsoft developers of possible bugs in the code.

This setting also prevents a cluster resource group from alternating between nodes when a resource fails, which would result in significant instability. That problem can occur when a failed disk has default properties that cause a series of bus resets; the resource group then attempts to acquire a reservation on a disk that does not exist.

This configuration is only suitable for users with strong monitoring systems who want to control the location of their resource groups.

Looks Alive poll interval:

  • resource type value

The Cluster service performs a cursory check at the default interval for this resource type to determine whether the resource appears to be online.

Is Alive poll interval:

  • resource type value

The Cluster service performs an in-depth check at the default interval for this resource type to determine whether the resource is online.

Pending timeout is set at 180 seconds.

This resource has 180 seconds when in a pending state (Online Pending or Offline Pending) to resolve its status, or at least report to the Resource Monitor that it is making progress, before the Cluster service will put the resource in Failed status.

Exchange Resources

This section describes how OTG has configured the following Exchange resource properties. The Exchange System Attendant resource and Exchange Information Store resource have different configurations, which are described at the end of this section.

  • Exchange Message Transfer Agent instance

  • Exchange Routing Service instance

  • Exchange SMTP virtual server instance

  • Exchange HTTP virtual server instance

  • Exchange IMAP4 virtual server instance

  • Exchange POP3 virtual server instance

  • Search Service instance

Note

After the Exchange System Attendant resource is created, the cluster automatically generates all other Exchange resources.

The following settings appear on the General tab that is available when you configure or view the various Exchange resource properties.

Configuration setting Explanation

All nodes are possible owners.

This resource can run on any of the available nodes, which allows failover to occur. During failover, this resource will fail over to a node based on the preferred owner settings of the resource group it belongs to.

Run as a separate Resource Monitor is not checked.

These resources do not run on a separate memory space.

The following setting appears on the Dependencies tab that is available when you configure or view the various Exchange resource properties.

Configuration setting Explanation

Resource Dependencies:

  • System Attendant

The Cluster service uses the dependencies list when bringing resources online and offline. These resources require the System Attendant resource in order to operate.

The following settings appear on the Advanced tab that is available when you configure or view any of the various Exchange resource properties.

Configuration setting Explanation

Restart is enabled, but the Affect the group check box is not selected (with a threshold of 3 and a period of 900 seconds).

This resource restarts after a resource failure, but the entire resource group will not fail over when the resource fails. The Cluster service will attempt to restart the resource 3 times within 900 seconds before it assigns the resource to Offline or Failed status.

Important
Resources that restart after a resource failure are not configured to Affect the group. This is not a default setting. It allows OTG to track and fix any problems within the cluster; if a resource failure occurs, OTG administrators can notify the Microsoft developers of possible bugs in the code.

This setting also prevents a cluster resource group from alternating between nodes when a resource fails, which would result in significant instability. That problem can occur when a failed disk has default properties that cause a series of bus resets; the resource group then attempts to acquire a reservation on a disk that does not exist.

This configuration is only suitable for users with strong monitoring systems who want to control the location of their resource groups.

Looks Alive poll interval:

  • resource type value

The Cluster service performs a cursory check at the default interval for this resource type to determine whether the resource appears to be online.

Is Alive poll interval:

  • resource type value

The Cluster service performs an in-depth check at the default interval for this resource type to determine whether the resource is online.

Pending timeout is set at 180 seconds.

This resource has 180 seconds when in a pending state (Online Pending or Offline Pending) to resolve its status, or at least report to the Resource Monitor that it is making progress, before the Cluster service will put the resource in Failed status.

The following settings appear on the General tab that is available when you configure or view the Exchange System Attendant resource properties.

Configuration setting Explanation

Possible Owners:

  • Active node 1

  • Primary passive node

  • Alternate passive node 1

  • Alternate passive node 2

This resource drives the failover path for the Exchange virtual server because it can run only on the specified nodes. If the active node fails, the resource group fails over to the primary passive node. If the primary passive node is unavailable, the group fails over to one of the alternate passive nodes. This ensures that the resource group can become active only on its corresponding active node and the passive nodes.

Run as a separate Resource Monitor is not checked.

This resource does not run on a separate memory space.
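A restricted possible-owners list such as the one above can be reviewed and maintained with Cluster.exe. The resource and node names below are hypothetical placeholders:

    cluster resource "EVS1 System Attendant" /listowners
    cluster resource "EVS1 System Attendant" /addowner:AltPassive2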

The following setting appears on the Dependencies tab that is available when you configure or view the Exchange System Attendant resource properties.

Configuration setting Explanation

Resource Dependencies:

  • 5 physical disks

  • Network Name

The Cluster service uses the dependencies list when bringing resources online and offline. This System Attendant resource is dependent on five mount point drives (which are dependent on their parent drives) and on the Network Name resource. The System Attendant resource requires these resources in order to operate.

The following settings appear on the Advanced tab that is available when you configure or view the Exchange System Attendant resource properties.

Configuration setting Explanation

Restart is not enabled.

This resource does not restart after a resource failure. This allows OTG to track and fix any problems within the cluster. If a resource failure occurs, this setting allows OTG administrators to notify the Microsoft developers of possible bugs in the code.

This setting is specific to OTG as it allows for more precise fault isolation that can then be reported to the server cluster developers. It is not recommended for typical customer deployments.

Looks Alive poll interval:

  • resource type value

The Cluster service performs a cursory check at the default interval for this resource type to determine whether the resource appears to be online.

Is Alive poll interval:

  • resource type value

The Cluster service performs an in-depth check at the default interval for this resource type to determine whether the resource is online.

Pending timeout is set at 180 seconds.

This resource has 180 seconds when in a pending state (Online Pending or Offline Pending) to resolve its status, or at least report to the Resource Monitor that it is making progress, before the Cluster service will put the resource in Failed status.

The following settings appear on the General tab that is available when you configure or view the Exchange Information Store resource properties.

Configuration setting Explanation

All nodes are possible owners.

This resource can run on any of the available nodes, which allows failover to occur. During failover, this resource will fail over to a node based on the preferred owner settings of the resource group it belongs to.

Run as a separate Resource Monitor is not checked.

This resource does not run on a separate memory space.

The following setting appears on the Dependencies tab that is available when you configure or view the Exchange Information Store resource properties.

Configuration setting Explanation

Resource Dependencies:

System Attendant

The Cluster service uses the dependencies list when bringing resources online and offline. This resource requires the System Attendant resource in order to operate.

The following settings appear on the Advanced tab that is available when you configure or view the Exchange Information Store Resource Properties.

Configuration setting Explanation

Restart is not enabled.

This resource does not restart after a resource failure. This setting is specific to OTG as it allows for more precise fault isolation that can then be reported to the server cluster developers. It is not recommended for typical customer deployments.

Looks Alive poll interval:

  • resource type value

The Cluster service performs a cursory check at the default interval for this resource type to determine whether the resource appears to be online.

Is Alive poll interval:

  • resource type value

The Cluster service performs an in-depth check at the default interval for this resource type to determine whether the resource is online.

Pending timeout is set at 180 seconds.

This resource has 180 seconds when in a pending state (Online Pending or Offline Pending) to resolve its status, or at least report to the Resource Monitor that it is making progress, before the Cluster service will put the resource in Failed status.