High Availability and Site Resilience
Applies to: Exchange Server 2013
Topic Last Modified: 2013-09-18
Mailbox databases and the data they contain are one of the most critical components (perhaps the most critical component) of any Exchange organization. In Microsoft Exchange Server 2013, you can protect mailbox databases and the data they contain by configuring your mailbox databases for high availability and site resilience. Exchange 2013 reduces the cost and complexity of deploying a highly available and resilient messaging solution while providing higher levels of end-to-end availability and supporting large mailboxes. Building on the native replication capabilities and high availability architecture in Exchange 2010, Exchange 2013 enables customers of all sizes and in all segments to economically deploy a messaging continuity service in their organization.
Looking for management tasks related to high availability and site resilience? Check out Managing High Availability and Site Resilience.
The following key terms are important for understanding high availability and site resilience:
- Active Manager
An internal Exchange component which runs inside the Microsoft Exchange Replication service that's responsible for failure monitoring and corrective action through failover within a database availability group (DAG).
- AutoDatabaseMountDial
A property setting of a Mailbox server that determines whether a passive database copy will automatically mount as the new active copy, based on the number of log files missing from the copy being mounted.
- Continuous replication - block mode
In block mode, as each update is written to the active database copy's active log buffer, it's also shipped to a log buffer on each of the passive mailbox database copies. When the log buffer is full, each database copy builds, inspects, and creates the next log file in the generation sequence.
- Continuous replication - file mode
In file mode, closed transaction log files are pushed from the active database copy to one or more passive database copies.
- Database availability group
A group of up to 16 Exchange 2013 Mailbox servers that hosts a set of replicated databases.
- Database mobility
The ability of an Exchange 2013 mailbox database to be replicated to and mounted on other Exchange 2013 Mailbox servers.
- Datacenter
Typically this refers to an Active Directory site; however, it can also refer to a physical site. In the context of this documentation, datacenter equals Active Directory site.
- Datacenter Activation Coordination mode
A property setting of the DAG that, when enabled, forces the Microsoft Exchange Replication service to acquire permission to mount databases at startup.
- Disaster recovery
Any process used to manually recover from a failure. This can be a failure that affects a single item, or it can be a failure that affects an entire physical location.
- Exchange third-party replication API
An Exchange-provided API that enables use of third-party synchronous replication for a DAG instead of continuous replication.
- High availability
A solution that provides service availability, data availability, and automatic recovery from failures that affect the service or data (such as a network, storage, or server failure).
- Incremental deployment
The ability to deploy high availability and site resilience after Exchange 2013 is installed.
- Lagged mailbox database copy
A passive mailbox database copy that has a log replay lag time greater than zero.
- Mailbox database copy
A mailbox database (.edb file and logs), which is either active or passive.
- Mailbox resiliency
The name of a unified high availability and site resilience solution in Exchange 2013.
- Managed availability
A set of internal processes made up of probes, monitors, and responders that incorporate monitoring and high availability across all server roles and all protocols.
- *over (pronounced "star over")
Short for switchovers and failovers. A switchover is a manual activation of one or more database copies. A failover is an automatic activation of one or more database copies after a failure.
- Safety Net
Formerly known as transport dumpster, this is a feature of the transport service that stores a copy of all messages for a configurable number of days. The default setting is 2 days.
- Shadow redundancy
A transport server feature that provides redundancy for messages for the entire time they're in transit.
- Site resilience
A configuration that extends the messaging infrastructure to multiple Active Directory sites to provide operational continuity for the messaging system in the event of a failure affecting one of the sites.
A DAG is the base component of the high availability and site resilience framework built into Exchange 2013. A DAG is a group of up to 16 Mailbox servers that hosts a set of databases and provides automatic, database-level recovery from failures that affect individual databases, networks, or servers. Any server in a DAG can host a copy of a mailbox database from any other server in the DAG. When a server is added to a DAG, it works with the other servers in the DAG to provide automatic recovery from failures that affect mailbox databases, such as a disk failure or server failure. For more information about DAGs, see Database Availability Groups.
The high availability and site resilience features first introduced in Exchange 2010 are used in Exchange 2013 to create and maintain database copies. Exchange 2013 also leverages the concept of database mobility, which is Exchange-managed database-level failovers.
Database mobility disconnects databases from servers and adds support for up to 16 copies of a single database. It also provides a native experience for creating copies of a database.
Setting a database copy as the active mailbox database is known as a switchover. When a failure affecting a database or access to a database occurs and a different database copy becomes the active copy, this process is known as a failover. This process also refers to a server failure in which one or more servers bring online the databases previously online on the failed server. When either a switchover or failover occurs, other Exchange 2013 servers become aware of the change almost immediately and redirect client and messaging traffic to the new active database.
For example, if an active database in a DAG fails because of an underlying storage failure, Active Manager will automatically recover by failing over to a database copy on another Mailbox server in the DAG. In Exchange 2013, managed availability adds new behaviors to recover from loss of protocol access to a database, including recycling application worker pools, restarting services and servers, and initiating database failovers.
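As a minimal sketch of the switchover described above, the following Exchange Management Shell commands move the active copy of a database to another DAG member and then check the resulting copy status. The database name DB1 and the server name MBX2 are hypothetical placeholders; substitute names from your own organization.

```powershell
# Assumption: DB1 and MBX2 are placeholder names for a replicated database
# and a DAG member that hosts a healthy passive copy of it.

# Manually activate the copy of DB1 that is hosted on MBX2 (a switchover).
Move-ActiveMailboxDatabase -Identity DB1 -ActivateOnServer MBX2

# Review the state of every copy of DB1 after the move.
Get-MailboxDatabaseCopyStatus -Identity DB1 |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength -AutoSize
```

Omitting the ActivateOnServer parameter requests a targetless switchover, in which Active Manager chooses the copy to activate.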
For more information about mailbox database copies, see Mailbox Database Copies.
Exchange 2013 uses DAGs and mailbox database copies, along with other features such as single item recovery, retention policies, and lagged database copies, to provide high availability, site resilience, and Exchange native data protection. The high availability platform, Exchange Information Store and Extensible Storage Engine (ESE) have all been enhanced to provide greater availability and easier management, and to reduce costs. These enhancements include:
- Reduction in IOPS over Exchange 2010: This enables you to leverage larger disks in terms of capacity and IOPS as efficiently as possible.
- Managed availability: With managed availability, internal monitoring and recovery-oriented features are tightly integrated to help prevent failures, proactively restore services, and initiate server failovers automatically or alert administrators to take action. The focus is on monitoring and managing the end-user experience rather than just server and component uptime to help keep the service continuously available.
- Managed Store: The Managed Store is the name of the newly rewritten Information Store processes in Exchange 2013. The new Managed Store is written in C# and tightly integrated with the Microsoft Exchange Replication service (MSExchangeRepl.exe) to provide higher availability through improved resiliency.
- Support for multiple databases per disk: Exchange 2013 includes enhancements that enable you to support multiple databases (mixtures of active and passive copies) on the same disk, thereby leveraging larger disks in terms of capacity and IOPS as efficiently as possible.
- Automatic reseed: Enables you to quickly restore database redundancy after disk failure. If a disk fails, the database copy stored on that disk is copied from the active database copy to a spare disk on the same server. If multiple database copies were stored on the failed disk, they can all be automatically reseeded on a spare disk. This enables faster reseeds, as the active databases are likely to be on multiple servers and the data is copied in parallel.
- Automatic recovery from storage failures: This feature continues the innovation introduced in Exchange 2010 to allow the system to recover from failures that affect resiliency or redundancy. In addition to the Exchange 2010 bugcheck behaviors, Exchange 2013 includes additional recovery behaviors for long I/O times, excessive memory consumption by MSExchangeRepl.exe, and severe cases where the system is in such a bad state that threads can't be scheduled.
- Lagged copy enhancements: Lagged copies can now care for themselves to a certain extent using automatic log play down. Lagged copies will automatically play down log files in a variety of situations, such as page patching and low disk space scenarios. If the system detects that page patching is required for a lagged copy, the logs will be automatically replayed into the lagged copy to perform page patching. Lagged copies will also invoke this auto replay feature when a low disk space threshold has been reached, and when the lagged copy has been detected as the only available copy for a specific period of time. In addition, lagged copies can leverage Safety Net, making recovery or activation much easier.
- Single copy alert enhancements: The single copy alert introduced in Exchange 2010 is no longer a separate scheduled script. It's now integrated into the managed availability components within the system and is a native function within Exchange.
- DAG network auto-configuration: DAG networks can be automatically configured by the system based on configuration settings. In addition to manual configuration options, DAGs can also distinguish between MAPI and replication networks and configure DAG networks automatically.
In Exchange 2010, passive database copies have a very low checkpoint depth, which is required for fast failover. In addition, the passive copy performs aggressive pre-reading of data to keep up with a 5-megabyte (MB) checkpoint depth. As a result of using a low checkpoint depth and performing these aggressive pre-read operations, IOPS for a passive database copy was equal to IOPS for an active copy in Exchange 2010.
In Exchange 2013, the system is able to provide fast failover while using a high checkpoint depth on the passive copy (100 MB). Because passive copies have 100-MB checkpoint depth, they've been de-tuned to no longer be so aggressive. As a result of increasing the checkpoint depth and de-tuning the aggressive pre-reads, IOPS for a passive copy is about 50 percent of the active copy IOPS in Exchange 2013.
Having a higher checkpoint depth on the passive copy also results in other changes. On failover in Exchange 2010, the database cache is flushed as the database is converted from a passive copy to an active copy. In Exchange 2013, ESE logging was rewritten so that the cache is persisted through the transition from passive to active. Because ESE doesn't need to flush the cache, you get fast failover.
One other change was made to the background database maintenance (BDM) process. BDM now processes around 1-2 MB per second per copy.
As a result of these changes, Exchange 2013 provides a 50 percent reduction in IOPS over Exchange 2010.
Managed availability is the integration of built-in, active monitoring and the Exchange 2013 high availability platform. With managed availability, the system can determine when to fail over a database based on service health. Managed availability is an internal infrastructure that's deployed on the Client Access and Mailbox server roles in Exchange 2013.
Managed availability includes three main asynchronous components that are constantly doing work. The first component is the probe engine, which is responsible for taking measurements on the server and collecting data. The results of those measurements flow into the second component, the monitor. The monitor contains the business logic that the system uses to decide, from the collected data, what is considered healthy. Similar to a pattern recognition engine, the monitor looks for patterns across all the collected measurements and then decides whether something is healthy.
Finally, there is the responder engine, which is responsible for recovery actions. When something is unhealthy, the first action is to attempt to recover that component. This could include multi-stage recovery actions; for example, the first attempt may be to restart the application pool, the second may be to restart the service, the third may be to restart the server, and the subsequent attempt may be to take the server offline so that it no longer accepts traffic. If the recovery actions are unsuccessful, the system escalates the issue to a human through event log notifications.
Managed availability is implemented in the form of two services:
- Exchange Health Manager Service (MSExchangeHMHost.exe) This is a controller process that's used to manage worker processes. It's used to build, execute, and start and stop the worker process as needed. It's also used to recover the worker process in case that process crashes, to prevent the worker process from being a single point of failure.
- Exchange Health Manager Worker process (MSExchangeHMWorker.exe) This is the worker process that's responsible for performing the runtime tasks.
Managed availability uses persistent storage to perform its functions:
- XML configuration files are used to initialize the work item definitions during startup of the worker process.
- The registry is used to store runtime data, such as bookmarks.
- The crimson channel event log infrastructure is used to store the work item results.
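The health state that managed availability computes, and the work item results it stores, can be inspected from the Exchange Management Shell. The following is a sketch that assumes a hypothetical server named MBX1. The crimson channel name in the last command is the probe result channel as it commonly appears on Exchange 2013 servers; treat it as an assumption and confirm it in Event Viewer on your own systems.

```powershell
# Assumption: MBX1 is a placeholder for an Exchange 2013 server name.

# Confirm that the Health Manager service is running.
Get-Service -Name MSExchangeHM -ComputerName MBX1

# Summarize the health sets on the server and show those that aren't healthy.
Get-HealthReport -Identity MBX1 |
    Where-Object { $_.AlertValue -ne 'Healthy' }

# List the individual monitors that aren't healthy.
Get-ServerHealth -Identity MBX1 |
    Where-Object { $_.AlertValue -ne 'Healthy' } |
    Format-Table Name, HealthSetName, AlertValue -AutoSize

# Read a few recent work item results from a crimson channel.
Get-WinEvent -ComputerName MBX1 -LogName 'Microsoft-Exchange-ActiveMonitoring/ProbeResult' -MaxEvents 5
```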
For more information about managed availability, see Monitoring Database Availability Groups.
Although the storage improvements in Exchange 2013 are designed primarily for just a bunch of disks (JBOD) configurations, they're available for use by all supported storage configurations. One such feature is the ability to host multiple databases on the same volume. This feature is about Exchange optimizing for large disks. These optimizations result in a much more efficient use of large disks in terms of capacity, IOPS, and reseed times, and they're meant to address the challenges associated with running in a JBOD storage configuration:
- Database sizes must be manageable.
- Reseed operations must be fast and reliable.
- Although storage capacity is increasing, IOPS aren't.
- Disks hosting passive database copies are underutilized in terms of IOPS.
- Lagged copies have asymmetric storage requirements.
- Limited agility exists to recover from low disk space conditions.
The trend of increasing storage capacity is continuing, with 8-terabyte drives expected to be available soon. When using 8-terabyte drives in conjunction with the Exchange maximum database size best practices guidelines (2 terabytes), you would waste more than 5 terabytes of disk space. One solution would be to simply grow the databases larger, but that inhibits manageability because it introduces long reseed times, including in some cases, operationally unmanageable reseed times, and the reliability of copying that amount of data over the network is compromised.
In addition, in the Exchange 2010 model, the disk storing a passive copy is underutilized in terms of IOPS. In the case of a lagged passive copy, not only is the disk underutilized in terms of IOPS, but it's also asymmetric in terms of its size, relative to the disks used to store the active and non-lagged passive copies.
Continuing a long-standing practice, Exchange 2013 is optimized so that it can use large disks (8 terabytes) in a JBOD configuration more efficiently. In Exchange 2013, with multiple databases per disk, you can have the same size disks storing multiple database copies, including lagged copies. The goal is to drive the distribution of users across the number of volumes that exist, providing you with a symmetric design where during normal operations each DAG member hosts a combination of active, passive, and optional lagged copies on the same volumes.
An example of a configuration that uses multiple databases per volume is illustrated below.
Configuration that uses multiple databases per volume
The above configuration provides a symmetrical design. All four servers have the same four databases all hosted on a single disk per server. The key is that the number of copies of each database that you have should be equal to the number of database copies per disk. In the above example, there are four copies of each database: one active copy, two passive copies, and one lagged copy. Because there are four copies of each database, the proper configuration is one that has four copies per volume. In addition, activation preference is configured so that it's balanced across the DAG and across each server. For example, the active copy will have an activation preference value of 1, the first passive copy will have an activation preference value of 2, the second passive copy will have an activation preference value of 3, and the lagged copy will have an activation preference value of 4.
In addition to having a better distribution of users across the existing volumes, another benefit of using multiple databases per disk is that it reduces the amount of time to restore data protection in the event of a failure that necessitates a reseed (for example, disk failure).
As a database gets bigger, reseeding the database takes longer. For example, a 2-terabyte database could take 23 hours to reseed, whereas an 8-terabyte database could take as long as 93 hours (almost 4 days). Both seeds would occur at about 20 MB per second. This generally means that a very large database can't be seeded within an operationally reasonable amount of time.
In the case of a single database copy per disk scenario, the seeding operation is effectively source-bound, because it's always seeding the disk from a single source. By dividing the volume into multiple database copies, and by having the active copy of the passive databases on a specified volume stored on separate DAG members, the system is no longer source bound in the context of reseeding the disk. When a failed disk is replaced, it can be reseeded from multiple sources. This allows the system to reseed and restore data protection for these databases in a much shorter amount of time.
When using multiple databases per volume, we recommend adhering to the following best practices and requirements:
- A single logical disk partition per physical disk must be used. Don't create multiple partitions on the disk. Each database copy and its companion files (such as transaction logs and content index) should be hosted in a unique directory on the single partition.
- The number of database copies configured per volume should be equal to the number of copies of each database. For example, if you have four copies of your databases, you should use four database copies per volume.
- Database copies should have the same neighbors. (For example, they should all share the same disk on each server.)
- Activation preference across the DAG should be balanced, such that each database copy on a specified disk has a unique activation preference value.
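One way to satisfy the activation preference recommendation is to rotate the values across servers. The sketch below assumes the four-server, four-database layout from the earlier example, with hypothetical names MBX1 through MBX4 and DB1 through DB4, and it assumes that a copy of every database already exists on every server. It only builds and prints a plan; the commented lines show how each entry could be applied with Set-MailboxDatabaseCopy.

```powershell
# Assumption: placeholder names; the rotation needs as many servers as databases.
$servers   = 'MBX1', 'MBX2', 'MBX3', 'MBX4'
$databases = 'DB1', 'DB2', 'DB3', 'DB4'
$count     = $servers.Count

$plan = for ($d = 0; $d -lt $databases.Count; $d++) {
    for ($s = 0; $s -lt $count; $s++) {
        # Rotate preferences so that DB(n) prefers MBX(n), and so that the
        # copies sharing a disk on any one server all get different values.
        [pscustomobject]@{
            Copy                 = '{0}\{1}' -f $databases[$d], $servers[$s]
            ActivationPreference = (($s - $d + $count) % $count) + 1
        }
    }
}

$plan | Format-Table -AutoSize

# To apply the plan, each entry could be passed to Set-MailboxDatabaseCopy:
# $plan | ForEach-Object {
#     Set-MailboxDatabaseCopy -Identity $_.Copy -ActivationPreference $_.ActivationPreference
# }
```

Because changing one copy's preference can shift the values of other copies of the same database, check the resulting values after applying a plan like this.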
Automatic reseed, or AutoReseed, replaces what is normally an administrator-driven action in response to a disk failure, database corruption event, or other issue that necessitates a reseed of a database copy. AutoReseed is designed to automatically restore database redundancy after a disk failure by using spare disks that have been provisioned on the system.
In an AutoReseed configuration, a standardized storage presentation structure is used, and the administrator picks the starting point. AutoReseed is about restoring redundancy as soon as possible after a drive fails. This involves pre-mapping a set of volumes (including spare volumes) and databases using mount points. In the event of a disk failure where the disk is no longer available to the operating system, or is no longer writable, a spare volume is allocated by the system, and the affected database copies are reseeded automatically. AutoReseed uses the following process:
- The Microsoft Exchange Replication service periodically scans for copies that have a status of FailedAndSuspended.
- When it finds a copy with that status, it performs some prerequisite checks, such as whether this is a single copy situation, whether spares are available, and whether anything could prevent the system from performing an automatic reseed.
- If the prerequisite checks pass successfully, the Microsoft Exchange Replication service allocates and remaps a spare.
- Next, the seeding operation is performed.
- After the seed has been completed, the Microsoft Exchange Replication service verifies that the newly seeded copy is healthy.
At this point, if the failure was a disk failure, it would require manual intervention by an operator or administrator to remove and replace the failed disk, and then format, initialize, and reconfigure the disk as a spare.
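An administrator can look for the same copy status that the Microsoft Exchange Replication service scans for. This is only a sketch of the first step of the process above, assuming a hypothetical DAG member named MBX1; it doesn't perform the prerequisite checks or the reseed.

```powershell
# Assumption: MBX1 is a placeholder for a DAG member.
Get-MailboxDatabaseCopyStatus -Server MBX1 |
    Where-Object { $_.Status -eq 'FailedAndSuspended' } |
    Format-Table Name, Status, ContentIndexState -AutoSize
```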
AutoReseed is configured using three properties of the DAG. Two of the properties refer to the two mount points that are in use. Exchange 2013 leverages the fact that Windows Server allows multiple mount points per volume. The AutoDagVolumesRootFolderPath property refers to the mount point that contains all of the available volumes. This includes volumes that host databases and spare volumes. The AutoDagDatabasesRootFolderPath property refers to the mount point that contains the databases. A third DAG property, AutoDagDatabaseCopiesPerVolume, is used to configure the number of database copies per volume.
An example AutoReseed configuration is illustrated below.
Example AutoReseed configuration
In this example, there are three volumes, two of which will contain databases (VOL1 and VOL2), and one of which is a blank, formatted spare (VOL3).
To configure AutoReseed:
- All three volumes are mounted under a single mount point. In this example, a mount point of C:\ExchVols is used. This represents the directory used to get storage for Exchange databases.
- The root directory of the mailbox databases is mounted as another mount point. In this example, a mount point of C:\ExchDBs is used. Next, a directory structure is created so that a parent directory is created for the database, and under the parent directory, two subdirectories are created: one for the database file and one for the log files.
- Databases are created. The above example illustrates a simple design using a single database per volume. Thus, on VOL1, there are three directories: the parent directory and two subdirectories (one for MDB1's database file, and one for its logs). Although not depicted in the example image, on VOL2, there would also be three directories: the parent directory, and under that, a directory for MDB2's database file, and one for its log files.
In this configuration, if MDB1 or MDB2 were to experience a failure, a copy of the failed database would be automatically reseeded to VOL3.
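The three DAG properties described earlier could be set for this example as follows. This is a sketch that assumes a hypothetical DAG named DAG1; the two paths and the single database copy per volume come from the example above.

```powershell
# Assumption: DAG1 is a placeholder DAG name.
Set-DatabaseAvailabilityGroup -Identity DAG1 -AutoDagVolumesRootFolderPath 'C:\ExchVols'
Set-DatabaseAvailabilityGroup -Identity DAG1 -AutoDagDatabasesRootFolderPath 'C:\ExchDBs'
Set-DatabaseAvailabilityGroup -Identity DAG1 -AutoDagDatabaseCopiesPerVolume 1

# Confirm the resulting values.
Get-DatabaseAvailabilityGroup -Identity DAG1 | Format-List AutoDag*
```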
For more information about configuring AutoReseed, see Configure AutoReseed for a Database Availability Group.
Automatic recovery from storage failures continues the innovation introduced in Exchange 2010 to allow the system to recover from failures that affect resiliency or redundancy. In addition to the Exchange 2010 bugcheck behaviors, Exchange 2013 includes additional recovery behaviors for long I/O times, excessive memory consumption by the Microsoft Exchange Replication service (MSExchangeRepl.exe), and severe cases where threads can't be scheduled.
Even in JBOD environments, storage array controllers can have issues, such as crashing or hanging. Exchange 2010 included hung I/O detection and recovery features that provided enhanced resilience. These features are listed in the following table.
| Name | Check | Action |
| --- | --- | --- |
| ESE Database Hung IO Detection | ESE checks for outstanding I/Os | Generates a failure item in the crimson channel to restart the server |
| Failure Item Channel Heartbeat | Ensures failure items can be written to and read from crimson channel | Replication service heartbeats crimson channel and restarts server on failures |
| System Disk Heartbeat | Verifies server's system disk state | Periodically sends unbuffered I/O to system disk; restarts server on heartbeat time out |
Exchange 2013 enhances server and storage resilience by including new behaviors for other serious conditions. These conditions and behaviors are described in the following table.
| Name | Check | Action | Threshold |
| --- | --- | --- | --- |
| System bad state | No threads, including non-managed threads, can be scheduled | Restart the server | |
| Long I/O times | I/O operation latency measurements | Restart the server | |
| Replication service memory use | Measure the working set of MSExchangeRepl.exe | | 4 gigabyte (GB) |
| System Event 129 (Bus reset) | Check for Event 129 in System event log. | Restart the server | When event occurs. |
Lagged copy enhancements include integration with Safety Net and automatic play down of log files in certain scenarios. Safety Net is a feature of transport that replaces the Exchange 2010 feature known as transport dumpster. Safety Net is similar to transport dumpster, in that it's a delivery queue that's associated with the Transport service on a Mailbox server. This queue stores copies of messages that were successfully delivered to the active mailbox database on the Mailbox server. Each active mailbox database on the Mailbox server has its own queue that stores copies of the delivered messages. You can specify how long Safety Net stores copies of the successfully delivered messages before they expire and are automatically deleted.
Safety Net takes some responsibility from shadow redundancy in DAG environments. In DAG environments, shadow redundancy doesn't need to keep another copy of the delivered message in a shadow queue while it waits for the delivered message to replicate to the passive copies of mailbox databases on the other Mailbox servers in the DAG. The copy of the delivered message is already stored in Safety Net, so shadow redundancy can redeliver the message from Safety Net if necessary.
With the introduction of Safety Net, activating a lagged database copy becomes significantly easier. For example, consider a lagged copy that has a 2-day replay lag. In that case, you would configure Safety Net for a period of 2 days. If you encounter a situation in which you need to use your lagged copy, you can suspend replication to it, and copy it twice (to preserve the lagged nature of the database and to create an extra copy in case you need it). Then, take a copy and discard all the log files, except for those in the required range. Mount the copy, which triggers an automatic request to Safety Net to redeliver the last two days of mail. With Safety Net, you don't need to hunt for where the point of corruption was introduced. You get the last two days of mail, minus the data ordinarily lost on a lossy failover.
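To keep Safety Net aligned with the replay lag in the example above, both values can be set to two days. The following sketch assumes a hypothetical lagged copy of a database named DB1 on a server named MBX4.

```powershell
# Assumption: DB1\MBX4 is a placeholder identity for the lagged copy.

# Give the copy a 2-day replay lag (format is days.hours:minutes:seconds).
Set-MailboxDatabaseCopy -Identity 'DB1\MBX4' -ReplayLagTime 2.00:00:00

# Keep delivered messages in Safety Net for the same period.
Set-TransportConfig -SafetyNetHoldTime 2.00:00:00

# Review both settings.
Get-MailboxDatabase -Identity DB1 | Format-List Name, ReplayLagTimes
Get-TransportConfig | Format-List SafetyNetHoldTime
```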
Lagged copies can now care for themselves by invoking automatic log replay to play down the log files in certain scenarios:
- When a low disk space threshold is reached
- When the lagged copy has physical corruption and needs to be page patched
- When there are fewer than three available healthy copies (active or passive) for more than 24 hours
In Exchange 2010, page patching wasn't available for lagged copies. In Exchange 2013, page patching is available for lagged copies through this automatic play down feature. If the system detects that page patching is required for a lagged copy, the logs are automatically replayed into the lagged copy to perform page patching. Lagged copies also invoke this auto replay feature when a low disk space threshold has been reached, and when the lagged copy has been detected as the only available copy for a specific period of time.
Lagged copy play down behavior is disabled by default, and can be enabled by running the following command.
Set-DatabaseAvailabilityGroup <DAGName> -ReplayLagManagerEnabled $true
After being enabled, play down occurs when there are fewer than three copies. You can change the default value of 3 by modifying the following registry value.
To enable play down for low disk space thresholds, you must configure the following registry entry.
Ensuring that your servers are operating reliably and that your mailbox database copies are healthy are primary objectives of daily Exchange 2013 messaging operations. You must actively monitor the hardware, the Windows operating system, and the Exchange services. But when running in an Exchange 2013 mailbox resiliency environment, it's important that you monitor the health and status of the DAG and your mailbox database copies. It's especially vital to perform data redundancy risk management and monitor for periods in which a replicated database is down to just a single copy. This is particularly critical in environments that don't use Redundant Array of Independent Disks (RAID) and instead deploy JBOD configurations. In a RAID environment, a single disk failure doesn't affect an active mailbox database copy. However, in a JBOD environment, a single disk failure will trigger a database failover.
In Exchange 2010, the script CheckDatabaseRedundancy.ps1 was introduced. As its name implies, the purpose of the script was to monitor the redundancy of replicated mailbox databases by validating that there are at least two configured, healthy, and current copies, and to alert an administrator through event log generation when only a single healthy copy of a replicated database exists. In this case, both active and passive copies are counted when determining redundancy.
Single copy conditions include, but aren't limited to:
- Failure of an active copy to replicate to any passive copy.
- Failure of all passive copies, which includes FailedAndSuspended and Failed states in addition to healthy states where the copy is behind in log copying or replay. Note that lagged copies aren't considered behind if they're within ten minutes in replaying their logs to their lag period.
- Failure of the system to accurately know the current log generation of the active copy.
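As a rough illustration of a single copy condition, the sketch below counts the copies of each replicated database that report a Mounted or Healthy status and lists the databases that have fewer than two. It assumes it's run in the Exchange Management Shell and uses no placeholder names. It isn't the logic used by the native alert, which also takes into account whether copies are current.

```powershell
# Collect the status of every copy of every replicated database.
$copies = Get-MailboxDatabase |
    Where-Object { $_.ReplicationType -eq 'Remote' } |
    ForEach-Object { Get-MailboxDatabaseCopyStatus -Identity $_.Name }

# Group by database and count the copies that are mounted or healthy.
$copies | Group-Object -Property DatabaseName | ForEach-Object {
    $good = @($_.Group | Where-Object { 'Mounted', 'Healthy' -contains "$($_.Status)" })
    if ($good.Count -lt 2) {
        # Report databases that are down to one (or zero) good copies.
        [pscustomobject]@{ Database = $_.Name; GoodCopies = $good.Count }
    }
}
```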
Because it's a top priority for administrators to know when they're down to a single healthy copy of a database, the CheckDatabaseRedundancy.ps1 script has been replaced with integrated, native functionality that's part of managed availability.
The native functionality still alerts administrators through event log notifications, and to distinguish Exchange 2013 alerts from Exchange 2010, Exchange 2013 uses the following Event IDs:
- Event 4138 (Red Alert)
- Event 4139 (Green Alert)
In Exchange 2013, the native functionality has been enhanced to reduce the level of alert noise that can occur when multiple databases on the same server enter into a single copy condition. In Exchange 2010, single copy alerts were generated on a per-database level. As a result, when there was a server-wide issue that affected multiple databases and multiple database copies, alert storms could occur. Because several failures, such as controller or memory problems, are server-wide, there was a moderately high probability that such an alert storm would occur for each server incident. In Exchange 2013, alerts are now generated on a per-server basis. When an outage affects an entire server and data redundancy becomes at risk for multiple database copies, a single alert per server is now generated.
A DAG network is a collection of one or more subnets used for either replication traffic or MAPI traffic. Each DAG contains a maximum of one MAPI network and zero or more replication networks. In Exchange 2010, the initial DAG networks (for example, DAGNetwork01 and DAGNetwork02) were created by the system based on the subnets enumerated by the Cluster service. In environments where multiple networks are used and the interfaces for a specified network (for example, the MAPI network) were on the same subnet, there was little additional configuration that an administrator needed to perform. However, in environments where the interfaces for a specified network were on multiple subnets, the administrator had to perform a task referred to as collapsing DAG networks.
In Exchange 2013, collapsing DAG networks is no longer necessary. Exchange 2013 still uses the same detection mechanisms to distinguish between the MAPI and replication networks, but it now automatically collapses DAG networks as appropriate.
In addition, by default, DAG networks are now automatically managed by the system. To view DAG network properties using the Exchange Administration Center (EAC), you must configure the DAG for manual network control by modifying the properties of the DAG using EAC, or by using the Set-DatabaseAvailabilityGroup cmdlet to set the ManualDagNetworkConfiguration parameter to
Best copy selection (BCS) is an internal algorithm for finding the best copy of an individual database to activate, given a list of potential copies for activation and their health and status. Active Manager selects the best available (and unblocked) copy to become the new active database copy when the existing active database copy fails or when an administrator performs a targetless switchover. In Exchange 2010, the BCS process evaluated several aspects of each database copy to determine the best copy to activate. These included:
- Copy queue length
- Replay queue length
- Content index status
In Exchange 2013, Active Manager performs the same BCS checks and phases to determine replication health, but it now also applies a constraint that evaluates server health states in decreasing order. As a result of these changes, BCS is now called best copy and server selection (BCSS).
BCSS includes several new health checks that are part of the built-in managed availability monitoring components in Exchange 2013. Active Manager performs four additional checks (listed in the order in which they're performed):
- All Healthy: Checks for a server hosting a copy of the affected database that has all monitoring components in a healthy state.
- Up to Normal Healthy: Checks for a server hosting a copy of the affected database that has all monitoring components with Normal priority in a healthy state.
- All Better than Source: Checks for a server hosting a copy of the affected database that has monitoring components in a state that's better than the current server hosting the affected copy.
- Same as Source: Checks for a server hosting a copy of the affected database that has monitoring components in a state that's the same as the current server hosting the affected copy.
If BCSS is invoked as a result of a failover that's triggered by a managed availability monitoring component (for example, via a Failover responder), an additional mandatory constraint is enforced where the target server's component health must be better than that of the server on which the failover occurred. For example, if a failure of Microsoft Office Outlook Web App triggers a managed availability failover via a Failover responder, BCSS must select a server hosting a copy of the affected database on which Outlook Web App is healthy.
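The ordered health checks and the mandatory constraint can be pictured as a tiered search. The Python sketch below is an illustration under stated assumptions, not Active Manager's implementation. The `Server` model, the "Low" priority label, and the use of a count of unhealthy components to decide "better than" or "same as" the source are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical model: component name -> (priority, is_healthy)."""
    name: str
    components: dict = field(default_factory=dict)

    def unhealthy_count(self):
        return sum(1 for _, healthy in self.components.values() if not healthy)


def all_healthy(candidate, source):
    return candidate.unhealthy_count() == 0


def up_to_normal_healthy(candidate, source):
    return all(healthy for priority, healthy in candidate.components.values()
               if priority == "Normal")


def all_better_than_source(candidate, source):
    # Assumption: fewer unhealthy components means a better state.
    return candidate.unhealthy_count() < source.unhealthy_count()


def same_as_source(candidate, source):
    return candidate.unhealthy_count() == source.unhealthy_count()


# The four checks, in the order in which the text lists them.
CHECKS = [all_healthy, up_to_normal_healthy, all_better_than_source, same_as_source]


def pick_target(candidates, source, failed_component=None):
    """Return the name of the first server that satisfies the earliest check.

    If `failed_component` is given (a failover triggered by managed
    availability), servers where that component is unhealthy are excluded.
    """
    eligible = candidates
    if failed_component is not None:
        eligible = [c for c in candidates
                    if c.components.get(failed_component, (None, False))[1]]
    for check in CHECKS:
        for candidate in eligible:
            if check(candidate, source):
                return candidate.name
    return None


source = Server("MBX1", {"OWA": ("Normal", False), "EAS": ("Low", True)})
mbx2 = Server("MBX2", {"OWA": ("Normal", True), "EAS": ("Low", False)})
mbx3 = Server("MBX3", {"OWA": ("Normal", True), "EAS": ("Low", True)})
target = pick_target([mbx2, mbx3], source, failed_component="OWA")
```

In this example MBX3 is selected because it passes the first check. If only MBX2 were available, it would be selected by the second check.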
Cumulative Update 2 (CU2) for the Release to Manufacturing (RTM) version of Exchange 2013 contains a new service on Mailbox servers that are members of a DAG. This service is called the Microsoft Exchange DAG Management Service (MSExchangeDAGMgmt). This new service contains internal DAG monitoring functionality that previously executed inside the Microsoft Exchange Replication service (MSExchangeRepl).
Although Exchange 2013 continues to use DAGs and Windows Failover Clustering for Mailbox server role high availability and site resilience, site resilience isn't the same in Exchange 2013. Site resilience is much better in Exchange 2013 because it has been simplified. The underlying architectural changes that were made in Exchange 2013 have a significant impact on the recovery aspects of a site resilience configuration.
In Exchange 2010, mailbox (DAG) and client access (Client Access server array) recovery were tied together. If you lost all of your Client Access servers, the VIP for the array, or a significant portion of your DAG, you were in a situation where you needed to do a datacenter switchover. This is a well-documented and generally well-understood process, although it takes time to perform, and requires human intervention to begin the process.
In Exchange 2013, if you lose your Client Access server array for whatever reason (for example, the load balancer fails), you don't need to perform a datacenter switchover. With the proper configuration, failover happens at the client level, and clients are automatically redirected to a second datacenter that has operating Client Access servers. Those Client Access servers proxy the communication back to the user's Mailbox server, which remains unaffected by the outage (because you don't do a switchover). You don't have to work to recover service: the service recovers itself, and you can focus on fixing the core issue (for example, replacing the failed load balancer).
Furthermore, the namespace simplification, consolidation of server roles, de-coupling of Active Directory site server role requirements, separation of Client Access server array and DAG recovery, and load balancing changes in Exchange 2013 now enable Client Access server recovery and DAG recovery to be separate and automatic across sites. If you have three locations, this makes datacenter failover scenarios possible.
In Exchange 2010, you could deploy a DAG across two datacenters and host the witness in a third datacenter and enable failover for the Mailbox server role for either datacenter. But you didn't get failover for the solution itself, because the namespace still needed to be manually changed for the non-Mailbox server roles.
In Exchange 2013, the namespace doesn't need to move with the DAG. Exchange leverages fault tolerance built into the namespace through multiple IP addresses, load balancing (and if need be, the ability to take servers in and out of service). Modern HTTP clients work with this redundancy automatically. The HTTP stack can accept multiple IP addresses for a fully qualified domain name (FQDN), and if the first IP address it tries fails hard (that is, it can't connect), it will try the next IP address in the list. In a soft failure (connection is lost after the session is established, perhaps due to an intermittent failure in the service where, for example, a device is dropping packets and needs to be taken out of service), the user might need to refresh their browser.
This means the namespace is no longer a single point of failure as it was in Exchange 2010. In Exchange 2010, perhaps the biggest single point of failure in the messaging system was the FQDN that you gave to users, because it told the user where to go. In the Exchange 2010 paradigm, changing where that FQDN goes isn't easy because you have to change DNS, and then handle DNS latency, which in some parts of the world is challenging. You also have to handle name caches in browsers, which typically last about 30 minutes or more.
One of the changes in Exchange 2013 is to enable clients to have more than one place to go. Almost all the client access protocols in Exchange 2013 are HTTP based (examples include Outlook, Outlook Anywhere, EAS, EWS, OWA, and EAC), and all supported HTTP clients have the ability to use multiple IP addresses, which provides failover on the client side. You can configure DNS to hand multiple IP addresses to a client during name resolution. The client asks for mail.contoso.com and gets back two IP addresses, or four IP addresses, for example. However many IP addresses the client gets back will be used reliably by the client. This makes the client a lot better off because if one of the IP addresses fails, the client has one or more other IP addresses to try to connect to. If a client tries one and it fails, it waits about 20 seconds and then tries the next one in the list. Thus, if you lose the VIP for the Client Access server array, recovery for the clients happens automatically, and in about 21 seconds.
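The client-side behavior described here (try one address, wait for the timeout, then try the next) can be sketched in Python. This is a generic illustration of the pattern, not how Outlook or any specific HTTP stack is implemented. The function name, the default port, and the example addresses (taken from documentation-only IP ranges) are assumptions.

```python
import socket


def connect_first_available(addresses, port=443, timeout=20.0,
                            connect=socket.create_connection):
    """Try each address in order and return (address, connection).

    A hard failure (refused or timed-out connection) moves on to the next
    address. The 20-second default mirrors the wait cited in the text.
    `connect` is injectable so the logic can be exercised without a network.
    """
    last_error = None
    for address in addresses:
        try:
            return address, connect((address, port), timeout=timeout)
        except OSError as exc:  # includes timeouts and refused connections
            last_error = exc
    raise ConnectionError(f"no address reachable: {last_error}")


def fake_connect(endpoint, timeout):
    """Stand-in for a real connection: the first VIP is down."""
    if endpoint[0] == "192.0.2.10":
        raise OSError("connection timed out")
    return "connected"


used, connection = connect_first_available(
    ["192.0.2.10", "198.51.100.10"], connect=fake_connect)
```

With the first VIP down, the sketch ends up connected to the second address without any change to the name the client asked for.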
The benefits include the following:
In Exchange 2010, if you lost the load balancer in your primary datacenter and you didn't have another one in that site, you had to do a datacenter switchover. In Exchange 2013, if you lose the load balancer in your primary site, you simply turn it off (or maybe turn off the VIP) and repair or replace it. Clients that aren't already using the VIP in the secondary datacenter will automatically fail over to the secondary VIP without any change of namespace, and without any change in DNS. Not only does that mean you no longer have to perform a switchover, but it also means that all of the time normally associated with a datacenter switchover recovery isn't spent. In Exchange 2010, you had to handle DNS latency (hence, the recommendation to set the Time to Live (TTL) to 5 minutes, and the introduction of the failback URL). In Exchange 2013, you don't need to do that because you get fast failover (20 seconds) of the namespace between VIPs (datacenters).
Because you can fail over the namespace between datacenters, all that's needed to achieve a datacenter failover is a mechanism for failover of the Mailbox server role across datacenters. To get automatic failover for the DAG, you simply architect a solution where the DAG is evenly split between two datacenters, and then place the witness server in a third location so that it can be arbitrated by DAG members in either datacenter, regardless of the state of the network between the datacenters that contain the DAG members.
In this scenario, the administrator's efforts are geared toward simply fixing the problem, and not spent restoring service. You simply fix the thing that failed, while service keeps running and data integrity is maintained. The urgency and stress level you feel when fixing a broken device is nothing like the urgency and stress you feel when you're working to restore service. It's better for the end user, and less stressful for the administrator.
You can allow failover to occur without having to perform switchbacks (sometimes mistakenly referred to as failbacks). If you lose Client Access servers in your primary datacenter and that results in a 20-second interruption for clients, you might not even care about failing back. At this point, your primary concern would be fixing the core issue (for example, replacing the failed load balancer). After it's back online and functioning, some clients will start using it, and other clients might remain operational through the second datacenter.
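To see why placing the witness server in a third location allows automatic DAG failover, consider a simplified majority-vote model. The Python sketch below assumes a plain majority rule in which the witness contributes one vote when the DAG has an even number of members. It is an illustration of the idea, not the actual Windows Failover Clustering quorum logic, and the function name is hypothetical.

```python
def has_quorum(reachable_members, total_members, witness_reachable):
    """Return True if the reachable voters form a majority.

    Assumption: the witness adds one vote only when the DAG has an even
    number of members.
    """
    uses_witness = total_members % 2 == 0
    total_votes = total_members + (1 if uses_witness else 0)
    votes = reachable_members + (1 if uses_witness and witness_reachable else 0)
    return votes > total_votes // 2


# Four-member DAG split evenly across two datacenters; one datacenter is lost.
surviving_with_witness = has_quorum(2, 4, witness_reachable=True)
surviving_without_witness = has_quorum(2, 4, witness_reachable=False)
```

In this model, the two surviving members plus the witness hold three of five votes, so the surviving datacenter keeps quorum. Without access to the witness, two of five votes isn't a majority.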
Exchange 2013 also provides functionality that enables administrators to deal with intermittent failures. An intermittent failure is where, for example, the initial TCP connection can be made, but nothing happens afterward. An intermittent failure requires some sort of extra administrative action to be taken because it might be the result of a replacement device being put into service. While this repair process is occurring, the device might be powered on and accepting some requests, but not really ready to service clients until the necessary configuration steps are performed. In this scenario, the administrator can perform a namespace switchover by simply removing the VIP for the device being replaced from DNS. Then during that service period, no clients will be trying to connect to it. After the replacement process has completed, the administrator can add the VIP back to DNS, and clients will eventually start using it.
For details about planning and deploying site resilience, see Planning for High Availability and Site Resilience and Deploying High Availability and Site Resilience.
Exchange 2013 also includes a third-party replication API that enables organizations to use third-party synchronous replication solutions instead of the built-in continuous replication feature. Microsoft supports third-party solutions that use this API, provided that the solution provides the necessary functionality to replace all native continuous replication functionality that's disabled as a result of using the API. Solutions are supported only when the API is used within a DAG to manage and activate mailbox database copies. Use of the API outside of these boundaries isn't supported. In addition, the solution must meet the applicable Windows hardware support requirements. (Test validation isn't required for support.)
When deploying a solution that uses the built-in third-party replication API, be aware that the solution vendor is responsible for primary support of the solution. Microsoft supports Exchange data for both replicated and non-replicated solutions. Solutions that use data replication must adhere to the Microsoft support policy for data replication, as described in Microsoft Knowledge Base article 895847, Multi-site data replication support for Exchange Server. In addition, solutions that utilize the Windows Failover Cluster resource model must meet Windows cluster supportability requirements as described in Microsoft Knowledge Base article 943984, The Microsoft Support Policy for Windows Server 2008 or Windows Server 2008 R2 Failover Clusters or The Microsoft Support Policy for Windows Server 2012 Failover Clusters.
Microsoft’s backup and restore support policy for deployments that use third-party replication API-based solutions is the same as for native continuous replication deployments.
If you're a partner seeking information about the third-party API, contact your Microsoft representative.
The following table contains links to topics that will help you learn about and manage DAGs, mailbox database copies, and backup and restore for Exchange 2013.
Learn about DAGs, Active Manager, Datacenter Activation Coordination (DAC) mode, and mailbox database copies.
Learn about the general, hardware, network, software, witness server, and other requirements and best practices for DAGs.
Explore an example deployment scenario for deploying and configuring DAGs.
Learn about DAG management tasks, switchovers and failovers, and maintenance mode.
Learn about the built-in cmdlets and scripts for monitoring DAGs and database copies.
Learn about backing up and restoring Exchange databases, recovery databases, and server recovery.