[This is pre-release documentation and subject to change in future releases.
This topic's current status is: Ready for Tech Review.]
Applies to: Exchange Server 2010
Microsoft Exchange Server 2010 reduces the cost and complexity of deploying an e-mail solution that provides the highest levels of server availability and site resilience. Building on the native replication capabilities introduced in Exchange Server 2007, the new high availability architecture in Exchange 2010 provides a simplified, unified framework for both high availability and disaster recovery. Exchange 2010 integrates high availability into the core architecture of Exchange, enabling customers of all sizes and in all segments to economically deploy a messaging continuity service in their organization.
Exchange 2007 decreased the costs of high availability and made site resilience much more economical by introducing new technologies such as local continuous replication (LCR), cluster continuous replication (CCR), and standby continuous replication (SCR). Still, some challenges remained:
-
Some administrators were intimidated by the complexity of Windows failover clustering.
-
Achieving a high level of uptime could require a high level of administrator intervention.
-
Each type of continuous replication was managed differently and separately.
-
Recovering from a failure of a single database on a large mailbox server could result in a temporary disruption of service to all users on the mailbox server.
-
Site resilience solutions were not seamless.
-
The transport dumpster feature of the Hub Transport server could only protect messages destined for mailboxes in an LCR or CCR environment. If a Hub Transport server failed while processing messages and could not be recovered, the failure could result in data loss.
Exchange 2010 includes significant core changes that integrate high availability deep in its architecture, making it even less costly and easier to deploy and maintain than Exchange 2007 for all customers. Organizations can now deploy a fully redundant Exchange organization with just two servers, and benefit from automatic, database-level failovers without having to become experts in Windows failover clustering. Moreover, you can add site resilience to your existing high availability deployments with less complexity.
Exchange 2007 introduced many new architectural changes designed to make deploying high availability and site resilience solutions for Exchange faster and simpler. These improvements included an integrated Setup experience, optimized out-of-box configuration settings, and the ability to manage most aspects of the high availability solution using native Exchange management tools.
Still, management of an Exchange 2007 high availability solution required administrators to master some clustering concepts, such as the concept of moving network identities and managing cluster resources. In addition, when troubleshooting issues related to a clustered mailbox server, administrators had to use Exchange tools and cluster tools to review and correlate logs and events from two different sources: one from Exchange and one from the cluster.
Two other limiting aspects of the Exchange 2007 architecture have also been re-evaluated and re-engineered based on customer feedback:
-
Clustered Exchange 2007 servers require dedicated hardware. Only the Mailbox server role could be installed on a node in the cluster. This meant that a minimum of four Exchange servers were required in order to achieve full redundancy of the primary components of a deployment, i.e., the core server roles (Mailbox, Hub Transport, and Client Access).
-
In Exchange 2007, failover of a clustered mailbox server occurs at the server level. As a result, if a single database failure occurred, the administrator had to fail over the entire clustered mailbox server to another node in the cluster (which resulted in brief downtime for all users on the server, and not just those users with a mailbox on the affected database), or leave the users on the failed database offline (potentially for hours) while restoring the database from backup.

Mailbox Resiliency
Exchange 2010 has been re-engineered around the concept of mailbox resiliency, in which the architecture has changed so that automatic failover protection is now provided at the individual mailbox database level instead of at the server level. In Exchange 2010, this is known as database mobility. As a result of this and other database cache architectural changes, failover actions now complete much faster than in previous versions of Exchange. For example, failover of a clustered mailbox server in a CCR environment running Exchange 2007 with Service Pack 2 completes in about 2 minutes. By comparison, failover of a mailbox database in an Exchange 2010 environment completes in 30 seconds or less (measured from the time when the failure is detected to when a database copy is mounted, assuming an available copy that is healthy and up-to-date with log replay). The combination of database-level failovers and significantly faster failover times dramatically improves an organization's overall uptime.
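As a rough illustration of the figures above, the following Python sketch compares the user impact of a server-level failover with that of a database-level failover. The 2-minute and 30-second durations come from the text. The mailbox count, the database count, and the even spread of mailboxes across databases are assumptions made only for this example.

```python
# Failover durations quoted in the text above, in seconds.
CCR_FAILOVER_SECONDS = 120   # Exchange 2007 SP2, whole clustered mailbox server
DAG_FAILOVER_SECONDS = 30    # Exchange 2010, single mailbox database (upper bound)


def user_seconds_of_outage(affected_users: int, outage_seconds: int) -> int:
    """Total disruption, expressed as affected users multiplied by seconds offline."""
    return affected_users * outage_seconds


# Hypothetical server (assumption): 2,000 mailboxes spread evenly over 10 databases.
users_on_server = 2000
databases_on_server = 10
users_per_database = users_on_server // databases_on_server

# Exchange 2007: one failed database means failing over the whole server.
server_level = user_seconds_of_outage(users_on_server, CCR_FAILOVER_SECONDS)

# Exchange 2010: only the failed database is activated on another server.
database_level = user_seconds_of_outage(users_per_database, DAG_FAILOVER_SECONDS)

print(server_level, database_level)  # 240000 6000
```

Under these assumed numbers, the disruption from a single database failure drops by a factor of 40, partly because the failover is faster and partly because fewer users are affected.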
The mailbox resiliency architecture built into Exchange 2010 provides new benefits for organizations and their messaging administrators:
-
Multiple server roles can co-exist on servers that provide high availability. This enables small organizations to deploy a two-server configuration that provides redundancy of mailbox data and service, while also providing redundant Client Access and Hub Transport services.
-
An administrator no longer needs to build a failover cluster in order to achieve high availability. Failover clusters are now created by Exchange 2010 in a way that is invisible to the administrator. Unlike previous versions of Exchange clusters, which used an Exchange-provided cluster resource DLL named ExRes.dll, Exchange 2010 no longer needs or uses a cluster resource DLL. Exchange 2010 is not a clustered application, and it uses only a small portion of the failover cluster components (namely, the heartbeat capabilities and the cluster database) in order to provide database mobility.
-
Administrators can add high availability to their Exchange 2010 environment after Exchange has been deployed, without having to uninstall Exchange and then re-deploy in a highly available configuration.
-
Exchange 2010 provides a view of the event stream that coalesces and combines the events from the operating system with the events from Exchange.
-
Because storage group objects no longer exist in Exchange 2010, and because mailbox databases are portable across all Exchange 2010 Mailbox servers, it is very easy to move databases when needed.
For more information, see High Availability and Site Resilience.

Changes to High Availability from Previous Versions of Exchange
Exchange 2010 includes many changes to its core architecture. Exchange 2010 combines the key availability and resilience features of CCR and SCR into a single high availability solution that handles both on-site data replication and off-site data replication. Mailbox servers can be defined as part of a database availability group to provide automatic recovery at the individual mailbox database level instead of at the server level. Each mailbox database can have up to 16 copies. Other new high availability concepts are introduced in Exchange 2010, such as database mobility and incremental deployment. The concepts of a backup-less and RAID-less organization are also being introduced in Exchange 2010.
In a nutshell, the key aspects to data and service availability for the Mailbox server role and mailbox databases are:
-
Exchange 2010 uses an enhanced version of the same continuous replication technology introduced in Exchange 2007. See the section below entitled "Changes to Continuous Replication from Exchange Server 2007" for more information.
-
Storage groups no longer exist in Exchange 2010. Instead, there are simply mailbox databases and mailbox database copies, and public folder databases. The primary management interface for Exchange databases has moved within the Exchange Management Console from the Mailbox node under Server Configuration to the Mailbox node under Organization Configuration.
-
Some Windows failover clustering technology is used by Exchange 2010, but it is now completely managed under the hood by Exchange. Administrators do not need to install, build, or configure any aspects of failover clustering when deploying highly available Mailbox servers.
-
Each Mailbox server can host as many as 100 databases, and each database can have as many as 16 copies.
-
In addition to the transport dumpster feature, a new Hub Transport server feature named shadow redundancy has been added. Shadow redundancy provides redundancy for messages for the entire time they are in transit. The solution involves a technique similar to the transport dumpster. With shadow redundancy, the deletion of a message from the transport database is delayed until the transport server verifies that all of the next hops for that message have completed delivery. If any of the next hops fail before reporting back successful delivery, the message is resubmitted for delivery to that next hop. For more information about shadow redundancy, see Understanding Shadow Redundancy.

Incremental Deployment
In previous versions of Microsoft Exchange, service availability for the Mailbox server role was achieved by deploying Exchange in a Windows failover cluster. To deploy Exchange in a cluster, you had to first build a failover cluster, and then install the Exchange program files. This process created a special mailbox server called a clustered mailbox server (or Exchange Virtual Server in older versions of Microsoft Exchange). If you had already installed the Exchange program files on a non-clustered server and you decided you wanted a clustered mailbox server, you had to build a cluster using new hardware, or remove Exchange from the existing server, install failover clustering, and reinstall Exchange.
Exchange 2010 introduces the concept of incremental deployment, which enables you to deploy service and data availability for all Mailbox servers and databases after Exchange is installed. Service and data redundancy is achieved by using new features in Exchange 2010 such as database availability groups and database copies.

Database Availability Groups
A database availability group (DAG) is a set of up to 16 Mailbox servers that provide automatic database-level recovery from failures that affect individual databases. Any server in a DAG can host a copy of a mailbox database from any other server in the DAG. When a server is added to a DAG, it works with the other servers in the DAG to provide automatic recovery from failures that affect mailbox databases, such as a disk failure or server failure.
For more information about DAGs, see Understanding Database Availability Groups.
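The relationship between a DAG, its member servers, and database-level recovery can be pictured with a small Python model. This is only a conceptual sketch: the class names, the server and database names, and the rule of activating the first healthy copy are assumptions made for illustration, and they do not describe how Exchange actually selects a copy to activate.

```python
from dataclasses import dataclass, field

MAX_DAG_MEMBERS = 16  # limit stated in the text


@dataclass
class DatabaseCopy:
    server: str
    healthy: bool = True


@dataclass
class MailboxDatabase:
    name: str
    active_server: str
    copies: list = field(default_factory=list)  # passive DatabaseCopy objects


@dataclass
class Dag:
    members: list = field(default_factory=list)
    databases: list = field(default_factory=list)

    def add_member(self, server: str) -> None:
        if len(self.members) >= MAX_DAG_MEMBERS:
            raise ValueError("a DAG holds at most 16 Mailbox servers")
        self.members.append(server)

    def fail_server(self, failed: str) -> dict:
        """Activate a healthy passive copy for each database that was active on the failed server."""
        moved = {}
        for db in self.databases:
            for copy in db.copies:
                if copy.server == failed:
                    copy.healthy = False
            if db.active_server != failed:
                continue  # databases active elsewhere keep running untouched
            target = next((c for c in db.copies if c.healthy), None)
            if target is None:
                moved[db.name] = None  # no healthy copy is available
                continue
            db.copies.remove(target)
            db.copies.append(DatabaseCopy(failed, healthy=False))
            db.active_server = target.server
            moved[db.name] = target.server
        return moved


dag = Dag()
for name in ("MBX1", "MBX2", "MBX3"):
    dag.add_member(name)
dag.databases = [
    MailboxDatabase("DB1", "MBX1", [DatabaseCopy("MBX2"), DatabaseCopy("MBX3")]),
    MailboxDatabase("DB2", "MBX2", [DatabaseCopy("MBX1")]),
]
result = dag.fail_server("MBX1")
print(result)  # {'DB1': 'MBX2'}
```

Only DB1 moves, because it was the only database active on the failed server. DB2 stays mounted on MBX2 and simply loses one passive copy.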

Mailbox Database Copies
The high availability and site resilience features first introduced in Exchange 2007 are used in Exchange 2010 to create and maintain database copies, thereby enabling you to achieve your availability goals in Exchange 2010. Exchange 2010 also introduces the new concept of database mobility, which refers to Exchange-managed, database-level failovers.
Database mobility disconnects databases from servers, adds support for up to 16 copies of a single database, and provides a native experience for adding database copies to a database. In Exchange 2007, a feature called database portability also enabled you to move a mailbox database between servers. A key distinction between database portability and database mobility, however, is that with database mobility all copies of a database have the same GUID.
Other key characteristics of database mobility are:
-
Because storage groups have been removed from Exchange 2010, continuous replication now operates at the database level. In Exchange 2010, transaction logs are replicated to one or more other Mailbox servers, and replayed into a copy of a mailbox database that is stored on those servers.
-
A failover is an automatic activation process that can occur at either the database level or at the server level. A switchover is a manual activation process that you can perform at the database, server, or datacenter (site) level.
-
Database names for Exchange 2010 must be unique within the Exchange organization.
-
When a mailbox database has been configured with one or more database copies, the full path for all database copies must be identical on all Mailbox servers that host a copy.
-
Any mailbox database copy (the active or any passive copy) can be backed up using an Exchange-aware VSS-based backup application.
For more information about mailbox database copies, see Understanding Mailbox Database Copies.
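Three of the rules above (unique database names, identical paths for every copy, and at most 16 copies) lend themselves to a simple pre-deployment check. The following Python sketch is a hypothetical planning helper, not an Exchange tool: the function name, the layout structure, the sample paths, and the case-insensitive name comparison are all assumptions made for illustration.

```python
from collections import Counter

MAX_COPIES = 16  # limit stated in the text


def validate_layout(databases):
    """Return a list of rule violations for a planned set of databases.

    Each entry is a dict with a "name" and a "copies" mapping of
    server name -> full database path (active and passive copies alike).
    """
    problems = []

    # Rule: database names must be unique within the organization.
    # Comparing names case-insensitively is an assumption of this sketch.
    name_counts = Counter(db["name"].lower() for db in databases)
    for name, count in name_counts.items():
        if count > 1:
            problems.append(f"database name '{name}' is used {count} times")

    for db in databases:
        copies = db["copies"]
        # Rule: no more than 16 copies of a single database.
        if len(copies) > MAX_COPIES:
            problems.append(f"{db['name']}: {len(copies)} copies exceeds {MAX_COPIES}")
        # Rule: the full path must be identical on every server hosting a copy.
        if len(set(copies.values())) > 1:
            problems.append(f"{db['name']}: copy paths differ between servers")
    return problems


layout = [
    {"name": "DB1", "copies": {"MBX1": r"D:\DB1\DB1.edb", "MBX2": r"D:\DB1\DB1.edb"}},
    {"name": "DB2", "copies": {"MBX1": r"D:\DB2\DB2.edb", "MBX2": r"E:\DB2\DB2.edb"}},
    {"name": "db1", "copies": {"MBX3": r"D:\DB1b\DB1b.edb"}},
]
problems = validate_layout(layout)
for problem in problems:
    print(problem)
```

In this sample layout, the check reports the duplicated name and the mismatched path for DB2.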

Changes to Continuous Replication from Exchange Server 2007
The underlying continuous replication technology previously found in CCR and SCR remains in Exchange 2010, and it has been further evolved to support new high availability features such as database copies, database mobility, and database availability groups. Some of these new architectural changes are briefly described below:
-
Because storage groups have been removed from Exchange 2010, continuous replication now operates at the database level. Exchange 2010 still uses an Extensible Storage Engine (ESE) database that produces transaction logs which are replicated to one or more other locations and replayed into one or more copies of a mailbox database.
-
Because the log replay functionality that was performed by the Microsoft Exchange Replication service in Exchange 2007 has been moved into the Exchange 2010 Information Store service (store.exe), the performance hit associated with failovers and switchovers (because a new database cache was put into use) no longer exists. When a failover or switchover occurs, the activated database has a warm cache that is ready for use.
-
Log shipping and seeding no longer use Server Message Block (SMB) for data transfer. Exchange 2010 continuous replication uses a single administrator-defined TCP port for data transfer. In addition, Exchange 2010 includes built-in options for network encryption and compression for the data stream.
-
Log shipping no longer uses a pull model, where the passive copy pulls closed log files from the active copy. Instead, the active copy pushes the log files to each configured passive copy.
-
Seeding is no longer restricted to using only the active copy of the database. Passive copies of mailbox databases can now be specified as sources for database copy seeding and re-seeding.
-
Database copies are for mailbox databases only. For redundancy and high availability of public folder databases, we recommend that you use public folder replication. Unlike CCR, where multiple copies of a public folder database could not exist in the same cluster, you can use public folder replication to replicate public folder databases between servers in a DAG.
Several concepts used in Exchange 2007 continuous replication also remain in Exchange 2010. These include the concepts of failover management, divergence, the use of the auto database mount dial, and the use of public and private networks.

End-to-End Availability
Exchange 2010 also includes many features designed to increase end-to-end availability of the system. These features include:
-
Transport Resilience
-
Online Move Mailbox
-
Using Replication for your Backups
-
Incremental Resync
-
Page Patching
-
Third Party Replication API

Transport Resilience
Exchange 2007 introduced the Transport Dumpster feature of the Hub Transport server. The Transport Dumpster maintains a queue of messages that were delivered to recipients whose mailbox was in a CCR (and in Exchange 2007 SP1, in an LCR) environment. This feature was designed to help protect against data loss by providing an administrator with the option to have a clustered mailbox server (CMS) automatically come online on another node with a limited amount of data loss. This is referred to as a lossy failover. When a lossy failover occurred, the system automatically re-delivered the recent e-mail messages sent to users on the failed CMS, by using the transport dumpster where the e-mail messages were still stored. While this solution helped to minimize the amount of data lost in a lossy failover, the solution only protected from data loss within a site, and it did not provide protection for messages in transit.
Exchange 2010 introduces core architectural changes that address both issues. Because DAGs can be stretched across Active Directory sites, it is possible for an individual mailbox database to move between Active Directory sites. Because of this design change, the transport dumpster re-delivery request upon a lossy database failover is now issued to Hub Transport servers in both the database's original and new Active Directory sites.
One other significant change to the Transport Dumpster is that it now receives feedback from the replication pipeline. When messages in the Transport Dumpster have been replicated to all mailbox database copies, they're removed from the Transport Dumpster. This ensures that only non-replicated data will be held in the Transport Dumpster.
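The feedback behavior described above can be pictured with a small Python model. It is a conceptual sketch only: the class, its method names, and the server and message names are hypothetical, and the real feature works on replicated log data rather than on individual message identifiers.

```python
class TransportDumpster:
    """Toy model: keep a delivered message until every database copy has it."""

    def __init__(self, copy_servers):
        self.copy_servers = set(copy_servers)
        self.pending = {}  # message id -> servers whose copy still lacks the message

    def message_delivered(self, message_id, active_server):
        # The active copy already holds the message; the passive copies do not yet.
        waiting = self.copy_servers - {active_server}
        if waiting:
            self.pending[message_id] = waiting

    def replication_feedback(self, message_id, server):
        """Called when `server` reports that the message has been replicated to its copy."""
        waiting = self.pending.get(message_id)
        if waiting is None:
            return
        waiting.discard(server)
        if not waiting:
            del self.pending[message_id]  # replicated everywhere: safe to drop

    def redelivery_candidates(self):
        """Messages that would be resubmitted after a lossy failover."""
        return sorted(self.pending)


dumpster = TransportDumpster(["MBX1", "MBX2", "MBX3"])
dumpster.message_delivered("m1", "MBX1")
dumpster.message_delivered("m2", "MBX1")
dumpster.replication_feedback("m1", "MBX2")
dumpster.replication_feedback("m1", "MBX3")
print(dumpster.redelivery_candidates())  # ['m2']
```

Message m1 has reached every copy and is removed, so only m2 remains available for re-delivery.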
In addition to the transport dumpster feature, a new Hub Transport server feature named shadow redundancy has been added. Shadow redundancy provides redundancy for messages for the entire time they are in transit. The solution involves a technique similar to the transport dumpster. With shadow redundancy, the deletion of a message from the transport database is delayed until the transport server verifies that all of the next hops for that message have completed delivery. If any of the next hops fail before reporting back successful delivery, the message is resubmitted for delivery to that next hop. For more information about shadow redundancy, see Understanding Shadow Redundancy.
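The delayed-deletion idea behind shadow redundancy can be sketched in a few lines of Python. This is a minimal model under stated assumptions: the class, the method names, and the hop names are hypothetical, and the real feature tracks delivery status through the transport infrastructure rather than through direct method calls.

```python
class ShadowQueue:
    """Toy model: keep a relayed message until all of its next hops confirm delivery."""

    def __init__(self):
        self.shadow = {}  # message id -> next hops that have not yet confirmed

    def relay(self, message_id, next_hops):
        # The message is handed to the next hops but is not deleted locally.
        self.shadow[message_id] = set(next_hops)

    def delivery_confirmed(self, message_id, hop):
        hops = self.shadow.get(message_id)
        if hops is None:
            return
        hops.discard(hop)
        if not hops:
            del self.shadow[message_id]  # every hop confirmed: deletion is now safe

    def hop_failed(self, hop):
        """Return the messages that must be resubmitted because `hop` failed."""
        return sorted(m for m, hops in self.shadow.items() if hop in hops)


shadow_queue = ShadowQueue()
shadow_queue.relay("m1", ["HUB2", "HUB3"])
shadow_queue.relay("m2", ["HUB2"])
shadow_queue.delivery_confirmed("m1", "HUB2")
shadow_queue.delivery_confirmed("m2", "HUB2")
print(shadow_queue.hop_failed("HUB3"))  # ['m1']
```

Message m2 was confirmed by its only next hop and has been discarded, while m1 is still held and is resubmitted when HUB3 fails.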

Online Move Mailbox
Exchange 2010 includes a new feature that enables you to move mailboxes asynchronously. In Exchange 2007, when you used the Move-Mailbox cmdlet to move a mailbox, the cmdlet logged into both the source database and the target database and moved the content from one mailbox to the other mailbox. There were several disadvantages to having the cmdlets perform the move operation:
-
Mailbox moves typically took hours to complete, and during the move, the user was not able to access their mailbox.
-
If the command window used to run Move-Mailbox was closed, the move was terminated and had to be restarted from the beginning.
-
The computer used to perform the move participated in the data transfer. If an administrator ran the cmdlets from their workstation, the mailbox data would flow from the source server to the administrator's workstation and then to the target server.
The new Move Request cmdlets in Exchange 2010 can be used to perform asynchronous moves. Unlike in Exchange 2007, the cmdlets do not perform the actual move. The move is performed by the Microsoft Exchange Mailbox Replication Service (MRS), a new service that runs on Client Access servers. The New-MoveRequest cmdlet sends requests to the Mailbox Replication Service. For more information about online move mailbox, see Understanding Move Requests.
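The difference between a cmdlet that performs the move itself and one that only submits a request can be shown with a generic request queue. The Python sketch below is an analogy, not Exchange code: the function names, the mailbox and database names, and the in-process queue are assumptions that stand in for the cmdlet and the service.

```python
import queue
import threading

move_requests = queue.Queue()
completed = []


def new_move_request(mailbox, target_database):
    """Stand-in for the cmdlet: queue the request and return immediately."""
    move_requests.put((mailbox, target_database))
    return "Queued"


def replication_service_worker():
    """Stand-in for the service: drain the queue and perform the moves."""
    while True:
        item = move_requests.get()
        if item is None:  # sentinel value: shut the worker down
            move_requests.task_done()
            break
        mailbox, target = item
        completed.append(f"{mailbox} -> {target}")  # the data transfer would happen here
        move_requests.task_done()


worker = threading.Thread(target=replication_service_worker, daemon=True)
worker.start()

status = new_move_request("user1", "DB2")  # returns before the move has finished
move_requests.put(None)                    # ask the worker to stop after pending moves
move_requests.join()                       # wait only so the result can be printed
worker.join()
print(status, completed)
```

Because the caller only queues a request, closing the calling session does not interrupt the move, and the mailbox data never flows through the computer that submitted the request.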

Using Replication for your Backups
There are several changes to the core architecture of Exchange 2010 that have a direct effect on how you will protect your mailbox databases and the mailboxes they contain.
One significant change is the removal of storage groups. In Exchange 2010, each database is associated with a single log stream, represented by a series of 1 megabyte (MB) log files. Each server can host a maximum of 100 databases.
Another significant change for Exchange 2010 is that databases are no longer closely tied to a specific Mailbox server. Database mobility expands the system's use of continuous replication by replicating a database to multiple, different servers. This provides better protection of the database and increased availability. In the case of failures, the other servers that have copies of the database can mount the database.
The ability to have multiple copies of a database hosted on multiple servers means that if you have a sufficient number of database copies, you can use these copies as your backups. For more information on this strategy, see Understanding Backup, Restore and Disaster Recovery.

Incremental Resync
Exchange 2007 introduced the concepts of lost log resilience (LLR) and incremental reseed. LLR is an internal component of ESE that enables you to recover and mount Exchange mailbox databases even if one or more of the most recently generated transaction log files have been lost or damaged. LLR works by delaying writes to the database until the specified number of log generations have been created, so recent updates reach the database file only after a short delay. The length of that delay depends on how quickly logs are being generated.
Note: LLR is hard-coded to 1 log file for all Exchange 2010 mailbox databases.
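The delayed-write idea behind LLR can be illustrated with a simplified Python model. This is a conceptual sketch only: the class and method names are hypothetical, each log generation is reduced to a single change, and the actual ESE implementation is considerably more involved.

```python
from collections import deque


class LlrDatabase:
    """Toy model: hold back database writes until newer log generations exist."""

    def __init__(self, llr_depth=1):
        self.llr_depth = llr_depth
        self.held = deque()          # (generation, change) pairs not yet written
        self.database = []           # changes already written to the database file
        self.current_generation = 0

    def close_log(self, change):
        """A new log generation containing `change` is created."""
        self.current_generation += 1
        self.held.append((self.current_generation, change))
        # Write only the changes that are at least `llr_depth` generations old.
        while self.held and self.held[0][0] <= self.current_generation - self.llr_depth:
            self.database.append(self.held.popleft()[1])

    def newest_generation_in_database(self):
        """Highest log generation whose changes are already in the database file."""
        return self.current_generation - len(self.held)


db = LlrDatabase(llr_depth=1)
for change in ("a", "b", "c"):
    db.close_log(change)
print(db.database, db.newest_generation_in_database())  # ['a', 'b'] 2
```

With a depth of 1, the changes from the newest generation (3) have not yet been written to the database file, so in this model the database remains consistent even if that last log file is lost.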
Incremental reseed provided the ability to correct divergences in the transaction log stream between a source and target storage group, by relying on the delayed replay capabilities of LLR. Incremental reseed did not provide a means to correct divergences in the passive copy of a database, once divergent logs had been replayed, which forced the need for a complete reseed.
In Exchange 2010, incremental resync is the new name for the feature that automatically corrects divergences in database copies under the following conditions:
-
After an automatic failover for all of the configured copies of a database
-
When a new copy is enabled and some database and log files already exist at the copy location
-
When replication is resumed following a suspension or restarting of the Microsoft Exchange Replication service.

Page Patching
The ability to correct additional causes of divergence is provided by a new ESE mechanism known as page patching. When divergence between an active database and a copy of that database is detected, incremental resync performs the following tasks:
-
It searches historically in the log file stream to locate the point of divergence
-
It locates the changed database pages on the diverged copy
-
It reads the changed pages from the active copy and then copies the necessary log files from the active copy
-
It applies the database page changes to the diverged copy
-
It runs recovery on the diverged copy and replays the necessary log files into the database copy
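The first two tasks above (locating the point of divergence and identifying the affected pages) can be sketched in Python. This is a simplified illustration under stated assumptions: log files are represented as (generation, checksum) pairs, the set of pages touched by each log is supplied as a plain dictionary, and the function names are hypothetical. The actual ESE mechanism is not described at this level of detail in this topic.

```python
def find_divergence(active_logs, passive_logs):
    """Return the earliest log generation at which the two streams differ.

    Each stream is a list of (generation, checksum) tuples. The search walks
    backwards from the newest shared generation and stops at the first match,
    assuming the streams agree for everything older. Returns None when the
    shared generations are identical.
    """
    active = dict(active_logs)
    passive = dict(passive_logs)
    shared = sorted(set(active) & set(passive), reverse=True)
    diverged_at = None
    for generation in shared:
        if active[generation] == passive[generation]:
            break  # the streams agree from this generation backwards
        diverged_at = generation
    return diverged_at


def pages_to_patch(passive_log_pages, diverged_at):
    """Return the pages changed on the diverged copy by its divergent logs."""
    if diverged_at is None:
        return []
    pages = set()
    for generation, touched in passive_log_pages.items():
        if generation >= diverged_at:
            pages.update(touched)
    return sorted(pages)


active_stream = [(1, "a1"), (2, "b2"), (3, "c3"), (4, "d4")]
passive_stream = [(1, "a1"), (2, "b2"), (3, "x3"), (4, "y4")]
touched_on_passive = {2: [10], 3: [11, 12], 4: [12, 40]}

point = find_divergence(active_stream, passive_stream)
print(point, pages_to_patch(touched_on_passive, point))  # 3 [11, 12, 40]
```

In this example only three pages need to be read from the active copy, which is the reason this approach is cheaper than reseeding the whole database.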

Third Party Replication API
Exchange 2010 also includes a new third-party replication API that enables organizations to use third-party synchronous replication solutions instead of the built-in continuous replication feature. For information about partner products for Exchange 2010, see http://www.microsoft.com/exchange/2010/partners/default.mspx. If you are a partner seeking information on the third-party API, please contact your Microsoft representative.

Deprecated Features from Exchange Server 2007
The following features in Exchange 2007 and Exchange 2007 SP1 no longer exist in Exchange 2010:
-
Cluster continuous replication (CCR)
-
Standby continuous replication (SCR)
-
Local continuous replication (LCR)
-
Single copy clusters (SCC)
-
Clustered mailbox servers
-
Storage groups
-
Recovery Storage Group