Local Continuous Replication

Microsoft Exchange Server 2007 will reach end of support on April 11, 2017. To stay supported, you will need to upgrade. For more information, see Resources to help you upgrade your Office 2007 servers and clients.

 

Applies to: Exchange Server 2007, Exchange Server 2007 SP1, Exchange Server 2007 SP2, Exchange Server 2007 SP3

Local continuous replication (LCR) is a single-server solution that uses built-in asynchronous log shipping and log replay technology to create and maintain a copy of a storage group on a second set of disks that are connected to the same server as the production storage group. The production storage group is referred to as the active copy, and the copy of the storage group maintained on the separate set of disks is referred to as the passive copy. The following figure illustrates a basic deployment of LCR.

Figure: Basic deployment of LCR (basic architecture of local continuous replication)
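The relationship between log generation, log shipping, and log replay can be illustrated with a small conceptual model. The sketch below is not Exchange code: the class name, method names, and counters are hypothetical, and it assumes only that logs are numbered sequentially, shipped asynchronously, and replayed in order on the passive copy.

```python
from dataclasses import dataclass


@dataclass
class StorageGroupPair:
    """Toy model of an active/passive pair; counters are log generation numbers."""

    last_generated: int = 0  # newest log closed by the active copy
    last_copied: int = 0     # newest log shipped to the passive copy's disks
    last_replayed: int = 0   # newest log replayed into the passive database

    def generate(self, count: int = 1) -> None:
        """The active copy closes `count` new log files."""
        self.last_generated += count

    def ship(self, count: int = 1) -> None:
        """Copy up to `count` closed logs to the passive copy (asynchronous)."""
        self.last_copied = min(self.last_generated, self.last_copied + count)

    def replay(self, count: int = 1) -> None:
        """Replay up to `count` already-copied logs into the passive database."""
        self.last_replayed = min(self.last_copied, self.last_replayed + count)

    @property
    def copy_queue_length(self) -> int:
        """Logs generated but not yet shipped."""
        return self.last_generated - self.last_copied

    @property
    def replay_queue_length(self) -> int:
        """Logs shipped but not yet replayed."""
        return self.last_copied - self.last_replayed


pair = StorageGroupPair()
pair.generate(5)  # active copy closes five logs
pair.ship(3)      # three of them reach the passive disks
pair.replay(2)    # two of those are replayed
print(pair.copy_queue_length, pair.replay_queue_length)
```

In this model, the logs counted in the copy queue are the ones that exist only on the active copy, which is why they represent the data at risk if the active copy's disks are lost.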

LCR provides log shipping, log replay, and a quick manual switch (referred to as activation) to a secondary copy of the data. LCR is designed to reduce the total cost of ownership for Microsoft Exchange Server 2007 by:

  • Reducing the recovery time for data-level disasters by enabling a quick switch to a second online copy of the data.

  • Reducing the number of regular full backups that are required for data protection. Data backups are critical to have when a disaster strikes. Although LCR does not eliminate the need to take backups, it does significantly reduce the need to take regular, daily full backups.

  • Enabling you to offload Volume Shadow Copy Service (VSS) backups from the active copy of a storage group to the passive copy of the storage group. All four VSS backup types (full, copy, incremental, and differential) can be taken from the passive copy. Offloading the backups from the active copy to the passive copy preserves valuable disk input/output (I/O) on the active copy's logical unit numbers (LUNs).

LCR enables the configuration, operation, verification, removal, and activation of a storage group copy. When necessary, a passive copy can be activated as a production database, and then mounted and made available to clients. Typically, you can do this task as a configuration change either by changing the active storage group and database paths or by a lower-level operating system action (for example, changing the mount points associated with the log or database volumes).

LCR does not have any special storage requirements. Any type of storage that is supported by Windows Server 2003 or Windows Server 2008 can be used with LCR, including direct attached storage, Serial Attached SCSI (SAS), and Internet SCSI (iSCSI). For a list of certified storage solutions, see the Windows Server Catalog of Tested Products.

LCR is an excellent option for customers who need fast recovery from mailbox data failure or corruption but can tolerate server outages, whether scheduled or unscheduled. LCR provides:

  • Rapid, two-step recovery from corruption or failure of a production database.

  • Protection for the users that need it most.

  • Minimal impact to production database and log disk I/O.

  • The ability to offload VSS backups to the passive copy of the database and logs.

  • The ability to reduce the total amount of data moved to backup media, while extending the backup window.

  • Administration available via the Exchange Management Console or the Exchange Management Shell.

Enhancements to LCR in Exchange 2007 SP1

Microsoft Exchange Server 2007 Service Pack 1 (SP1) includes several enhancements for LCR, including use of the transport dumpster, added Exchange Management Console user interface elements, improved status and monitoring, and improved performance.

Transport Dumpster Enabled for LCR

The transport dumpster feature of the Hub Transport server role has been extended in Exchange 2007 SP1 to support LCR. In the release to manufacturing (RTM) version of Microsoft Exchange Server 2007, the transport dumpster was available only for cluster continuous replication (CCR) environments. In a CCR environment, the request for transport dumpster redelivery is an automatic part of the recovery process. In an LCR environment, the process is manual. The Restore-StorageGroupCopy cmdlet has been updated in Exchange 2007 SP1 to include the transport dumpster resubmission request. Thus, when an administrator activates a passive copy of a storage group in an LCR environment using the Restore-StorageGroupCopy cmdlet, the transport dumpster resubmission request occurs as part of the activation process.

The transport dumpster takes advantage of the redundancy in the environment to reclaim some of the data affected by the failover. Specifically, Hub Transport servers maintain a queue of recently delivered mail. This queue is bounded by both the amount of time mail is kept and the total space used. New functionality has been added to the Restore-StorageGroupCopy task so that when an administrator uses that task to activate the passive copy of a storage group, the Microsoft Exchange Replication service requests redelivery of messages in the transport dumpster from each Hub Transport server in the Mailbox server's site. The information store automatically deletes the duplicates and redelivers mail that was lost.
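The behavior described above, a queue of delivered mail bounded by age and total size, followed by redelivery with duplicate removal, can be sketched conceptually as follows. This is an illustrative model only: the class, function, and field names are hypothetical, and the real Hub Transport and information store implementations are not described in this topic.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    message_id: str
    size: int            # bytes
    delivered_at: float  # seconds on an arbitrary clock


class DumpsterQueue:
    """Keeps recently delivered mail, bounded by total size and by age."""

    def __init__(self, max_bytes: int, max_age_seconds: float) -> None:
        self.max_bytes = max_bytes
        self.max_age_seconds = max_age_seconds
        self._items = deque()
        self._bytes = 0

    def record_delivery(self, msg: Message, now: float) -> None:
        self._items.append(msg)
        self._bytes += msg.size
        self._trim(now)

    def _trim(self, now: float) -> None:
        # Drop the oldest messages while either bound is exceeded.
        while self._items and (
            self._bytes > self.max_bytes
            or now - self._items[0].delivered_at > self.max_age_seconds
        ):
            self._bytes -= self._items.popleft().size

    def redeliver(self, now: float) -> list:
        """Return every message still retained, oldest first."""
        self._trim(now)
        return list(self._items)


def apply_redelivery(mailbox_ids: set, redelivered: list) -> list:
    """Add only the messages the mailbox lacks; duplicates are discarded."""
    restored = []
    for msg in redelivered:
        if msg.message_id not in mailbox_ids:
            mailbox_ids.add(msg.message_id)
            restored.append(msg.message_id)
    return restored


dumpster = DumpsterQueue(max_bytes=250, max_age_seconds=3600)
for i, name in enumerate(["a", "b", "c"]):
    dumpster.record_delivery(Message(name, 100, delivered_at=10.0 * i), now=10.0 * i)

# Assume the activated passive copy holds "a" and "b" but never received "c".
mailbox = {"a", "b"}
restored = apply_redelivery(mailbox, dumpster.redeliver(now=30.0))
print(restored)
```

In this run, message "a" has already been pushed out of the queue by the size bound, "b" is redelivered and discarded as a duplicate, and only "c" is restored.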

In Exchange 2007 SP1, the necessary condition for an e-mail message to be retained in the transport dumpster is that it has at least one recipient whose mailbox is on a clustered mailbox server in a CCR environment, or on a stand-alone server in a storage group that has been configured for LCR.

Situations in which data loss is not mitigated by the transport dumpster include:

  • Drafts folder for any Microsoft Outlook clients in Online Mode.

  • Appointments, contact updates, property updates, tasks, and task updates.

  • Outgoing mail that is in transit from the client to the Hub Transport server. There is a period of time during which the e-mail message only exists on the sender's Mailbox server.

For detailed steps about how to configure the transport dumpster settings, see How to Configure the Transport Dumpster for Local Continuous Replication.

Exchange Management Console Enhancements

Several new user interface elements have been added in Exchange 2007 SP1 that enhance the management experience for high availability features, including LCR. These improvements include:

  • Transport dumpster user interface   A new Global Settings tab has been added to the Hub Transport node under the Organization Configuration work area. This tab includes a Transport Settings Properties page that can be used to configure the transport dumpster settings for the organization:

    • Maximum size per storage group (MB)   Specifies the maximum size of the transport dumpster queue for each storage group.

    • Maximum retention time (days)   Specifies how long an e-mail message should remain in the transport dumpster queue.

  • Manage continuous replication   Additional user interface controls have been added to the Exchange Management Console that enable an administrator to suspend, resume, update, and restore continuous replication. These controls are the equivalent of using the following Exchange Management Shell cmdlets:

    • Suspend-StorageGroupCopy

    • Resume-StorageGroupCopy

    • Update-StorageGroupCopy

    • Restore-StorageGroupCopy

    You can use these cmdlets and the corresponding Exchange Management Console tasks to manage continuous replication in both LCR and CCR environments.

Status and Monitoring Enhancements

Exchange 2007 SP1 also introduces several changes that are designed to enhance the manageability of Exchange 2007. These changes improve upon the cluster reporting features in Exchange 2007 RTM and include additional functionality designed for proactive monitoring of continuous replication environments. Specifically, the changes and enhancements correct known deficiencies with the Get-StorageGroupCopyStatus cmdlet, introduce a new cmdlet called Test-ReplicationHealth, and provide greater visibility into the loss window covered by the transport dumpster.

Improvements to the Get-StorageGroupCopyStatus Cmdlet

In Exchange 2007 RTM, there are several conditions where the status reported by Get-StorageGroupCopyStatus and the continuous replication performance counters are inaccurate or misleading:

  • A storage group that is not active (that is, not changing) can report its status as healthy when it might not be healthy. This situation occurs because the unhealthy condition is not detected until a log is replayed.

  • During replication initialization, the replication status is being re-evaluated and may not be accurate. When initialization completes, the status is updated.

  • The value of the LastLogGenerated field can be wrong when the database in the storage group is dismounted.

  • When there are one or more missing logs in the middle of a log stream, the passive copy continues to try to recover, causing the replication status to switch between failed and healthy states. When this happens, the replay and copy queues continue to grow.

  • Under rare conditions, a log can be successfully verified but still fail to replay. In this situation, the system will alternate between failed and healthy states during its attempts to recover. When this happens, the replay and copy queues continue to grow.
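The missing-log condition in the list above can be illustrated with a short sketch. It assumes, for illustration only, that log generations are represented as consecutive integers and that replay must proceed strictly in order. The function names are hypothetical and do not correspond to any Exchange API.

```python
def find_missing_generations(generations):
    """Return the generation numbers absent from the middle of a log stream."""
    seen = set(generations)
    if not seen:
        return []
    return [g for g in range(min(seen), max(seen) + 1) if g not in seen]


def replayable_logs(generations, last_replayed):
    """Return the logs that can be replayed in order before a gap is reached."""
    seen = set(generations)
    next_generation = last_replayed + 1
    ready = []
    while next_generation in seen:
        ready.append(next_generation)
        next_generation += 1
    return ready


# Logs present on the passive copy's disks; generation 4 never arrived.
on_disk = [1, 2, 3, 5, 6]
print(find_missing_generations(on_disk))
print(replayable_logs(on_disk, last_replayed=1))
```

In this example, replay can advance through generations 2 and 3 but cannot pass the gap at generation 4, so every later log accumulates behind it. This is consistent with the growing queues described above.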

The Get-StorageGroupCopyStatus cmdlet has also been enhanced with the addition of new status information:

  • The Get-StorageGroupCopyStatus cmdlet reports a SummaryCopyStatus of ServiceDown when the Microsoft Exchange Replication service on the target computer is not network accessible.

  • The Get-StorageGroupCopyStatus cmdlet reports a SummaryCopyStatus of Initializing when the Microsoft Exchange Replication service on the target computer has not completed its initial startup checks. A new performance counter has also been created to represent this status as a Boolean.

  • The Get-StorageGroupCopyStatus cmdlet reports a SummaryCopyStatus of Synchronizing when the storage group copy has not completed an incremental reseed.

The new states for the SummaryCopyStatus value are visible only when you use the Exchange 2007 SP1 version of the Exchange management tools. When you use the Exchange 2007 RTM version of the Exchange management tools, the status for any of the preceding states will be reported as failed.
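The version-dependent reporting described above amounts to a simple mapping, sketched below. The enumeration lists only the states named in this topic and is not a complete list of SummaryCopyStatus values. The function name and the "RTM" and "SP1" version strings are hypothetical.

```python
from enum import Enum


class SummaryCopyStatus(Enum):
    # Only the states mentioned in this topic; other values are omitted.
    HEALTHY = "Healthy"
    FAILED = "Failed"
    SERVICE_DOWN = "ServiceDown"
    INITIALIZING = "Initializing"
    SYNCHRONIZING = "Synchronizing"


# States that only the SP1 management tools can display.
SP1_ONLY_STATES = {
    SummaryCopyStatus.SERVICE_DOWN,
    SummaryCopyStatus.INITIALIZING,
    SummaryCopyStatus.SYNCHRONIZING,
}


def displayed_status(status: SummaryCopyStatus, tools_version: str) -> str:
    """Return the status string an administrator would see for a given tools version."""
    if tools_version == "RTM" and status in SP1_ONLY_STATES:
        return SummaryCopyStatus.FAILED.value
    return status.value


for version in ("RTM", "SP1"):
    print(version, displayed_status(SummaryCopyStatus.SYNCHRONIZING, version))
```

One practical consequence is that a Failed status seen through the RTM tools may correspond to a transient state, such as Initializing, when viewed through the SP1 tools.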

Test-ReplicationHealth Cmdlet

Exchange 2007 SP1 introduces a new cmdlet called Test-ReplicationHealth. This cmdlet is designed for proactive monitoring of continuous replication and the continuous replication pipeline. The Test-ReplicationHealth cmdlet checks all aspects of replication, Cluster services, and storage group replication and replay status to provide a complete overview of the replication system. Specifically, when run on a node in the cluster, the Test-ReplicationHealth cmdlet performs the tests described in the following table.

Tests performed by the Test-ReplicationHealth cmdlet

Test   Description

Cluster network status   Verifies that all cluster-managed networks found on the local node are running. This test is run only in a CCR environment.

Quorum group state   Verifies that the cluster group containing the quorum resource is healthy. This test is run only in a CCR environment.

File share quorum state   Verifies that the value of the FileSharePath used by the Majority Node Set quorum with file share witness is reachable. This test is run only in a CCR environment.

Clustered mailbox server group state   Verifies that the clustered mailbox server is healthy by confirming that all resources in the group are online. This test is run only in a CCR environment.

Node state   Verifies that neither of the nodes in the cluster is in a paused state. This test is run only in a CCR environment.

DNS registration status   Verifies that all cluster-managed network interfaces that have Require DNS registration to succeed set have passed Domain Name System (DNS) registration. This test is run only in a CCR environment.

Replication service status   Verifies that the Microsoft Exchange Replication service on the local computer is healthy.

Storage group copy suspended   Checks to see if continuous replication has been suspended for any storage groups enabled for continuous replication.

Storage group copy failed   Checks to see if any storage group copies are in a Failed state.

Storage group replication queue length   Checks to see if any storage group has a replication copy queue length greater than best practice thresholds. Currently, these thresholds are:

  • Warning   Queue length is 3–5 logs

  • Failure   Queue length is 6 or more logs

Databases dismounted after failover   Checks to see if any databases are dismounted or failed after a failover has occurred. This test only checks for databases that have failed as a result of a failover.
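The copy queue length thresholds listed in the table can be expressed as a small classification function. This is a minimal sketch rather than the cmdlet's actual logic: the function name and the "Passed" label for queue lengths below the warning threshold are assumptions.

```python
def classify_copy_queue_length(queue_length: int) -> str:
    """Classify a replication copy queue length against the documented thresholds."""
    if queue_length < 0:
        raise ValueError("queue length cannot be negative")
    if queue_length >= 6:
        return "Failure"  # 6 or more logs
    if queue_length >= 3:
        return "Warning"  # 3 to 5 logs
    return "Passed"       # assumed label for 0 to 2 logs


results = {n: classify_copy_queue_length(n) for n in (0, 2, 3, 5, 6, 40)}
print(results)
```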

Performance Enhancements

Several performance improvements have been made in Exchange 2007 SP1 that benefit high availability deployments. These improvements include I/O reductions on the disks containing passive copies of storage groups in continuous replication environments. In Exchange 2007 SP1, the design of the continuous replication architecture has been modified so that the database cache is now persisted for the storage group copy in between instances of log replay activity. The persistence of the database cache between instances of log replay activity enables the Microsoft Exchange Replication service to make use of the database caching features of the Extensible Storage Engine (ESE), which in turn reduces the amount of disk I/O that occurs on the passive copy's LUNs. By contrast, in Exchange 2007 RTM, a new database cache was created for each batch of log replay activity, which in some cases made the disk I/O activity on the passive LUNs as much as two to three times the disk I/O on the active LUNs.
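The effect of persisting the database cache between log replay batches can be illustrated with a deliberately simplified model. The sketch assumes an unbounded cache and treats each first touch of an uncached page as one disk read. It does not model ESE itself, and the page numbers and function name are hypothetical.

```python
def count_disk_reads(batches, persist_cache):
    """Count page reads from disk across replay batches.

    batches: list of lists of page numbers touched by each replay batch.
    persist_cache: if False, the cache is discarded before every batch.
    """
    cache = set()
    reads = 0
    for batch in batches:
        if not persist_cache:
            cache = set()  # a new, empty cache for each batch
        for page in batch:
            if page not in cache:
                reads += 1  # page must be fetched from disk
                cache.add(page)
    return reads


# Three replay batches that repeatedly touch overlapping pages.
replay_batches = [[1, 2, 3], [2, 3, 4], [1, 4, 5]]
fresh_cache_reads = count_disk_reads(replay_batches, persist_cache=False)
persisted_cache_reads = count_disk_reads(replay_batches, persist_cache=True)
print(fresh_cache_reads, persisted_cache_reads)
```

The ratio produced by this toy input is illustrative only. The actual reduction depends on how much the pages touched by successive replay batches overlap and on the size of the real cache.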

Using Standby Continuous Replication with LCR

Standby continuous replication (SCR) is a new feature introduced in Exchange 2007 SP1. SCR extends the existing continuous replication features and enables new data availability scenarios for Exchange 2007 Mailbox servers. SCR uses the same log shipping and replay technology used by LCR and CCR to provide added deployment options and configurations.

SCR enables you to use continuous replication to replicate Mailbox server data from a stand-alone Mailbox server (with or without LCR), or from a clustered mailbox server in a single copy cluster (SCC) or in a CCR environment.

The process for activating copies of Mailbox server data that are created and maintained by SCR is manual and is designed to be used when a significant failure occurs. The process is not meant to be used for simple server outages that are recoverable by a restart or some other quick means. You can activate an SCR target using database portability, the server recovery option (Setup /m:RecoverServer), or, if the Mailbox server is clustered, the clustered mailbox server recovery option (Setup /RecoverCMS). The option you choose will be based on your configuration and the type of failure that occurs.

For more information about SCR, see Standby Continuous Replication.

For More Information

The following topics discuss when and how to use LCR as part of a high availability and data resiliency plan: