Planning for Local Continuous Replication

Microsoft Exchange Server 2007 will reach end of support on April 11, 2017. To stay supported, you will need to upgrade. For more information, see Resources to help you upgrade your Office 2007 servers and clients.

 

Applies to: Exchange Server 2007, Exchange Server 2007 SP1, Exchange Server 2007 SP2, Exchange Server 2007 SP3

Planning for local continuous replication (LCR) involves designing a storage group and database topology, and ensuring adequate storage solution support and adequate monitoring of LCR.

Storage Requirements and Recommendations for LCR

LCR includes some storage requirements and recommendations. When designing your LCR storage solution, account for the additional input/output (I/O) load that LCR generates: the active copy writes log updates, and the passive copy performs a similar volume of log reads. We recommend that you design the storage so that the passive copy has capacity and performance capability similar to those of the active copy. When using LCR, we recommend that you follow these best practices:

  • Use a single database per storage group   When a storage group has been enabled for LCR, it can only contain a single database. In addition, if an existing storage group has multiple databases, you cannot enable LCR for that storage group until you have removed all but one database. This approach creates a more manageable Microsoft Exchange storage topology that increases recoverability.

  • Use volume mount points   You can use drive letters or volume mount points for your Exchange data logical unit numbers (LUNs) or disks to designate where database files and transaction log files are stored. We recommend that you use the NTFS file system volume mount points feature to surpass the 26-drive-letter limitation that exists per Exchange server. By using volume mount points, you can graft, or mount, a target partition into a folder on another physical disk. Volume mount points are transparent to programs, including Exchange Server. Using a volume mount point simplifies the recovery process when corruption is detected in the production transaction logs or database files by allowing you to quickly change drive letter assignments and paths. For more information about recovering from corruption in production transaction log or database files, see Managing Local Continuous Replication.

  • Partition data for performance and recovery   In general, partitioning your data across multiple hard disks can increase performance and reduce the amount of data that you need to recover. Depending on the type of failure, placing databases and transaction log files on separate disks can minimize data loss significantly. For example, if you keep your Exchange databases and transaction log files on the same physical hard disk and that disk fails, you can recover only the data that existed at your last backup. By contrast, if you place your log files and database files on separate disks and the disk containing the database files fails, you can recover your data from the log files present on the separate disk. To optimize performance, increase fault tolerance, and make troubleshooting easier, you should partition your data so that the following files are located on separate disks:

    • Microsoft Windows operating system files

    • Exchange Server application files

    • Exchange database files on the active copy

    • Exchange transaction log files on the active copy

    • Exchange database files on the passive copy

    • Exchange transaction log files on the passive copy

    In addition, you should place the passive copies of your LCR-enabled storage groups on disks that are isolated from the active copies of your LCR-enabled storage groups. Furthermore, you should make sure that the disks containing the LCR files have the same performance capabilities as the disks that contain the production storage group. This equivalence allows the LCR copy to support the load in the case of a failover.

  • Ensure sufficient disk space   The disks containing the LCR files should be sized comparably to the production volumes. The storage used by the passive copies should be equivalent to the storage used for the active copies. In addition, both storage solutions need to include enough space to accommodate the size of the existing database, plus any anticipated database growth.

  • Ensure sufficient bandwidth and low latency when using iSCSI storage with LCR   Although it is not recommended, configuring LCR with Internet SCSI (iSCSI) storage that is connected to the Mailbox server over a local area network (LAN) or wide area network (WAN) link is supported. In this configuration, both log shipping and log replay activity occur over the same storage network. This configuration is discouraged primarily because of the network traffic generated by log shipping. For LCR to provide the expected level of protection, log shipping must stay up to date, and the network traffic associated with log shipping must not consume so much bandwidth that it interferes with the network traffic associated with log replay activity. There is no method to prioritize replication traffic. In addition, consider the following storage requirements:

    • In the release to manufacturing (RTM) version of Microsoft Exchange Server 2007, the storage for the passive copies of the databases must provide two to three times the I/O operations per second (IOPS) of the storage used for the active copies of the databases.

    • In Microsoft Exchange Server 2007 Service Pack 1 (SP1), the storage for the passive copies can provide IOPS equivalent to that of the storage for the active copies.
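The IOPS guidance above can be turned into a quick planning calculation. The following is a minimal sketch in Python, not an Exchange tool: the function name and the version labels "RTM" and "SP1" are hypothetical, and the multipliers are taken directly from the two requirements listed above.

```python
from typing import Tuple

# Multipliers from the guidance above (illustrative planning aid only).
PASSIVE_IOPS_MULTIPLIERS = {
    "RTM": (2.0, 3.0),  # RTM: two to three times the active-copy IOPS
    "SP1": (1.0, 1.0),  # SP1: equivalent to the active-copy IOPS
}


def passive_iops_range(active_iops: float, version: str) -> Tuple[float, float]:
    """Return the (low, high) IOPS target for passive-copy storage."""
    if version not in PASSIVE_IOPS_MULTIPLIERS:
        raise ValueError(f"unknown version label: {version!r}")
    low, high = PASSIVE_IOPS_MULTIPLIERS[version]
    return (active_iops * low, active_iops * high)


if __name__ == "__main__":
    # Example input: active-copy storage measured at 500 IOPS.
    for label in ("RTM", "SP1"):
        print(label, passive_iops_range(500, label))
```

The active-copy IOPS figure is an input that you would measure or estimate yourself, for example during storage validation.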

Note

You can use the Microsoft Exchange Server Jetstress tool to validate your storage solution prior to putting it into production. We recommend that you first validate the storage being used for the active copies of the databases, and then validate the storage used for the passive copies of the databases. For more information about Jetstress, see "Storage-Related Tools" in Storage Validation.

Processor and Memory Recommendations for LCR

For a Mailbox server enabled for LCR that has all of the Microsoft Exchange Server 2007 Mailbox server role services as well as the Microsoft Exchange Replication service running on the same server, additional hardware resources need to be available to handle the additional load. The majority of the additional resource consumption comes from log file verification and log file replay on the LCR-enabled Mailbox server. This additional processing cost is approximately 20 percent (beyond the processor guidance listed in Planning Processor Configurations) and should be considered when sizing LCR Mailbox servers. Additionally, the Microsoft Exchange Replication service works well with the memory resources already provided on an LCR server. However, to make sure that the Extensible Storage Engine (ESE) database cache maintains optimal efficiency under LCR, we recommend that you provision an additional 1 gigabyte (GB) of physical RAM to Exchange Mailbox servers and servers with multiple roles (beyond the memory guidance listed in Planning Memory Configurations).
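As a rough illustration of this sizing guidance, the sketch below applies the approximately 20 percent processing overhead and the additional 1 GB of RAM to a baseline. It is a planning aid under stated assumptions: the function name and the idea of expressing processor capacity as a single number (for example, megacycles or core count) are hypothetical, and the baseline values would come from the processor and memory guidance referenced above.

```python
LCR_CPU_OVERHEAD = 0.20  # approximately 20 percent extra processing cost
LCR_EXTRA_RAM_GB = 1     # additional physical RAM recommended for LCR


def lcr_sizing(baseline_cpu: float, baseline_ram_gb: float) -> dict:
    """Adjust a non-LCR baseline for the extra load of LCR.

    baseline_cpu is any linear measure of processor capacity;
    baseline_ram_gb is the RAM from the standard memory guidance.
    """
    return {
        "cpu": baseline_cpu * (1 + LCR_CPU_OVERHEAD),
        "ram_gb": baseline_ram_gb + LCR_EXTRA_RAM_GB,
    }


if __name__ == "__main__":
    # Example input: a baseline of 10 processor units and 8 GB of RAM.
    print(lcr_sizing(10, 8))
```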

Database Size Recommendations for LCR

LCR provides much more flexibility to recover from catastrophic data loss. The first line of defense for catastrophic storage failure or physical database corruption with LCR is to activate the passive copy of the data, and not restore anything from backup. Remember that LCR offers fast data recovery, but it is not a backup solution. LCR makes it much less important to have short Recovery Time Objectives (RTOs) based on restoring from archive or tape. Instead of restoring from tape, you activate the passive copy and the data is available to clients in minutes as opposed to hours. In this sense, LCR can be considered a fast recovery mechanism, putting it in the same category as hardware-based clones created using the Volume Shadow Copy Service (VSS) in Exchange Server 2003.

It is not uncommon for an administrator to have to perform offline database operations, such as repairs, because of bad backups (for example, a tape is bad or a restore fails). Although the percentage of situations in which repair is necessary should decrease dramatically, there will still be times when it is necessary. Be sure to consider your tolerance for worst-case downtime when deciding on database size.

LCR allows you to make a backup from the passive copy of a storage group, which allows you to extend your online maintenance window on the active copy. In many cases, you can double the online maintenance window, which in turn allows you to have larger mailboxes and databases.

At this point, it might appear as though LCR enables you to grow your databases as large as you like without risk; however, that is not the case. Online maintenance that completes in a reasonable amount of time per database is still a limiting factor on database size. With LCR, the possibility of needing to reseed databases is also a limiting factor. LCR provides database redundancy, so if the active copy of a database is lost or corrupted, recovery can be accomplished quickly by manually activating the passive copy of the database.

After activation occurs, there remains only one copy of the database, which is the new active copy. Because the passive copy no longer exists, database resiliency may be compromised. However, you should still have your backup. To restore resiliency, the lost or corrupted database needs to be removed, and a new passive copy of the database needs to be created and reseeded from the active copy. Depending upon the size of your database, these tasks could take a long time. The worst-case scenario is the loss or corruption of all active copies, where all passive copies have to be reseeded.

A larger maximum database size is possible when continuous replication is used. We recommend the following maximum database sizes for Exchange 2007:

  • Databases hosted on a Mailbox server without LCR: 100 GB

  • Databases hosted on a Mailbox server with LCR: 200 GB

    Important

    The true maximum size for your databases should be dictated by the service level agreement (SLA) in place at your organization. Determining the largest size database that can be backed up and restored within the period specified in your organization's SLA is how you determine the maximum size for your databases.
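The relationship between the recommended maximums and the SLA can be sketched as a simple calculation. The Python example below is an illustration only: the function name is hypothetical, and it assumes that backup and restore throughput can be summarized as one measured rate in GB per hour, which is a simplification of a real backup and restore test.

```python
# Recommended maximum database sizes from the list above, in GB.
RECOMMENDED_MAX_GB = {False: 100, True: 200}


def max_database_size_gb(lcr_enabled: bool,
                         throughput_gb_per_hour: float,
                         sla_hours: float) -> float:
    """Return the smaller of the recommended cap and the SLA-derived limit."""
    if throughput_gb_per_hour <= 0 or sla_hours <= 0:
        raise ValueError("throughput and SLA window must be positive")
    # Largest database that fits in the SLA window at the measured rate.
    sla_limit_gb = throughput_gb_per_hour * sla_hours
    return min(RECOMMENDED_MAX_GB[lcr_enabled], sla_limit_gb)


if __name__ == "__main__":
    # Example inputs: 50 GB per hour, 3-hour SLA window.
    print(max_database_size_gb(True, 50, 3))
    print(max_database_size_gb(False, 50, 3))
```

With these example inputs, the SLA is the limiting factor for the LCR server, and the recommended cap is the limiting factor for the server without LCR.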

LCR and Public Folder Databases

LCR and public folder replication are two very different forms of replication built into Exchange. If more than one Mailbox server in the Exchange organization has a public folder database, public folder replication is enabled. Due to interoperability limitations between continuous replication and public folder replication, public folder databases should not be hosted in a storage group that is enabled for LCR when public folder replication is enabled.

The following are the recommended configurations for using public folder databases and LCR in your Exchange organization:

  • If you have a single Mailbox server in your Exchange organization, and that Mailbox server is a stand-alone server, the Mailbox server can host a public folder database in a storage group enabled for LCR. In this configuration, there is a single public folder database in the Exchange organization. Thus, public folder replication is disabled.

  • If you have multiple Mailbox servers and only one of the Mailbox servers contains a public folder database, the Mailbox server can host a public folder database in a storage group enabled for LCR. In this configuration, there is a single public folder database in the Exchange organization. Thus, public folder replication is disabled.

  • If you are migrating public folder data into a storage group enabled for LCR, you can use public folder replication to move the contents of a public folder database to a public folder database in a storage group enabled for LCR. After you create the public folder database in an LCR-enabled storage group, the additional public folder databases should only be present until your public folder data has fully replicated to the public folder database in the LCR-enabled storage group. When replication has completed successfully, all public folder databases outside of the LCR-enabled storage group should be removed and you should not host any other public folder databases in the Exchange organization.

  • If you are migrating public folder data out of a storage group enabled for LCR, you can use public folder replication to move the contents of a public folder database out of the public folder database in a storage group enabled for LCR. After you create the additional public folder database outside of the storage group enabled for LCR, the public folder database in the storage group that is enabled for LCR should only be present until your public folder data has fully replicated to the additional public folder databases. When replication has completed successfully, all public folder databases inside of all LCR-enabled storage groups should be removed and all subsequent public folder databases should not be hosted in storage groups that are enabled for continuous replication.

During any period in which more than one public folder database exists in the Exchange organization and one or more public folder databases are hosted in a storage group enabled for LCR, a failure of the active copy of the storage group has additional consequences. If the passive copy of a storage group with a public folder database needs to be activated, it can only be made mountable if all logs for the storage group hosting the public folder database are available. If any logs are missing or unavailable as a result of the failure of the active copy, you will not be able to activate the passive copy of the public folder database. In this event, the active copy must be brought online to ensure no data loss, or the public folder database must be re-created in the active copy of the storage group and its content must be recovered using public folder replication from one or more public folder databases other than the passive copy.
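The recommended configurations above reduce to a short decision rule, sketched here in Python. This is a paraphrase of the guidance, not an Exchange API: the function and parameter names are hypothetical, and the migration flag stands for the temporary period during which public folder data is replicating into or out of an LCR-enabled storage group.

```python
def may_host_pf_db_in_lcr(pf_databases_in_org: int,
                          migration_in_progress: bool = False) -> bool:
    """Return True if a public folder database may be hosted in an
    LCR-enabled storage group under the recommendations above."""
    if pf_databases_in_org < 1:
        raise ValueError("expected at least one public folder database")
    if pf_databases_in_org == 1:
        # A single public folder database means public folder
        # replication is disabled, so LCR can be used.
        return True
    # With more than one database, coexistence is acceptable only
    # temporarily, while data is being migrated.
    return migration_in_progress


if __name__ == "__main__":
    print(may_host_pf_db_in_lcr(1))
    print(may_host_pf_db_in_lcr(2))
    print(may_host_pf_db_in_lcr(2, migration_in_progress=True))
```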

Monitoring Recommendations for LCR

LCR is a data-availability solution. It needs to be monitored proactively. Exchange 2007 publishes a variety of status information for LCR copies. After LCR has been enabled for a storage group, you can use either the Exchange Management Console or the Exchange Management Shell to view the status and configuration settings of LCR copies. For detailed steps about how to view status and configuration information, see How to View the Status of a Local Continuous Replication Copy.

For proactive and automated monitoring, we recommend that you use Microsoft Operations Manager (MOM) and the Exchange 2007 Management Pack for MOM. For more information about monitoring LCR, see Monitoring and Operations Management.

In Exchange 2007 SP1, you can also use a new cmdlet called Test-ReplicationHealth to verify the health and status of storage groups enabled for LCR. For more information about the Test-ReplicationHealth cmdlet, see Test-ReplicationHealth.

Backup and Restore and LCR

LCR provides log shipping, log replay, and a quick manual switch to a secondary copy of the data. These features reduce the recovery time needed for data-level disasters. LCR also reduces the number of backups that are needed for sufficient data protection. Although LCR does not eliminate the need to make backups, it does reduce the need to make daily, full backups. LCR also enables you to offload Volume Shadow Copy Service (VSS) backups from the active storage group to the passive storage group. All four backup types (Full, Copy, Incremental, and Differential) can be taken from the passive copy locations, preserving valuable disk I/O on the active copy's LUNs to serve clients.

In addition to reducing overall total cost of ownership (TCO), LCR provides additional advantages over the preceding backup solutions. LCR lets you have additional copies of your Exchange databases, which provides you with the following benefits:

  • Reduction in database backup frequency   The LCR copy is the first line of defense against a production database failure. Both the production storage group and the storage group copy would have to fail before backup copies would be required. As a result, we recommend a longer service level agreement (SLA) for this case. With a longer SLA, weekly full backups and daily incremental backups are recommended.

  • Fast recovery from disasters   Typically, the recovery occurs in less than ten minutes with little or no data loss.

  • Support for larger mailbox quotas   This support is achieved as a result of fast recovery that is independent of database size.

For more details and specific guidance for backup and restore, see Disaster Recovery.

Exchange Backups and LCR

Exchange-aware backups are supported from active storage groups and databases and from passive database copies.

Note

A common task during Exchange-aware backups is the truncation of transaction log files after the backup has completed successfully. The replication feature in LCR guarantees that logs that have not been replicated are not deleted. As a result, running backups in a mode that deletes logs may not actually free space if replication is sufficiently far behind in its log copying.

Exchange-aware backups in this configuration can be performed using either streaming or VSS backup solutions. While streaming backups can be performed only from the active copy, VSS backups can be made from either the active or the passive copy.
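The supported combinations of backup method and copy can be written as a small lookup, shown below as a Python sketch. The names are hypothetical and the table only restates the rule in the preceding paragraph; it does not query Exchange or a backup application.

```python
# Which copies each backup method can read from, per the text above.
SUPPORTED_BACKUP_SOURCES = {
    "streaming": {"active"},         # streaming: active copy only
    "vss": {"active", "passive"},    # VSS: active or passive copy
}


def is_backup_supported(method: str, copy: str) -> bool:
    """Return True if the backup method can be run against the copy."""
    sources = SUPPORTED_BACKUP_SOURCES.get(method.lower())
    if sources is None:
        raise ValueError(f"unknown backup method: {method!r}")
    return copy.lower() in sources


if __name__ == "__main__":
    for method in ("streaming", "vss"):
        for copy in ("active", "passive"):
            print(method, copy, is_backup_supported(method, copy))
```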

Exchange Restores and LCR

Exchange-aware restores can be performed using either streaming or VSS backup solutions. Restores can be targeted to the active database and log file locations. Exchange-aware restores of database backups directly to the passive copy location are not supported natively. Restores to passive copy locations can be achieved manually by a file-level restore.

Before you restore a database from a storage group that was configured for LCR, you should suspend LCR for the storage group. After the restore has completed, you can resume LCR.

After restoring a database from backup into a storage group that is enabled for LCR, you must suspend and then resume continuous replication for the storage group using Suspend-StorageGroupCopy and Resume-StorageGroupCopy, respectively. This process is needed to update the Microsoft Exchange Replication service with the correct log generation information. If continuous replication is not suspended and resumed, the Microsoft Exchange Replication service will have outdated log generation information and will stop replicating log files.