
Leveraging Exchange 2007 Features to Address Challenges
The deployment of large mailboxes with Exchange 2007 involves some technical challenges from an IT pro usability perspective as well as from a deployment and operations perspective for the Exchange administrator. A complete understanding of the following areas is the key to success when implementing large mailboxes with Exchange 2007:
- Performance and scalability
- Storage considerations for maximum mailbox size
- Outlook online mode vs. cached mode
- Client usability and mailbox management
- Client throttling
- Rapid disaster recovery with large databases and multiple databases
- Operations management
By understanding the challenges that large mailboxes pose, administrators can take steps to successfully mitigate potential performance issues, meet service level agreements (SLAs), including recovery time objectives (RTOs), maintain operational excellence, and ensure end-user satisfaction.
Performance and Scalability
Large mailboxes require a reduced input/output (I/O) footprint to be a viable option. Exchange 2007 provides I/O reduction features to meet this need by leveraging a new 64-bit architecture and introducing significant changes to the Exchange store and Extensible Storage Engine (ESE).
The larger 64-bit addressable space allows Exchange servers to utilize more memory thereby reducing the required input/output per second (IOPS), enabling the use of larger disks, and allowing lower cost storage solutions such as SATA2 drives. These changes have shown an IOPS decrease of approximately 70 percent with Exchange 2007 on 64-bit hardware when comparing servers with 4,000 user mailboxes.
The database engine and cache in Exchange 2007 have been optimized for scalability, resulting in reduced I/O. Larger memory systems allow for a larger amount of cache to be allocated to the Exchange store. This allows for the allocation of more cache on a user-by-user basis. More cache per user increases the probability that data requested by the client will be serviced via memory instead of by the disk subsystem.
In addition, the database page size has been increased from 4 KB to 8 KB. An 8-KB page size means a greater probability that the contents of an entire message will be read during a single I/O operation from a single page in the database.
For more information about these performance improvements, see New Performance and Scalability Functionality.
For detailed planning information for Exchange 2007, see Planning Your Server and Storage Architecture.
Storage Considerations for Maximum Mailbox Size
There are a few key considerations with respect to the establishment of a maximum mailbox size for Exchange servers that support large mailboxes. These include:
- Maximum database size
- Database growth factor
- Maintenance capacity
Database size limits need to be balanced with other factors, including backup and recovery time and the complexity of operations management issues related to the increased number of logical unit numbers (LUNs) that must be managed to support the high number of databases. Maximum database size should always be calculated as follows: Number of mailboxes × Maximum mailbox size = Database size.
When evaluating the maximum database size that your organization can support, we recommend that servers that do not use continuous replication have 100-GB database size limits. On servers that use continuous replication, we recommend that you limit the database size to 200 GB. You can use the Exchange 2007 ability to support 50 storage groups to scale up with large mailboxes while still meeting these database size criteria.
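The sizing rule and the recommended limits above can be turned into a quick check. The following Python sketch is illustrative only: the function names are hypothetical, and it ignores the database growth factor and maintenance capacity listed earlier, so its results are an upper bound rather than a finished plan.

```python
import math

# Recommended database size limits from the text, in GB.
LIMIT_WITHOUT_REPLICATION_GB = 100
LIMIT_WITH_REPLICATION_GB = 200


def mailboxes_per_database(max_mailbox_gb, continuous_replication):
    """Largest number of mailboxes whose combined maximum size fits the limit."""
    limit = (LIMIT_WITH_REPLICATION_GB if continuous_replication
             else LIMIT_WITHOUT_REPLICATION_GB)
    return int(limit // max_mailbox_gb)


def databases_needed(total_mailboxes, max_mailbox_gb, continuous_replication):
    """Number of databases required to host all mailboxes within the limit."""
    per_db = mailboxes_per_database(max_mailbox_gb, continuous_replication)
    if per_db == 0:
        raise ValueError("maximum mailbox size exceeds the database size limit")
    return math.ceil(total_mailboxes / per_db)


if __name__ == "__main__":
    # 4,000 mailboxes of 2 GB each on a server that uses continuous replication.
    print(mailboxes_per_database(2, True))
    print(databases_needed(4000, 2, True))
```

For 4,000 mailboxes of 2 GB each on a server that uses continuous replication, this gives 100 mailboxes per database and 40 databases, which stays within the 50 storage groups mentioned above if each storage group holds one database.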
For more information about planning your Exchange storage configuration, see Planning Storage Configurations.
Outlook Online Mode vs. Cached Mode
When optimizing storage for an Exchange Server 2003 database, size was not a core disk performance issue. The number of items in your core folders (for example, the Calendar, Contacts, Inbox, and Sent Items folders), as well as the client type, were the primary causes of disk performance issues. This is also true with Exchange 2007. Understanding how item count impacts performance is critically important when planning for the deployment of large mailboxes.
Running Outlook 2007 in cached mode can help reduce server I/O by as much as 70 percent when compared to Exchange 2003. The initial mailbox sync is an expensive operation from a performance perspective, but over time, as the mailbox size grows, the disk subsystem burden is shifted from the Exchange server to the Outlook client. This means that having a large number of items in a user's Inbox will have little effect on the performance of the server. However, this also means that cached mode users with large mailboxes may need faster computers than those with small mailboxes (depending on the individual user perception of acceptable performance).
Recommended minimum client hardware configurations
| Mailbox size | Memory size | Hard disk speed |
|---|---|---|
| 1 GB | 1 GB | 5,400 RPM |
| 2 GB | 1–2 GB | 7,200 RPM |
| Greater than 2 GB | 2 GB. Selective synchronization of folders in cached mode for mobile workers. Online mode required for complete data access. Performance experience is based on specific configurations and performance thresholds. | Not applicable |
Note: It is important to keep .ost files free of fragments. You can use the Windows Sysinternals utility Contig to defragment .ost files. To download the latest version of Contig, see Contig v1.54.
Both Outlook Web Access and Outlook in online mode store indexes on, and search against, the server's copy of the data. For moderate size mailboxes, this results in approximately double the IOPS per mailbox of a comparably sized cached mode client. The IOPS per mailbox for very large mailboxes is even higher. The first time you sort in a new way, such as by size, an index is created, causing many read I/Os to the database disk. Subsequent sorts on an active index are inexpensive.
For more information about high item counts and restricted views, see Understanding the Performance Impact of High Item Counts and Restricted Views.
Several methods for maintaining an acceptable item count level should be evaluated when planning for large mailboxes. Creating more top level folders, or subfolders underneath the Inbox and Sent Items folders, greatly reduces the performance impact associated with this index creation, so long as the number of items in any one folder does not exceed 5,000. To achieve this, users can manually create these folders and personally manage their core folders, or administrators can implement an automated management process using the Exchange 2007 messaging records management (MRM) feature. We highly recommend that the automated management process be used. For more information about automating mailbox management using MRM to manage core folders, see "Client Usability and Mailbox Management" later in this white paper and the Exchange 2007 documentation topic Managing Messaging Records Management.
Client Usability and Mailbox Management
One of the biggest challenges for end users when working with e-mail generally, and large mailboxes specifically, is structuring and managing the data so that messages can be found quickly and easily. Relying on the end user to manually manage this mail in a way that will result in the best server performance is unrealistic. Automation is the key to maintaining a usable and high performance large mailbox experience.
MRM is the records management technology in Exchange 2007 that helps organizations reduce the legal risks associated with e-mail and other communications. MRM replaces, and improves upon, the Exchange 2003 tool called Mailbox Manager. MRM makes it easier to keep messages that are needed to comply with company policy, government regulations, or legal needs, and to remove content that has no legal or business value.
This is accomplished through the use of managed folders, which are folders in the user's mailbox to which managed content settings have been applied. The administrator or the user places these managed folders in the user's mailbox, and then the user sorts individual messages or entire folders into the managed folders according to organization policy. Messages in managed folders are then periodically processed by Exchange according to the folder content settings. When a message reaches a retention limit, it is archived, deleted, or flagged for user attention, or the event is simply logged.
Administrators can use MRM to create policies that manage user mailbox data automatically and so assist users with the job of managing their e-mail. To illustrate what is possible, consider the following sample policy:
- Move any e-mail older than one year to the Managed Folders\Will Expire folder.
- Configure the rule to wait 90 days before deletion so users can choose to move any e-mail in this folder to another managed folder with a longer expiry policy before it gets deleted.
- Configure a managed folder similar to Managed Folders\Retain and set a no expiry policy on it. This folder can be used to retain mail that users want to keep longer than a year. To help organize this folder, create subfolders based on the year:
  Managed Folders
    Retain (No Expiry)
      2004
      2005
      2006
      2007
The management of subfolders can be done manually by having users move old messages from the Retain (No Expiry) folder to the correct managed subfolder (for example, messages from the year 2005 are placed in the 2005 folder). Alternately, MRM can be configured to do this automatically. The MRM policies can be set for individual users or for everyone in the organization. Having MRM automate the management of users' old mail ensures that old mail is moved to archive folders in a timely fashion. However, this automation does not provide any flexibility for selecting which messages get moved to the managed folders. If you want to move all messages of a specific age, for all users, to a specific managed folder, an MRM policy can easily be configured to do this. This is ideal for organizations that have unlimited expiry policies and want to provide the ability to easily bulk move old items out of the primary folders and into a managed folder hierarchy.
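The decision logic of the sample policy can be sketched in a few lines of Python. This is a simplified model for reasoning about the policy, not MRM itself and not an Exchange API: the function names, folder labels, and the 365-day definition of "one year" are assumptions made for illustration.

```python
from datetime import date, timedelta

ONE_YEAR = timedelta(days=365)      # assumed definition of "older than one year"
GRACE_PERIOD = timedelta(days=90)   # wait time in Will Expire before deletion


def apply_sample_policy(folder, received, moved_to_will_expire, today):
    """Return the action the sample policy would take for one message.

    folder: current folder label, e.g. "Inbox", "Will Expire", "Retain".
    received: date the message was received.
    moved_to_will_expire: date it entered Will Expire, or None.
    """
    if folder.startswith("Retain"):
        return "keep"  # no-expiry folder and its year subfolders
    if folder == "Will Expire":
        if today - moved_to_will_expire >= GRACE_PERIOD:
            return "delete"
        return "keep"
    if today - received > ONE_YEAR:
        return "move to Will Expire"
    return "keep"


def retain_subfolder(received):
    """Year-based subfolder under Retain (No Expiry) for a retained message."""
    return f"Managed Folders\\Retain (No Expiry)\\{received.year}"


if __name__ == "__main__":
    today = date(2008, 1, 15)
    print(apply_sample_policy("Inbox", date(2006, 12, 1), None, today))
    print(retain_subfolder(date(2005, 3, 2)))
```

Modeling the policy this way makes it easier to predict how many items a rule change would move or delete before the rule is applied to production mailboxes.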
The performance of the .ost file is another factor to consider if there are mailboxes in your organization that exceed 2 GB in size. If users of those mailboxes are running Outlook in cached mode, those users may experience degraded performance as the size of their .ost file grows above 2 GB. To mitigate this issue and improve performance, we recommend that folders containing mail that is accessed infrequently be configured to not synchronize to the .ost file.
Because Outlook 2007 has the flexibility to limit what folders get synchronized to a user's .ost file, you can reduce the size of an .ost file by configuring synchronization policies on a per folder basis. Users can configure their .ost file to only sync folders that contain data they need to regularly access. By configuring the .ost files in this way, users can have mailboxes that exceed 2 GB while still having an acceptable user experience when using cached mode. For those infrequent times when users need to access data that is older than what is configured to synchronize to their .ost file, they can open Outlook in online mode or use Outlook Web Access or Outlook Mobile.
Developing rule sets that will effectively maintain manageable item counts in the core folders is important to maintaining high levels of performance for your Exchange servers. This will require analysis of your user environment and how the MRM rules affect that environment. As you gain an understanding of how the rules impact folder item counts, you can tune the rules to achieve the item count levels that are supportable in your environment.
Client Throttling
With the release of Exchange 2007, a new feature called remote procedure call (RPC) client throttling is available to administrators to help manage the end-user performance experience and reduce the possibility of server monopolization by a small number of highly active users. RPC client throttling can help prevent client applications from sending too many RPC operations per second to the Exchange server, which could degrade overall server performance. These client applications include desktop search engines searching through every object inside a user's mailbox, custom applications written to manipulate data located in Exchange mailboxes, enterprise class e-mail archiving products, and customer relationship management (CRM)-enabled mailboxes with e-mail auto-tagging enabled.
When a client is identified by the Exchange server as causing a disproportionate impact on the server, the server sends a back-off request to the client to reduce the performance impact on the server. This feature is particularly important when large mailboxes are deployed and high item counts exist, because very active users with such mailboxes can disproportionately impact server performance. The back-off request minimizes the impact of those clients on the server and on the rest of its users.
For more information about Exchange 2007 client throttling functionality, see Understanding Exchange 2007 Client Throttling.
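The general idea of identifying a heavy client and asking it to back off can be illustrated with a small sliding-window counter. The Python sketch below is a generic model only. It does not describe how Exchange 2007 detects or throttles clients, and the class name, thresholds, and back-off duration are hypothetical.

```python
from collections import defaultdict, deque


class BackoffMonitor:
    """Tracks per-client operations in a sliding window and flags heavy clients."""

    def __init__(self, max_ops, window_seconds, backoff_seconds):
        self.max_ops = max_ops            # operations allowed per window (hypothetical)
        self.window = window_seconds      # window length in seconds
        self.backoff = backoff_seconds    # delay requested from a heavy client
        self._ops = defaultdict(deque)    # client id -> timestamps of recent operations

    def record(self, client_id, now):
        """Record one operation at time `now` (in seconds).

        Returns 0 if the client is within its budget, otherwise the number
        of seconds the client is asked to back off.
        """
        recent = self._ops[client_id]
        recent.append(now)
        # Drop operations that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        return self.backoff if len(recent) > self.max_ops else 0


if __name__ == "__main__":
    monitor = BackoffMonitor(max_ops=3, window_seconds=1.0, backoff_seconds=5)
    for t in (0.0, 0.1, 0.2, 0.3):
        print(t, monitor.record("indexer", t))
```

Because each client is tracked separately, a single busy client receives the back-off request while other clients on the same server are unaffected.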
Rapid Disaster Recovery with Large Databases and Multiple Databases
Historically, the disaster recovery duration has been one of the biggest issues impacting the ability to deploy large mailboxes. Large mailboxes equate to large databases, which equate to long backup and restore times. Large mailboxes make it more challenging to design a solution that has an acceptable RTO. In the event of a failure or disaster where a database, or entire server, needs to be restored from backup, meeting an acceptable RTO goal is critical. Exchange 2003 provides Volume Shadow Copy Service (VSS) support to address this, but VSS solutions are complicated and expensive because they rely on storage area network (SAN)-based storage with at least two copies of the data (clones).
To illustrate this challenge, consider a server with 4,000 user mailboxes that are each 2 GB in size. This is 8 terabytes of database data and 128 GB of logs per day. The best case restore process in this example would be to provision new storage, initiate the restore process, replay the log files, and allow content indexing to crawl the recovered databases. Assuming a Fibre Channel tape device capable of 192 megabytes (MB) per second is used, recovery time would be approximately 12 hours, log replay time would add about two hours, and content indexing would take an additional 12 days. That means the best case to return a server to its state prior to the failure is 12 days and 14 hours.
With Exchange 2007, other lower cost options are now available to help meet RTOs and bring services back online rapidly, inexpensively, and with a low level of complexity.
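The restore estimate in the example above is simple arithmetic and can be reproduced for other mailbox counts or tape speeds. The following Python sketch uses hypothetical function names, treats the log replay and content indexing durations as inputs taken from the example rather than computing them, and assumes the three phases run one after another.

```python
MB_PER_GB = 1024


def restore_hours(mailboxes, mailbox_gb, tape_mb_per_sec):
    """Hours needed to stream all mailbox data back from tape."""
    total_mb = mailboxes * mailbox_gb * MB_PER_GB
    return total_mb / tape_mb_per_sec / 3600


def total_recovery_hours(restore_h, log_replay_h, indexing_days):
    """Restore, log replay, and content indexing, assumed to run sequentially."""
    return restore_h + log_replay_h + indexing_days * 24


if __name__ == "__main__":
    restore = restore_hours(4000, 2, 192)
    total = total_recovery_hours(restore, log_replay_h=2, indexing_days=12)
    print(f"restore: {restore:.1f} h")
    print(f"total:   {total / 24:.1f} days")
```

With the figures from the example, the restore phase comes out just under 12 hours and the total at roughly 12.6 days, in line with the 12 days and 14 hours stated above.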
Exchange 2007 now offers the ability to deploy a simple, inexpensive, and fast recovery solution that scales well for use with large mailboxes and also provides high availability. This solution is cluster continuous replication (CCR) on direct attached storage.
CCR is a non-shared storage failover cluster solution that uses built-in asynchronous log shipping technology to create and maintain a copy of each storage group on a second server in a failover cluster. CCR is designed to be either a one or two data center solution, providing both high availability and site resilience. CCR is different from clustering in previous versions of Exchange. For details about some of the differences, see Cluster Continuous Replication Resource Model and Cluster Continuous Replication Recovery Behavior.
When CCR is deployed with large mailboxes, it should be the primary recovery and restore mechanism, combined with a weekly full, and daily incremental, off server backup schedule. This solution is effective for either SAN or direct attached storage solutions. We recommend evaluating direct attached storage as a storage option due to its reduced complexity and inexpensive nature.
The following are new Exchange 2007 features that provide additional recovery options, high availability, and backup and restore functionality, which can be used as part of your large mailbox deployment strategy:
- Local continuous replication (LCR): LCR is a single-server solution that uses built-in asynchronous log shipping technology to create and maintain a copy of a storage group on a second set of disks that are connected to the same server as the production storage group. LCR provides log shipping, log replay, and a quick manual switch to a secondary copy of the data.
- Standby continuous replication (SCR): SCR is a new feature introduced in Exchange 2007 SP1. SCR is designed for scenarios that use, or enable the use of, standby recovery servers. SCR extends the existing continuous replication features and enables new data availability scenarios for Exchange 2007 Mailbox servers. SCR uses the same log shipping and replay technology used by LCR and CCR to provide added deployment options and configurations by providing the administrator with the ability to create additional storage group copies. SCR can be used to replicate data from stand-alone Mailbox servers and from clustered mailbox servers. For further details about SCR, see Standby Continuous Replication.
- Improved backup and restore: When you use LCR or CCR, Exchange 2007 enables you to offload Exchange-aware VSS backups from the active copy of a database to a passive copy of a database. Taking a VSS snapshot on the passive copy removes the disk load from the production disks during both the checksum integrity check (Eseutil) and the subsequent copy to disk or tape. This also frees up more time on the production disks to run online maintenance, MRM, and other tasks.
Note: The Windows Server Backup tool is not capable of performing backups using VSS. A VSS-enabled backup product such as Microsoft System Center Data Protection Manager is required.
- Database portability: Database portability provides several features, including the ability to port and recover a database on another server in the Exchange organization. Database portability enables faster disaster recovery strategies to be implemented for both site-level disasters and hardware failures for Exchange 2007 servers. For additional information about database portability, see Database Portability.
- Dial tone portability: When a database, server, or data center is lost, you can use dial tone portability to provide access to a new dial tone database on another server in the Exchange organization. For additional information about dial tone portability, see Dial Tone Portability.
If you are not using a VSS backup solution, we highly recommend that disk to disk (D2D) or disk to disk to tape (D2D2T) be considered to increase the speed, efficiency, and reliability of your non-VSS backup solution. If your organization does not have legal or policy obligations requiring long-term retention of tape backups, consider eliminating tape from your backup strategy.
Microsoft System Center Data Protection Manager (DPM) 2007 offers a fast backup solution that can help meet your D2D or D2D2T needs by providing data protection for Exchange across integrated disk and tape media. For more information about DPM, see System Center Data Protection Manager 2007.
Operations Management
Large mailboxes bring with them operational challenges that must be evaluated and mitigated to ensure that the health of your Exchange environment is easily maintained over time. Some areas that have historically been problematic with regard to large mailboxes include:
- Long backup times: Performing daily full backups to disk or tape with large mailboxes may not fit in the backup window (too much data in the allowable time).
- High log generation during move mailbox operations: Moving large mailboxes around an organization can be challenging because of the massive number of log files created.
- Long database maintenance times: Performing both offline and online database operations may take much longer and possibly exceed maintenance windows. Examples include:
  - Online maintenance (online defragmentation)
  - Offline defragmentation
  - Offline repair
Long Backup Times
There are many different backup and restore methods available to the Exchange administrator. The key metric with backup and restore is the throughput, or the number of megabytes per second that can be copied to and from your production disks. After you determine the throughput, you need to decide if it is sufficient to meet your backup and restore SLA. For example, if you need to be able to complete the backup within four hours, you may have to add more hardware or choose a different method to achieve it.
Large mailboxes require you to handle very large amounts of data on a single Exchange server. Consider a server with two thousand 2-GB mailboxes. When factoring in the disk overhead, this is more than 4 terabytes of data. Assuming you can achieve a backup rate of 175 GB per hour (48 MB per second) using your backup solution, it would take at least 23 hours to back up this Exchange server. By leveraging CCR, with its fast recovery ability, it is possible to move to a backup strategy where a full backup of one-seventh of the databases is performed each day, combined with an incremental backup on the remainder of the databases. This is all possible where the backups can be taken off the passive node, thereby minimizing performance impact and allowing more time for online maintenance and MRM to run. The full backup plus incremental backup strategy is the secondary recovery mechanism, so the additional time to restore the full backup plus all incremental backups may be an acceptable solution. If you choose to leverage CCR plus SCR, you can raise the number of online copies to three copies, which further reduces the likelihood that recovery from backup would be necessary.
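The backup window arithmetic above, and the effect of rotating full backups across the week, can be sketched as follows. This Python example is illustrative: the function names are hypothetical, 4 terabytes is taken as 4,096 GB, and the daily incremental volume is an input you would need to measure because the text does not give a figure for it.

```python
def backup_hours(data_gb, rate_gb_per_hour):
    """Hours needed to copy data_gb at a sustained backup rate."""
    return data_gb / rate_gb_per_hour


def rotating_backup_hours(data_gb, rate_gb_per_hour, daily_incremental_gb, days=7):
    """Daily backup time when 1/days of the data gets a full backup each day.

    daily_incremental_gb is the incremental data backed up for the remaining
    databases; it is an input here, not a figure from the text.
    """
    full_part_gb = data_gb / days
    return backup_hours(full_part_gb + daily_incremental_gb, rate_gb_per_hour)


if __name__ == "__main__":
    data_gb = 4 * 1024  # lower bound for "more than 4 terabytes"
    print(f"daily full backup:     {backup_hours(data_gb, 175):.1f} h")
    # 50 GB of daily incremental data is a made-up value for illustration.
    print(f"one-seventh rotation:  {rotating_backup_hours(data_gb, 175, 50):.1f} h")
```

At 175 GB per hour, a full backup of 4,096 GB takes a little over 23 hours, while the full-backup portion of the one-seventh rotation takes under 4 hours before incremental data is added.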
High Log Generation During Move Mailbox Operations
The performance impact of moving mailboxes is a factor that affects capacity planning when deploying large mailboxes. For various reasons, most large companies move a percentage of their users on a nightly or weekly basis to different databases, servers, or sites. When deploying large mailboxes, it may be necessary to provision the log drive to accommodate mailbox move activities. Although the source Exchange server will log the record deletions, which are small, the target server must write everything that is transferred to the transaction logs first. If you normally generate 10 GB of log files in one day, and keep a three day buffer of 30 GB, moving fifty 2 GB mailboxes (100 GB) would fill up your target log drive and cause downtime. To account for these additional operational requirements, we recommend that increased capacity be allocated for the log drives. For more information about planning your capacity requirements, see Exchange 2007 Mailbox Server Role Storage Requirements Calculator.
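The log drive example above can be checked with a short calculation. The Python sketch below rests on two labeled assumptions: the log volume written on the target server is roughly equal to the mailbox data moved, and logs are truncated once per day. The function names are hypothetical, and the storage requirements calculator referenced above remains the better tool for real capacity planning.

```python
def logs_generated_by_move_gb(mailbox_count, mailbox_gb):
    """Approximate log volume written on the target server by a mailbox move.

    Assumes log volume is roughly equal to the mailbox data transferred.
    """
    return mailbox_count * mailbox_gb


def move_fits(log_drive_gb, daily_log_gb, mailbox_count, mailbox_gb):
    """True if one day's normal logs plus the move's logs fit on the log drive."""
    needed = daily_log_gb + logs_generated_by_move_gb(mailbox_count, mailbox_gb)
    return needed <= log_drive_gb


def max_mailboxes_per_day(log_drive_gb, daily_log_gb, mailbox_gb):
    """Largest number of mailboxes that can be moved before logs are truncated."""
    free_gb = max(0, log_drive_gb - daily_log_gb)
    return int(free_gb // mailbox_gb)


if __name__ == "__main__":
    # 30-GB log drive, 10 GB of normal daily logs, fifty 2-GB mailboxes.
    print(move_fits(30, 10, 50, 2))
    print(max_mailboxes_per_day(30, 10, 2))
```

Under these assumptions, a 30-GB log drive cannot absorb a move of fifty 2-GB mailboxes, and only about ten such mailboxes could be moved per day without additional log capacity.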
Long Database Maintenance Times
Exchange 2007 provides the ability to reduce maintenance times including the time required to perform online and offline defragmentation and offline repair.
Online Maintenance (Online Defragmentation)
New ESE features in Exchange 2007 SP1 reduce the time required for online maintenance to run while also providing better events and Performance Monitor counters to track online maintenance completion and value over time. These SP1 changes include:
- Removal of page dependencies
- Disablement of partial merges
- Passive node I/O improvements
- Online defragmentation
- Checksumming databases
- Page zeroing
For more information about these new ESE features, see Exchange Server 2007 SP1 ESE Changes - Part 1 and Exchange 2007 SP1 ESE Changes - Part 2.
Note: MRM is a scheduled operation that runs against the database in a synchronous read operation similar to backup and online maintenance. The disk cost of MRM depends upon the number of items requiring action (for example, delete or move). We recommend that MRM not run at the same time as either backups or online maintenance. Internal testing at Microsoft shows that MRM can crawl 100,000 items in five minutes. If you use CCR and VSS backups, you can offload the VSS backups to the passive copy, which allows more time for online maintenance and MRM so that neither impacts the other.
Offline Defragmentation
This is an unnecessary maintenance step. Some administrators, who have deployed third-party archiving solutions configured to use stub files, choose to do it to compact the database after the message body and attachments have been archived out of Exchange. With the deployment of large mailboxes and the issues related to high item counts, stub files should not be necessary and should be avoided. In the event that a database needs to be compacted, moving mailboxes can be used as an alternative to offline defragmentation. Moving mailboxes has the added benefit of reducing the user impact by limiting service interruption to only those users being moved. Offline defragmentation impacts all users homed on the database. For more information about why offline defragmentation should not be considered regular Exchange maintenance, see Is offline defragmentation considered regular Exchange maintenance?
Offline Repair
This disaster recovery activity is extremely rare when CCR is in use. Offline repair is something you must do to recover your data if your other recovery options fail. You can have a disaster recovery plan that does not require you to perform an offline repair. However, if your disaster recovery plan includes offline repair, we currently recommend that your database size be limited to 200 GB. It is possible to increase this limit by deploying faster disks and disk subsystems. For example, SAS RAID10 arrays can repair databases at over 100 MB per second.