White Paper: Planning for Large Mailboxes with Exchange 2007

 

Tom Di Nardo, Senior Technical Writer, Microsoft Exchange Server

July 2008

Summary

This white paper provides an overview of the benefits of deploying large mailboxes with Microsoft Exchange Server 2007. It also provides planning guidance to help you successfully implement the deployment of large mailboxes.

Applies To

Microsoft Exchange Server 2007

Table of Contents

  • Introduction

  • Benefits of Large Mailboxes

    • Benefits for Users

    • Benefits for Administrators

  • Leveraging Exchange 2007 Features to Address Challenges

    • Performance and Scalability

    • Storage Considerations for Maximum Mailbox Size

    • Outlook Online Mode vs. Cached Mode

    • Client Usability and Mailbox Management

    • Client Throttling

    • Rapid Disaster Recovery with Large Databases and Multiple Databases

    • Operations Management

  • Conclusion

  • Additional Information

Introduction

The volume of e-mail that users receive each day continues to increase, and so does the size of the average message. As a result, knowledge workers spend a significant amount of time managing their e-mail, and the more time they spend keeping their mailboxes organized, the lower their overall productivity. Additionally, because users are forced to spend this time managing e-mail to stay under low quotas, they become increasingly frustrated that their Information Technology (IT) departments cannot offer mailbox sizes that match or exceed those offered by the providers of their personal e-mail accounts. This frustration can pose a significant threat to corporate intellectual property, as well as to regulatory compliance, if users move their primary business e-mail to Web-based e-mail services, which offer mailbox limits greater than 1 gigabyte (GB).

Earlier versions of Exchange did not scale well enough, at a low enough cost per mailbox (because of expensive hardware and recovery options) to allow IT administrators the ability to match the ever increasing mailbox sizes of personal e-mail accounts. Exchange 2007 offers dramatic performance and scalability improvements when compared to prior versions of Exchange. Long recovery times have been a significant impediment to the adoption of larger mailbox sizes. The introduction of cluster continuous replication (CCR) offers the ability to rapidly recover from outages at a low cost. These performance and rapid recovery improvements enable IT departments to deploy large mailboxes easily and at a low cost. Increased mailbox sizes improve end-user productivity and satisfaction, reduce IT administrative costs, improve security, and help meet business and regulatory compliance requirements.

Note

The phrase large mailbox can mean different things to different people. In this white paper, large mailboxes are defined as being larger than 1 gigabyte (GB) in total size.

Benefits of Large Mailboxes

Deploying large mailboxes with Exchange 2007 offers significant benefits for both the end user and the IT administrator.

Benefits for Users

When you deploy large mailboxes with Exchange 2007, in most cases users can have access to a year or more of their e-mail, voice mail, and fax messages natively from either Microsoft Office Outlook 2007 Service Pack 1 (SP1) or the version of Outlook Web Access that is included with Exchange 2007. Consider the following benefits that users can leverage by maintaining mail in their Exchange mailbox:

  • Improved access to Exchange information from anywhere and from any device   By retaining mail in the user's mailbox, a seamless experience can be maintained for accessing data that would previously have been moved to .pst files, archived via a third-party archiving solution, or deleted to stay under a storage quota. By retaining this data in Exchange, users can leverage Windows Mobile powered devices, Outlook 2007, and Outlook Web Access to easily access and act on this data from anywhere.

  • Improved search   There have been significant improvements to the search functionality offered with Exchange 2007 and Outlook 2007. Additionally, users with Windows Mobile 6 devices now have the ability to search their entire mailbox. Users who must search in .pst file archives and third-party archiving solutions must perform the same search multiple times. Although Windows Desktop Search works well to improve search capability on .pst files, this is only useful at a user's primary workstation. By retaining more mail in a user's inbox, the improvements in content indexing and search with Exchange 2007 can be leveraged by any client, making rapid discovery of user information possible.

  • Reduced user mailbox management   By allowing users to have large mailboxes, they are able to spend less time managing their mailboxes to clean up old items.

  • Increased productivity for remote workers   Having Exchange-based access to more e-mail data allows remote workers to securely access their Exchange data from Internet kiosks, computers at remote client locations, their Windows Mobile 6 devices, and from home. When e-mail data is stored in a secure location (Exchange) and is accessed in a secure fashion (Outlook Web Access, Outlook Anywhere, and Windows Mobile), security risks inherent in storing information in non-IT managed locations are reduced.

  • Eliminated .pst files   From an end-user perspective, the ability to live without .pst files offers a significant increase in the usability of the e-mail system. Users no longer need to concern themselves with having to regularly purge e-mail from their mailboxes to stay under low storage quota limits. There is no longer the risk that critical e-mail stored in a .pst file will be lost due to corruption or hardware failure. Elimination of .pst files also ensures that client performance issues due to accessing .pst files over the network are removed. By removing .pst files and storing the data in the mailbox, users can now access their mail while traveling. Additionally, when their portable computers are protected with BitLocker Drive Encryption, traveling workers no longer need to worry about e-mail data loss if a computer is lost or stolen. BitLocker Drive Encryption is a data protection feature available on Windows Vista Enterprise and Ultimate, Windows 7, and Windows Server 2008. Exmerge is a very useful tool that can import any existing .pst files to your Exchange 2007 information store.

  • Eliminated third-party archiving stub files   The primary end-user benefit of eliminating stub files when using third-party archiving solutions is providing a consistent mailbox experience from multiple clients. The deployment and management of additional third-party Outlook add-ins are required to make the use of stub solutions work with Outlook. Although some solutions offer access to the archive through Outlook Web Access, none of those solutions offer Windows Mobile device access. In most cases, stub file item size is from 10 to 15 kilobytes (KB). The amount of usable data contained in a stub file of that size is minimal, usually a few paragraphs or less. Some of the archival solution providers recommend a stub file size from 1 to 2 KB, which only contains the message header information.

    Typically, the data accessed in the archive by using a stub solution decreases as the message lifespan increases. If the majority of mail that the user accesses is between current time and two years old, make the mailbox size capable of supporting the messages in that timeframe. For example, if the user mailbox increases by 1 GB per year, the user needs a 3-GB mailbox. If you define your large mailbox design so that the majority of historical data that is required by users remains in their mailboxes, stubs are not required. You still may require a third-party archive for legal requirements and the occasional user access request.

    If you are retaining a few months of stub files, the number of search hits to review is relatively low. Therefore, locating the desired message without having to make too many searches through the archive is realistic. However, if several hundred thousand messages are retained, locating a specific message is challenging when you do not have a significant portion of the message body available. As a result, users probably need to go to the archive multiple times and retrieve many messages to find the desired message. Third-party archiving solutions solve the problem of supporting mailboxes that cannot be bound by size limits and regulatory compliance requirements for archival of all e-mail traffic. However, when deployed, these solutions should be configured to move the e-mail content out of the mailbox without retaining stub files in the mailbox.

Benefits for Administrators

By designing an environment to support large mailboxes with Exchange 2007, administrators take steps to reduce costs and improve IT department agility. Deploying large mailboxes with Exchange 2007 offers the following benefits to administrators:

  • Positions IT for high availability and rapid recovery   High availability and disaster recovery have been complex areas in earlier versions of Exchange. The Exchange 2007 CCR feature easily allows administrators to offer a high availability solution to their organization while also moving away from time consuming and complex recovery procedures. CCR offers asynchronous replication of a backup copy of the production data that can be automatically enabled should a failure of the production server occur. When CCR is deployed, standard recovery times as fast as two minutes are now possible.

  • Improved Administrative Search Functionality   Administrators can now search for and extract data from mailboxes more efficiently in Exchange 2007 due to improved content indexing and data storage.

  • Records Management   Messaging Records Management (MRM) lets users categorize their messages and make sure that these messages are retained for legal or IT compliance requirements. Users can also remove messages that have no legal or business value.

  • Positions IT to reduce or eliminate .pst usage   Because .pst files are generally not managed by IT pros, they ultimately end up on knowledge workers' portable computers, desktops, or file shares, or even in public folders. The .pst files can become a significant burden to organizations by impacting server and network performance, reducing usability, and increasing support costs. Generally, .pst files are not backed up, thereby increasing the risk of data loss. Microsoft does not support access to .pst files over networks. For more information, see Microsoft Knowledge Base article 297019, Personal folder files are unsupported over a LAN or over a WAN link.

    A large percentage of knowledge worker performance issues involve accessing .pst files stored on file servers. Additionally, because .pst files are inherently portable, they pose a data security risk and a potential litigation liability when portable computers are lost or stolen, and they provide an easy mechanism for employees to take corporate e-mail with them when they leave the company. Finally, .pst files add an expensive manual step when fulfilling compliance requirements and legal discovery requests because they must first be identified and analyzed by administrators.

  • Utilizes today's high capacity disk drives efficiently   Performance is the primary constraint when designing Exchange storage subsystems. Because disk capacity tends to increase at a significantly higher pace than disk performance, Exchange storage subsystems have traditionally been designed with underutilized capacity. Designing Exchange solutions with larger mailboxes allows the customer to utilize the capacity of these high capacity disk drives more fully and efficiently.

  • Reduces end user support overhead   By eliminating .pst files and third-party archiving Outlook add-ins, and by reducing the amount of manual mailbox maintenance that is required with small mailbox size quotas, administrators are able to reduce the number of support calls that must be handled. The amount of time that administrators must spend working with users to reduce mailbox data to stay under low mailbox quotas is significant. Administrators must also spend a large amount of time troubleshooting performance issues due to .pst files stored on the network, recovery of data from corrupt .pst files, and other performance-related issues caused by add-ins and .pst files.

  • Eliminates stub files when using third-party archiving solutions   Third-party archiving solutions have become popular as corporate compliance requirements and mailbox quota management have gained importance. Many of these archiving solutions offer the ability to leave a small stub file in place of the archived message that can be used by end users to retrieve archived messages from the archival system. Some organizations use the stub file solution as a workaround to offering large mailboxes. One of the goals of stub archiving solutions is to reduce the aggregate mailbox and database size, thereby reducing recovery time objectives (RTOs). On the surface, this appears to be a good idea. However, stub-based archiving solutions have the following technical problems:

    • Server performance   Removing the message bodies and attachments from Exchange reduces the mailbox size, but it does not significantly change the server performance for users accessing Exchange via Outlook in online mode and Outlook Web Access. Item counts, not aggregate size, are the primary performance driver for the Exchange store. For example, server performance with a folder containing 100,000 full e-mail message items is similar to that of a folder containing 100,000 stub files.

    • Client complexity   Because the use of stub files with a third-party archiving solution requires the deployment and use of Outlook add-ins, administrators must spend a significant amount of time deploying and managing these add-ins. Administrator time is also required to assist end users who have technical difficulties with the add-ins. Not deploying stub files removes all of this additional work for administrators and end users.

Leveraging Exchange 2007 Features to Address Challenges

The deployment of large mailboxes with Exchange 2007 involves some technical challenges from an IT pro usability perspective as well as from a deployment and operations perspective for the Exchange administrator. A complete understanding of the following areas is the key to success when implementing large mailboxes with Exchange 2007:

  • Performance and scalability

  • Storage considerations for maximum mailbox size

  • Outlook online mode vs. cached mode

  • Client usability and mailbox management

  • Client throttling

  • Rapid disaster recovery with large databases and multiple databases

  • Operations management

By understanding the challenges that large mailboxes pose, administrators can take steps to successfully mitigate potential performance issues, meet service level agreements (SLA) including RTO, maintain operational excellence, and ensure end-user satisfaction.

Performance and Scalability

Large mailboxes require a reduced input/output (I/O) footprint to be a viable option. Exchange 2007 provides I/O reduction features to meet this need by leveraging a new 64-bit architecture and introducing significant changes to the Exchange store and Extensible Storage Engine (ESE). The larger 64-bit addressable space allows Exchange servers to utilize more memory, thereby reducing the required I/O operations per second (IOPS), enabling the use of larger disks, and allowing lower cost storage solutions such as SATA2 drives.

The database engine and cache in Exchange 2007 have been optimized for scalability, resulting in reduced I/O requirements. Larger memory systems allow for a larger amount of cache to be allocated to the Exchange store. This allows for the allocation of more cache on a user-by-user basis. More cache per user increases the probability that data requested by the client will be serviced from memory instead of by the disk subsystem.

Read Operations

In the 32-bit version of Exchange 2003, the ratio of database read operations to database write operations is typically 2:1. This is because the 32-bit version of Exchange 2003 limits the memory that the database can use to cache pages for read operations to 900 MB. This means that the Exchange 2003 database cache is constantly flushed so that new database pages can be brought in, which results in constant disk read I/O operations.

Exchange 2007 does not have a defined database cache size.  This means that the database can utilize as much memory as the operating system supports. Additionally, you can store more database pages in memory. The larger cache allows these pages to stay in memory longer, and it is more likely that the database pages will be read from memory rather than having to go to disk.  This decreases read I/O operations.

Write Operations

In Exchange 2003, the database page size was 4 KB. The larger the data blob that must be written into the database, the more pages have to be written. Also, Exchange 2003 could consolidate writes only in 64-KB blocks (sixteen 4-KB pages).

In Exchange 2007, the database page sizes are 8 KB.  Additionally, Exchange can consolidate writes in 1-MB blocks. Therefore, more changes can be written to the database in fewer I/O operations.
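
The effect of the page size and write consolidation figures above can be illustrated with some simple arithmetic. The following Python sketch is only a best-case illustration: the function names are hypothetical, and it assumes that all pages of a blob are contiguous and fully consolidated, and it ignores per-page overhead.

```python
import math

def pages_needed(blob_bytes: int, page_kb: int) -> int:
    """Number of database pages needed to hold a blob of the given size."""
    return math.ceil(blob_bytes / (page_kb * 1024))

def write_ios(pages: int, page_kb: int, max_io_kb: int) -> int:
    """Best-case number of write I/Os if contiguous pages are consolidated."""
    pages_per_io = max_io_kb // page_kb
    return math.ceil(pages / pages_per_io)

blob = 1024 * 1024  # a 1-MB blob, chosen only as an example

# Exchange 2003 figures from the text: 4-KB pages, 64-KB consolidated writes
pages_2003 = pages_needed(blob, 4)
ios_2003 = write_ios(pages_2003, 4, 64)

# Exchange 2007 figures from the text: 8-KB pages, 1-MB consolidated writes
pages_2007 = pages_needed(blob, 8)
ios_2007 = write_ios(pages_2007, 8, 1024)

print(pages_2003, ios_2003)  # 256 pages, 16 write I/Os
print(pages_2007, ios_2007)  # 128 pages, 1 write I/O
```

Under these simplifying assumptions, the same 1-MB change needs half as many pages and far fewer write operations with the Exchange 2007 figures.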

For more information about these performance improvements, see New Performance and Scalability Functionality.

For detailed planning information for Exchange 2007, see Planning Your Server and Storage Architecture.

Checkpoint Depth

When a transaction is committed, all the changes are written to the transaction log files on disk. However, the database page changes remain in memory in a “dirty” state. These are subsequently written to disk by a background thread. Checkpoint depth is the setting that determines how long such a “dirty” page can remain in memory before it needs to be written to disk. The larger the checkpoint depth, the longer the pages remain dirty. Checkpoint depth is tied to the storage group, so all databases in a storage group share it. The maximum checkpoint depth is 20 MB.

In Exchange 2003, you are limited to four storage groups, with a maximum of 20 databases (5 databases per storage group).  So if you had a storage group fully populated with five mailbox stores and each store had a total of 200 mailboxes, each user would have 0.02 MB for a checkpoint depth.  Therefore, dirty pages have to flush to disk frequently, which causes more write operations and reduces I/O coalescing.

In Exchange 2007, you can have up to 50 databases and up to 50 storage groups (the maximum is still five databases per storage group). Take the same example of 1,000 users, but this time with 10 storage groups of one database each, with each database containing 100 users. Each storage group's 20-MB checkpoint depth is now shared by only 100 users, so each user has 0.2 MB of checkpoint depth, which is ten times more than what was available in Exchange 2003 for the same set of users stored in a single storage group.
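
The two checkpoint depth examples can be reproduced with a short calculation. This Python sketch uses a hypothetical function name and follows the simplification used in this white paper, which divides the 20-MB storage group checkpoint depth evenly among the users in that storage group.

```python
MAX_CHECKPOINT_DEPTH_MB = 20  # maximum checkpoint depth per storage group

def checkpoint_depth_per_user_mb(databases_per_storage_group: int,
                                 users_per_database: int) -> float:
    """Share of the storage group's checkpoint depth per user (even split)."""
    users = databases_per_storage_group * users_per_database
    return MAX_CHECKPOINT_DEPTH_MB / users

# Exchange 2003 example: one storage group, 5 databases, 200 users each
depth_2003 = checkpoint_depth_per_user_mb(5, 200)

# Exchange 2007 example: 1 database per storage group, 100 users each
depth_2007 = checkpoint_depth_per_user_mb(1, 100)

print(f"{depth_2003:.2f} MB per user")  # 0.02 MB per user
print(f"{depth_2007:.2f} MB per user")  # 0.20 MB per user
```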

A larger checkpoint depth helps by keeping dirty pages in-memory for a longer time, which helps reduce write I/Os in two ways:

  • Increases the probability that the dirty page is updated again in a future transaction, thereby telescoping multiple writes into one.

  • Increases the probability that its physically adjacent pages also get dirty. In this case, all these contiguous dirty pages can be written to disk in one large I/O, instead of several small I/Os.

It is more efficient to only deploy a single database within a storage group to reduce the I/O impacts Exchange has on storage.

Storage Considerations for Maximum Mailbox Size

There are a few key considerations with respect to the establishment of a maximum mailbox size for Exchange servers that support large mailboxes. These include:

  • Maximum database size

  • Database growth factor

  • Maintenance capacity

Database size limits need to be balanced with other factors, including backup and recovery time and the complexity of operations management issues related to the increased number of logical unit numbers (LUNs) that must be managed to support the high number of databases. Maximum database size should always be calculated as follows:

Number of mailboxes × Maximum mailbox size × Data overhead factor = Database size
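
As a worked example of this formula, the following Python sketch computes a database size. The function name and the input values (100 mailboxes, a 2-GB quota, and an overhead factor of 1.2) are illustrative assumptions only; determine the overhead factor for your own environment by using the storage planning guidance.

```python
def database_size_gb(mailboxes: int, max_mailbox_gb: float,
                     overhead_factor: float) -> float:
    """Number of mailboxes x maximum mailbox size x data overhead factor."""
    return mailboxes * max_mailbox_gb * overhead_factor

# Hypothetical inputs: 100 mailboxes, 2-GB quota, 20 percent overhead
size = database_size_gb(100, 2, 1.2)
print(f"{size:.0f} GB")  # 240 GB
```

If the result exceeds the database size that you can back up and restore within your RTO, reduce the number of mailboxes per database and add databases instead.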

You can use the Exchange 2007 ability to support 50 storage groups to scale up with large mailboxes while still meeting these database size criteria.

For more information about planning your Exchange storage configuration, see Planning Storage Configurations.

Outlook Online Mode vs. Cached Mode

When optimizing storage for an Exchange Server 2003 database, the size of the database was not a core disk performance issue. The number of items in the core folders (for example, the Calendar, Contacts, Inbox, and Sent Items folders), as well as the client type, were the primary causes of disk performance issues. This is also true with Exchange 2007. Understanding how item count impacts performance is critically important when planning for the deployment of large mailboxes.

Running Outlook 2007 in cached mode can help reduce server I/O. The initial mailbox sync is an expensive operation from a performance perspective, but over time, as the mailbox size grows, the disk subsystem burden is shifted from the Exchange server to the Outlook client. This means that having a large number of items in a user's Inbox will have little effect on the performance of the server. However, this also means that cached mode users with large mailboxes may need faster computers than those with small mailboxes (depending on the individual user perception of acceptable performance).

For more information about how to troubleshoot performance issues in Outlook 2007, see Microsoft Knowledge Base article 940226, How to troubleshoot performance issues in Outlook 2007.

Note

If you are using the Outlook 2007 release to manufacturing (RTM) version in cached mode, download and install the 2007 Microsoft Office Suite Service Pack 1 (SP1).

Note

It is important to keep .ost files free of fragments. You can use the Windows Sysinternals utility Contig to defragment .ost files. To download the latest version of Contig, see Contig v1.5x.

Both Outlook Web Access and Outlook in online mode store indexes on, and search against, the server's copy of the data. For moderate size mailboxes, this results in approximately double the IOPS per mailbox of a comparably sized cached mode client. The IOPS per mailbox for very large mailboxes is even higher. The first time you sort in a new way, such as by size, an index is created, causing many read I/Os to the database disk. Subsequent sorts on an active index are inexpensive.

For more information about high item counts and restricted views, see Understanding the Performance Impact of High Item Counts and Restricted Views.

There are a few methods for maintaining an acceptable item count level that you should evaluate when planning for large mailboxes. Creating more top level folders, or subfolders underneath the Inbox and Sent Items folders, greatly reduces the performance impact associated with this index creation. We recommend that Inbox and Sent Items folder item counts be kept below 20,000 items and that the Contacts and Calendar folder item counts be kept below 5,000. To achieve this, users can manually create these folders and personally manage their core folders, or administrators can implement an automated management process using the Exchange 2007 messaging records management (MRM) feature. We highly recommend that the automated management process be used. For more information about automating mailbox management using MRM to manage core folders, see "Client Usability and Mailbox Management" later in this white paper and the Exchange 2007 documentation topic, Managing Messaging Records Management.
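
The recommended item count limits can be expressed as a simple check. The following Python sketch is illustrative only: the function name and the input dictionary are hypothetical, and the sketch does not show how to obtain item counts from Exchange.

```python
# Recommended limits from this white paper (item counts should stay below these)
RECOMMENDED_MAX_ITEMS = {
    "Inbox": 20000,
    "Sent Items": 20000,
    "Contacts": 5000,
    "Calendar": 5000,
}

def folders_over_limit(item_counts: dict) -> list:
    """Return the core folders whose item count is not below the limit."""
    return sorted(
        name
        for name, count in item_counts.items()
        if name in RECOMMENDED_MAX_ITEMS and count >= RECOMMENDED_MAX_ITEMS[name]
    )

# Hypothetical item counts for one mailbox
counts = {"Inbox": 31250, "Sent Items": 8400, "Calendar": 5000, "Projects": 90000}
print(folders_over_limit(counts))  # ['Calendar', 'Inbox']
```

Folders that are not core folders, such as the hypothetical Projects folder above, are ignored because the recommendation covers only the four core folders.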

These recommended maximum item count values vary depending on whether any third-party programs access user mailboxes. The following applications can have a performance impact as the number of items in a single folder increases:

  • Outlook add-ins

  • E-mail archiving programs

  • Antivirus programs

  • Mobile device programs

  • Voice mail programs

Client Usability and Mailbox Management

One of the biggest challenges for end users when working with e-mail generally, and large mailboxes specifically, is structuring and managing the data so that messages can be found quickly and easily. Relying on the end user to manually manage this mail in a way that will result in the best server performance is unrealistic. Automation is the key to maintaining a usable and high performance large mailbox experience.

MRM is the records management technology in Exchange 2007 that helps organizations reduce the legal risks associated with e-mail and other communications. MRM replaces, and improves upon, the Exchange 2003 tool called Mailbox Manager. MRM makes it easier to keep messages that are needed to comply with company policy, government regulations, or legal needs, and to remove content that has no legal or business value.

This is accomplished through the use of managed folders, which are folders in the user's mailbox to which managed content settings have been applied. The administrator or the user places these managed folders in the user's mailbox, and then the user sorts individual messages or entire folders into the managed folders according to organization policy. Messages in managed folders are then periodically processed by Exchange according to the folder content settings. When a message reaches a retention limit, it is archived or deleted, or the event is simply logged.

Administrators can use MRM to create policies that manage user mailbox data in an automated fashion, which assists users with the job of managing their e-mail. To illustrate what is possible, consider the following sample policy:

  • Move any e-mail older than one year to the Managed Folders\Will Expire folder.

  • Configure the policy on the Will Expire folder to wait 90 days before deletion so users can choose to move any e-mail in this folder to another managed folder with a longer expiry policy before it gets deleted.

  • Configure a managed folder similar to Managed Folders\Retain and set a long expiry policy, such as one year on the managed folder. This folder can be used to retain mail that users want to keep longer than a year. To help organize this folder, create subfolders based on the year:

                     Managed Folders

                           Retain (No Expiry)

                                 2004

                                 2005

                                 2006

                                 2007
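
The decision logic of the sample policy can be sketched in a few lines of Python. This is a conceptual illustration only, not MRM configuration and not an Exchange API: the function name and return strings are hypothetical, a year is approximated as 365 days, and the Retain folder is treated as never expiring, following the "Retain (No Expiry)" label in the folder tree.

```python
from datetime import date, timedelta

WILL_EXPIRE = r"Managed Folders\Will Expire"
RETAIN = r"Managed Folders\Retain"

ONE_YEAR = timedelta(days=365)          # approximation of "one year"
WILL_EXPIRE_GRACE = timedelta(days=90)  # wait period before deletion

def mrm_action(folder: str, received: date, entered_folder: date,
               today: date) -> str:
    """Return what the sample policy would do with one message."""
    if folder.startswith(RETAIN):
        # Assumption: Retain and its year subfolders never expire.
        return "keep"
    if folder == WILL_EXPIRE:
        # Delete only after the message has waited 90 days in Will Expire.
        if today - entered_folder >= WILL_EXPIRE_GRACE:
            return "delete"
        return "keep"
    if today - received > ONE_YEAR:
        # Any other mail older than one year moves to Will Expire.
        return "move to Will Expire"
    return "keep"

today = date(2008, 7, 1)
print(mrm_action("Inbox", date(2007, 1, 15), date(2007, 1, 15), today))
# move to Will Expire
print(mrm_action(WILL_EXPIRE, date(2007, 1, 15), date(2008, 3, 1), today))
# delete
```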

The management of subfolders can be done manually by having users move old messages from the Retain (No Expiry) folder to the correct managed subfolder (for example, messages from the year 2005 are placed in the 2005 folder). The MRM policies can be set for individual users or for everyone in the organization. Having MRM automate the management of users' old mail ensures that old mail is moved to archive folders in a timely fashion. However, this automation does not provide any flexibility for selecting which messages get moved to the managed folders. If you want to move all messages of a specific age, for all users, to a specific managed folder, an MRM policy can easily be configured to do this. This is ideal for organizations that have unlimited expiry policies and want to provide the ability to easily bulk move old items out of the primary folders and into a managed folder hierarchy.

The .ost file performance is another factor to consider if there are mailboxes in your organization that exceed 2 GB. If users of those mailboxes are running Outlook 2007 in cached mode, those users may experience degraded performance as the size of their .ost file grows to more than 2 GB.

To reduce the effect of this issue and to improve performance, we recommend that you configure folders that contain mail that is accessed infrequently not to synchronize to the .ost file.

However, if you install the 2007 Microsoft Office Suite Service Pack 2 (SP2), you should see improved performance and responsiveness if you use Cached Exchange Mode. If you install the 2007 Microsoft Office Suite SP2, use the following guidelines:

  • Up to 5 gigabytes (GB): This size should provide a good user experience on most hardware.

  • Between 5 GB and 10 GB: This size is typically hardware dependent. Therefore, if you have a fast hard disk and plenty of RAM, your experience will be better. However, slower drives, such as the hard disks typically found on portable computers or early-generation solid state drives (SSDs), experience some application pauses when the drives respond.

  • More than 10 GB: This is the size at which short pauses begin to occur on most hardware.

  • More than 25 GB: This very large size increases the frequency of the short pauses, especially when you download new e-mail messages. In that case, you can use Send/Receive groups to sync your mail manually instead.
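
The size guidelines above can be summarized as a small lookup. In this Python sketch, the function name and the short descriptions are hypothetical paraphrases of the guidelines, and the treatment of the exact boundary values (5, 10, and 25 GB) is an assumption.

```python
def ost_guideline(size_gb: float) -> str:
    """Map an .ost file size to the matching SP2 guideline tier."""
    if size_gb <= 5:
        return "good user experience on most hardware"
    if size_gb <= 10:
        return "hardware dependent"
    if size_gb <= 25:
        return "short pauses on most hardware"
    return "frequent short pauses"

for size in (3, 8, 12, 30):
    print(f"{size} GB: {ost_guideline(size)}")
```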

Because Outlook 2007 and Outlook 2003 have the flexibility to limit which folders are synchronized to a user's .ost file, you can reduce the size of an .ost file by configuring synchronization policies on a per-folder basis. Users can configure their .ost file to sync only folders that contain data that they have to regularly access. By configuring the .ost files in this way, users can have mailboxes that exceed the recommended size and still have an acceptable user experience when they use cached mode. For those infrequent times when users have to access data that is older than what is configured to synchronize to their .ost file, they can open Outlook in online mode or use Outlook Web Access or Outlook Mobile.

Developing rule sets that effectively maintain manageable item counts in the core folders is important to maintaining high levels of performance for your Exchange servers. This will require analysis of your user environment and how the MRM rules affect that environment. As you gain an understanding of how the rules impact folder item counts, you can tune the rules to achieve the item count levels that are supportable in your environment.

To learn more about the areas of client usability and mailbox management with Exchange 2007, see the following:

Client Throttling

With the release of Exchange 2007, a new feature called remote procedure call (RPC) client throttling is available to administrators to help manage the end-user performance experience and reduce the possibility of server monopolization by a small number of highly active users. RPC client throttling can help prevent client applications from sending too many RPC operations per second to the Exchange server, which could degrade overall server performance. These client applications include desktop search engines searching through every object inside a user's mailbox, custom applications written to manipulate data located in Exchange mailboxes, enterprise class e-mail archiving products, and customer relationship management (CRM)-enabled mailboxes with e-mail auto-tagging enabled.

When the Exchange server identifies a client as causing a disproportionate impact on the server, it sends a back-off request to that client to reduce the performance impact. This feature is particularly important when large mailboxes are deployed and in situations where high item counts exist, because very active users with large mailboxes and high item counts can disproportionately impact server performance. The back-off request minimizes their impact on the server and on the rest of the users on the server.
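
The general idea of a server asking an overly active client to back off can be sketched as follows. This Python example is a conceptual illustration that uses simple fixed-window counting; it is not how Exchange 2007 identifies clients or decides when to send a back-off request, and the class name, method name, and threshold are all hypothetical.

```python
from collections import defaultdict

class RpcThrottle:
    """Fixed-window counter: flag clients that exceed a per-second limit."""

    def __init__(self, max_ops_per_sec: int):
        self.max_ops_per_sec = max_ops_per_sec
        # Maps (client, whole second) to the operations seen in that second.
        # Old entries are never pruned in this sketch.
        self._counts = defaultdict(int)

    def record(self, client: str, timestamp: float) -> bool:
        """Record one operation; return True if the client should back off."""
        key = (client, int(timestamp))
        self._counts[key] += 1
        return self._counts[key] > self.max_ops_per_sec

throttle = RpcThrottle(max_ops_per_sec=3)
# One client sends five operations within the same second.
results = [throttle.record("archiver", 10.0 + i * 0.1) for i in range(5)]
print(results)  # [False, False, False, True, True]
```

A quieter client in the same second is unaffected, which mirrors the goal of limiting only the clients that have a disproportionate impact.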

For more information about Exchange 2007 client throttling functionality, see Understanding Exchange 2007 Client Throttling.

You can also disable MAPI application access to an Exchange 2007 computer. You can disable MAPI client access according to the application executable name. You can use this feature to help prevent problematic or beta client MAPI applications from running against an Exchange Server computer.

Rapid Disaster Recovery with Large Databases and Multiple Databases

Historically, the disaster recovery duration has been one of the biggest issues impacting the ability to deploy large mailboxes. Large mailboxes equate to large databases, which equate to long backup and restore times. Large mailboxes make it more challenging to design a solution that has an acceptable RTO. In the event of a failure or disaster where a database, or entire server, needs to be restored from backup, meeting an acceptable RTO goal is critical. To achieve an acceptable RTO with large mailboxes in Exchange 2007, you must use a storage area network-based solution and a hardware-based Volume Shadow Copy Service (VSS) solution that maintains two copies of the data. However, this scenario does not protect your data from hardware failure as both the active copy and the additional copy reside on the same array.

To illustrate this challenge, consider a server with 4,000 user mailboxes that contain a large amount of mail and where each message is 50 KB in size. This is 8 terabytes of database data and 128 GB of logs per day. To decrease the restore time, you could provision new storage, initiate the restore process, and then replay the log files. Assuming a Fibre Channel tape device capable of 192 megabytes (MB) per second is used, recovery time would be approximately 12 hours, log replay time would add about two hours, and content indexing would take an additional 12 days. That means the best case to return a server to its state prior to the failure is 12 days and 14 hours. With Exchange 2007, other lower cost options are now available to help meet RTOs and bring services back online rapidly, inexpensively, and with a low level of complexity.
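The arithmetic behind this estimate can be reproduced with a short calculation. The following Python sketch is illustrative only: the function name and the binary unit conversion (1 TB = 1,048,576 MB) are assumptions, and the log replay and content indexing durations are taken directly from the example above, not computed.

```python
MB_PER_TB = 1024 * 1024  # assumption: binary units


def restore_hours(database_tb: float, throughput_mb_per_sec: float) -> float:
    """Hours needed to stream a database of the given size from backup media."""
    seconds = database_tb * MB_PER_TB / throughput_mb_per_sec
    return seconds / 3600


# Figures from the example above.
restore = restore_hours(8, 192)  # 8 TB at 192 MB per second
log_replay_hours = 2             # stated in the text, not computed
indexing_days = 12               # stated in the text, not computed

# Round the restore to whole hours, as the text does, then add the other phases.
total_hours = round(restore) + log_replay_hours + indexing_days * 24
days, hours = divmod(total_hours, 24)
print(f"Restore: {restore:.1f} h; total: {days} days {hours} hours")
```

Substituting your own database size and measured device throughput gives a first approximation of whether a restore from backup can fit within your RTO.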

Exchange 2007 now offers the ability to deploy a simple, inexpensive, and fast recovery solution that scales well for use with large mailboxes and also provides high availability. This solution is CCR on direct attached storage.

CCR is a non-shared storage failover cluster solution that uses built-in asynchronous log shipping technology to create and maintain a copy of each storage group on a second server in a failover cluster. CCR is designed to be either a one or two data center solution, providing both high availability and site resilience. CCR is different from clustering in previous versions of Exchange. For details about some of the differences, see Cluster Continuous Replication Resource Model and Cluster Continuous Replication Recovery Behavior.

When CCR is deployed with large mailboxes, it should be the primary recovery and restore mechanism, combined with a weekly full, and daily incremental, off server backup schedule. This solution is effective for either SAN or direct attached storage solutions.

The following are new Exchange 2007 features that provide additional recovery options, high availability, and backup and restore functionality, which can be used as part of your large mailbox deployment strategy:

  • Standby continuous replication (SCR)   SCR is a new feature introduced in Exchange 2007 SP1. SCR is designed for scenarios that use, or enable the use of, standby recovery servers. SCR extends the existing continuous replication features and enables new data availability scenarios for Exchange 2007 Mailbox servers. SCR uses the same log shipping and replay technology used by LCR and CCR to provide added deployment options and configurations by providing the administrator with the ability to create additional storage group copies. SCR can be used to replicate data from stand-alone Mailbox servers and from clustered mailbox servers. For further details about SCR, see Standby Continuous Replication.

  • Improved backup and restore   When you use LCR or CCR with a hardware-based VSS solution, Exchange 2007 enables you to offload Exchange-aware VSS backups from the active copy of a database to a passive copy of a database. Taking a VSS snapshot on the passive copy removes the disk load from the production disks during both the checksum integrity check (Eseutil) and the subsequent copy to disk or tape. This also frees up more time on the production disks to run online maintenance, MRM, and other tasks.

    Note

    Currently, the Windows Server Backup tool is not capable of performing Exchange backups using VSS. A VSS-enabled backup product such as Microsoft System Center Data Protection Manager is required. A VSS-based plug-in for Windows Server Backup is in development. This plug-in will let you properly back up and restore Exchange 2007 with the built-in Windows Server 2008 backup application.

  • Database portability   Database portability provides several features, including the ability to port and recover a database on another server in the Exchange organization. Database portability enables faster disaster recovery strategies to be implemented for both site-level disasters and hardware failures for Exchange 2007 servers. For additional information about database portability, see Database Portability.

  • Dial tone portability   When a database, server, or data center is lost, you can use dial tone portability to provide access to a new dial tone database on another server in the Exchange organization. For additional information about dial tone portability, see Dial Tone Portability.

If you are not using a VSS backup solution, we highly recommend that disk to disk (D2D) or disk to disk to tape (D2D2T) be considered to increase the speed, efficiency, and reliability of your non-VSS backup solution. If your organization does not have legal or policy obligations requiring long-term retention of tape backups, consider eliminating tape from your backup strategy. For information, see the following Exchange Design whitepapers:

Microsoft System Center Data Protection Manager 2007 offers a fast backup solution with smart technology that can help meet your D2D or D2D2T needs by offering seamless data protection for Exchange by leveraging integrated disk and tape media. For more information about DPM, see System Center Data Protection Manager 2007.

For more information about the high availability and disaster recovery features of Exchange 2007, see:

For more information about using Microsoft System Center Data Protection Manager with Exchange 2007, see:

Operations Management

Large mailboxes bring with them operational challenges that must be evaluated and mitigated to ensure that the health of your Exchange environment is easily maintained over time. Some areas that have historically been problematic with regard to large mailboxes include:

  • Long backup times   Performing daily full backups to disk or tape with large mailboxes may not fit in the backup window (too much data in the allowable time).

  • High log generation during move mailbox operations   Moving large mailboxes around an organization can be challenging because of the massive number of log files created.

  • Long database maintenance times   Performing both offline and online database operations may take much longer and possibly exceed maintenance windows. Examples include:

    • Online maintenance (online defragmentation)

    • Offline defragmentation

    • Offline repair

Long Backup Times

There are many different backup and restore methods available to the Exchange administrator. The key metric with backup and restore is the throughput, or the number of megabytes per second that can be copied to and from your production disks. After you determine the throughput, you need to decide if it is sufficient to meet your backup and restore SLA. For example, if you need to be able to complete the backup within four hours, you may have to add more hardware or choose a different method to achieve it.

Large mailboxes require you to handle very large amounts of data on a single Exchange server. Consider a server with two thousand 2-GB mailboxes. When factoring in the disk overhead, this is more than 4 terabytes of data. Assuming you can achieve a backup rate of 175 GB per hour (48 MB per second) using your backup solution, it would take at least 23 hours to back up this Exchange server. By leveraging CCR, with its fast recovery ability, it is possible to move to a backup strategy where a full backup of one-seventh of the databases is performed each day, combined with an incremental backup on the remainder of the databases. This is all possible when using a hardware-based VSS solution where the backups can be taken off the passive node, thereby minimizing performance impact and allowing more time for online maintenance and MRM to run. The full backup plus incremental backup strategy is the secondary recovery mechanism, so the additional time to restore the full backup plus all incremental backups may be an acceptable solution.
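The backup window arithmetic above can be sketched as follows. This is a rough illustration, not a sizing tool: the function names are hypothetical, the data is assumed to be spread evenly across the databases, and the time for the daily incremental backups is left out because it depends on log volume, which the example does not specify.

```python
def full_backup_hours(data_gb: float, rate_gb_per_hour: float) -> float:
    """Hours needed to take a full backup of data_gb at the given rate."""
    return data_gb / rate_gb_per_hour


def rotating_full_hours(data_gb: float, rate_gb_per_hour: float,
                        cycle_days: int = 7) -> float:
    """Hours per day spent on full backups when 1/cycle_days of the data
    receives a full backup each day (assumes evenly sized databases)."""
    return full_backup_hours(data_gb / cycle_days, rate_gb_per_hour)


data_gb = 4 * 1024   # "more than 4 terabytes", treated here as exactly 4 TB
rate = 175           # GB per hour, from the example above

daily_full = full_backup_hours(data_gb, rate)
one_seventh = rotating_full_hours(data_gb, rate)
print(f"Daily full backup of everything: {daily_full:.1f} hours")
print(f"Full backup of one-seventh per day: {one_seventh:.1f} hours")
```

Under these assumptions the daily full-backup workload drops from more than 23 hours to between three and four hours, before the incremental backups of the remaining databases are added.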

Note

Restoring data to a production site from the full backup plus incremental backups is necessary only during a cascading failure. Otherwise, these restores are performed only to extract data that has been removed from the dumpster.

If you choose to leverage CCR plus SCR, you can raise the number of online copies to three, which further reduces the likelihood that recovery from backup would be necessary.

High Log Generation During Move Mailbox Operations

The performance impact of moving mailboxes is a factor that affects capacity planning when deploying large mailboxes. For various reasons, most large companies move a percentage of their users on a nightly or weekly basis to different databases, servers, or sites. When deploying large mailboxes, it may be necessary to provision the log drive to accommodate mailbox move activities. Although the source Exchange server logs only the record deletions, which are small, the target server must first write everything that is transferred to the transaction logs. If you normally generate 10 GB of log files in one day and keep a three-day buffer of 30 GB, moving fifty 2-GB mailboxes (100 GB) would fill up your target log drive and cause downtime. To account for these additional operational requirements, we recommend that increased capacity be allocated for the log drives. For more information about planning your capacity requirements, see Exchange 2007 Mailbox Server Role Storage Requirements Calculator.
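A quick check of log drive headroom before a batch of moves might look like the following Python sketch. The function name is hypothetical, and the model makes two simplifying assumptions: the target server writes roughly one byte of log for every byte of mailbox data moved, and no logs are truncated by a backup while the moves are running.

```python
def log_drive_shortfall_gb(daily_log_gb: float, buffer_days: int,
                           mailboxes_moved: int, mailbox_size_gb: float) -> float:
    """GB by which the target log drive falls short during a day of moves.

    Returns 0 if the existing buffer can absorb the moves.
    """
    capacity_gb = daily_log_gb * buffer_days          # provisioned log space
    move_log_gb = mailboxes_moved * mailbox_size_gb   # logs written for the moves
    demand_gb = daily_log_gb + move_log_gb            # normal day plus the moves
    return max(0, demand_gb - capacity_gb)


# Figures from the example above: 10 GB per day, three-day buffer,
# fifty 2-GB mailboxes moved to the target server.
shortfall = log_drive_shortfall_gb(10, 3, 50, 2)
print(f"Additional log capacity needed: {shortfall} GB")
```

A result greater than zero suggests either provisioning more log capacity or moving fewer mailboxes between backups.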

Long Database Maintenance Times

Exchange 2007 provides the ability to reduce maintenance times including the time required to perform online and offline defragmentation and offline repair.

Online Maintenance (Online Defragmentation)

New ESE features in Exchange 2007 SP1 reduce the time required for online maintenance to run while also providing better events and Performance Monitor counters to track online maintenance completion and value over time. These SP1 changes include:

  • Removal of page dependencies

  • Disablement of partial merges

  • Passive node I/O improvements

  • Online defragmentation

  • Checksumming databases

  • Page zeroing

For more information about these new ESE features, see Exchange Server 2007 SP1 ESE Changes - Part 1 and Exchange 2007 SP1 ESE Changes - Part 2.

Note

MRM is a scheduled operation that runs against the database in a synchronous read operation similar to backup and online maintenance. The disk cost of MRM depends upon the number of items requiring action (for example, delete or move). We recommend that MRM not run at the same time as either backups or online maintenance. Internal testing at Microsoft shows that MRM can crawl 100,000 items in five minutes. If you use CCR and VSS backups, you can offload the VSS backups to the passive copy, which allows more time for online maintenance and MRM so that neither impacts the other.

Offline Defragmentation

Offline defragmentation is not a necessary maintenance step. Some administrators who have deployed third-party archiving solutions configured to use stub files choose to perform it to compact the database after the message body and attachments have been archived out of Exchange. With the deployment of large mailboxes and the issues related to high item counts, stub files should not be necessary and should be avoided. In the event that a database needs to be compacted, moving mailboxes can be used as an alternative to offline defragmentation. Moving mailboxes has the added benefit of reducing the user impact by limiting service interruption to only those users being moved, whereas offline defragmentation impacts all users homed on the database. For more information about why offline defragmentation should not be considered regular Exchange maintenance, see Is offline defragmentation considered regular Exchange maintenance?

Offline Repair

Offline repair is a disaster recovery activity that is extremely rare when CCR is in use. It is something you must do to recover your data only if your other recovery options fail. You can have a disaster recovery plan that does not require you to perform an offline repair. However, if your disaster recovery plan includes offline repair, we currently recommend that your database size be limited to 200 GB. It is possible to increase this limit by deploying faster disks and disk subsystems. For example, SAS RAID10 arrays can repair databases at over 100 MB per second.
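To see how the database size limit scales with disk throughput, the following Python sketch estimates the time for one sequential pass over a database. The function name is hypothetical, and the model is deliberately simple: it covers a single pass at a constant rate, so an actual repair is likely to take longer.

```python
def single_pass_minutes(database_gb: float, throughput_mb_per_sec: float) -> float:
    """Minutes for one sequential pass over a database at a constant rate."""
    seconds = database_gb * 1024 / throughput_mb_per_sec
    return seconds / 60


# Figures from the text: a 200-GB database on an array sustaining 100 MB per second.
at_limit = single_pass_minutes(200, 100)

# Doubling the database size at the same throughput doubles the pass time.
doubled = single_pass_minutes(400, 100)
print(f"200 GB: {at_limit:.0f} minutes; 400 GB: {doubled:.0f} minutes")
```

Comparing the result against the repair window your disaster recovery plan allows indicates whether a larger database size is realistic for your disk subsystem.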

Conclusion

By deploying large mailboxes with Exchange 2007 in your organization and leveraging the high availability and fast recovery features of CCR and the automated mailbox management features of MRM, you can achieve significant improvements in knowledge worker productivity and service availability, at a lower cost per mailbox, with improved end-user satisfaction and reduced administrative overhead.

Additional Information

For the complete Exchange 2007 documentation, see Exchange Server 2007 Help.

For more information about Exchange 2007, see the following resources: