New Exchange Core Store Functionality
Applies to: Exchange Server 2010 SP2
Topic Last Modified: 2012-02-28
Microsoft Exchange Server 2010 includes many improvements to the Exchange database architecture:
- Public folder reporting has been enhanced.
- Databases are no longer associated with storage groups. Storage groups have been removed.
- Investments in store schema and Extensible Storage Engine (ESE) optimizations have reduced IOPS by 70 percent.
The following sections describe these improvements in more detail.
Public folder reporting has been enhanced to view user-initiated changes to any item in the public folder. You can view this information by using the Get-PublicFolderStatistics cmdlet in the Exchange Management Shell. For more information, see Exchange Management Shell.
Databases are no longer associated with storage groups. In Exchange 2010, storage group functionality has been moved to the database.
In Exchange 2010, you manage mailbox and public folder databases in the Organization Configuration node of the Exchange Management Console (EMC). (In Exchange Server 2007, database management was performed in the Server Configuration node.)
Although public folder database management has been moved from the Server Configuration node to the Organization Configuration node with the mailbox databases, the functionality of public folder databases hasn't changed in Exchange 2010. Just like in Exchange 2007, you can't create database copies of public folder databases, and you can't add public folder databases to a database availability group (DAG). However, public folder databases can be hosted on Mailbox servers that are part of a DAG, although public folder databases won't be subject to log shipping or any other DAG features.
With the removal of storage groups in Exchange 2010, the storage group cmdlets used in Exchange 2007 were deleted and the Exchange 2010 database cmdlets now provide the functionality, as shown in the following tables.
Database cmdlets in Exchange 2010 that replace Exchange 2007 storage group cmdlets
|Exchange 2007 cmdlet|Description of functionality change in Exchange 2010|
|New-StorageGroup|This cmdlet has been deleted, and configuration parameters were moved to the New-MailboxDatabase and New-PublicFolderDatabase cmdlets.|
|Remove-StorageGroup|This cmdlet has been deleted, and configuration parameters were moved to the Remove-MailboxDatabase and Remove-PublicFolderDatabase cmdlets.|
|Set-StorageGroup|This cmdlet has been deleted, and configuration parameters were moved to the Set-MailboxDatabase and Set-PublicFolderDatabase cmdlets.|
|Get-StorageGroup|This cmdlet has been deleted, and configuration parameters were moved to the Get-MailboxDatabase and Get-PublicFolderDatabase cmdlets.|
|Move-StorageGroupPath|This cmdlet has been deleted, and configuration parameters were moved to the Move-DatabasePath cmdlet.|
Database cmdlets in Exchange 2010 that have extended functionality from Exchange 2007 cmdlets
|Exchange 2010 cmdlet|Description of extended functionality in Exchange 2010|
|New-MailboxDatabase, New-PublicFolderDatabase|These cmdlets have been extended with the parameters and functionality from the New-StorageGroup cmdlet. They also update the server object with a link to the new database and the database object with the hosting server name.|
|Remove-MailboxDatabase, Remove-PublicFolderDatabase|These cmdlets have been extended with the parameters and functionality from the Remove-StorageGroup cmdlet. They also update the server object with the link to the new database and the database object with the hosting server name.|
|Set-MailboxDatabase, Set-PublicFolderDatabase|These cmdlets have been extended with the parameters and functionality from the Set-StorageGroup cmdlet. When changing the host servers, they also update the server object with the link to the new database and the database object with the hosting server name.|
|Get-MailboxDatabase, Get-PublicFolderDatabase|These cmdlets have been extended with the parameters and functionality from the Get-StorageGroup cmdlet. The Status parameter is extended to return the status information currently returned by the Get-StorageGroupCopyStatus cmdlet.|
|Move-DatabasePath|This cmdlet has been extended with the parameters and functionality from the Move-StorageGroupPath cmdlet.|
In addition to the preceding cmdlet changes, the StorageGroupCopy cmdlets have been deleted. For more information, see Managing Mailbox Database Copies.
In Exchange 2010, the store schema has been changed to remove the dependency of mailbox databases on the server object. In addition, the new schema has been improved to help reduce database I/O per second (IOPS) by refactoring the tables used to store information. Refactoring the tables allows higher logical contiguity and locality of reference. These changes reduce the store's reliance on the secondary indexes maintained by ESE. As a result, the store is no longer sensitive to performance issues related to the secondary indexes.
Store resilience and health have also been improved by adding several features that detect and correct errors and provide alerts, such as the following:
- Mailbox quarantine on rogue mailboxes
- Transport cutoff to databases with less than 1 GB of space
- Thread time-out detection and reporting
For more information about store resilience and health, see Understanding the Exchange 2010 Store.
Core store functionality has received many changes to improve high availability features. High availability has been integrated into the core architecture of Exchange 2010 to enable organizations of all sizes and in all industry segments to economically deploy a messaging continuity service. For more information about the high availability changes in Exchange 2010, see Understanding High Availability and Site Resilience.
Extensible Storage Engine (ESE) has been improved in Exchange 2010 to achieve the following goals:
- Larger I/O and sequential I/O to reduce IOPS
- Optimization for commodity storage
- Database management reduction
- Online database scanning
By increasing the size of the I/O and reducing the frequency of read/writes in Exchange 2010, ESE is able to increase performance. In addition, ESE can increase performance by making the data in the database more sequential, which increases the likelihood that related data is in the same vicinity in the B-tree.
In Exchange, all data inside the database is stored in B-trees, and the B-trees are then divided into pages. In Exchange 2007 and earlier, the data stored in the B-trees isn't contiguous. In fact, previous versions of Exchange performed random read/writes to the database. This means that related data may not be in the same vicinity on the hard disk. Non-contiguous data requires more passes to read and write to the hard disk.
The B-tree defragmentation process has been improved to reduce I/O operations by maintaining contiguous data in the B-tree.
B-tree defragmentation is performed in-place (as opposed to creating a new B-tree and renaming the indexes and tables) with three new operations:
- Page move: A page move consists of moving all data from one page to a newly allocated page.
- Partial left merge: A partial left merge is the same as a partial right merge in Exchange 2007 or earlier, except that data is moved from the right page to the left page.
- Full left merge: A full left merge is the same as a full right merge in Exchange 2007 or earlier, except that data is moved from the right page to the left page.
Defragmentation has been changed from right merges to left merges to optimize performance. Data is read from or written to the hard disk from left to right. If the database is being defragmented in the opposite direction to the read/writes, defragmentation will conflict with the read/writes. In addition, space allocation allows the next page in an extent to be allocated, but not the previous page. Because a page move needs to allocate a new page, defragmenting the database from left to right is much more efficient.
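The operations above can be illustrated with a toy model. The following Python sketch is not ESE's implementation: it treats a page as a plain list of keys, and `PAGE_CAPACITY`, `page_move`, and `left_merge` are hypothetical names introduced only for illustration. It shows a page move onto a newly allocated page and a left merge that pulls records from the right page into the left page.

```python
from typing import List

PAGE_CAPACITY = 4  # hypothetical number of records that fit on one page


def page_move(pages: List[List[int]], src: int) -> int:
    """Move all records from pages[src] to a newly allocated page.

    Returns the index of the new page. The source page is left empty.
    """
    pages.append(pages[src][:])  # allocate the next page and copy the records
    pages[src] = []              # the source page is now free
    return len(pages) - 1


def left_merge(left: List[int], right: List[int]) -> str:
    """Move records from the right page into free space on the left page.

    Returns "full" if the right page was emptied, "partial" if only some
    records fit, or "none" if nothing could be moved.
    """
    free = max(0, PAGE_CAPACITY - len(left))
    moved = right[:free]          # lowest keys on the right page move first
    left.extend(moved)
    del right[:len(moved)]
    if not moved:
        return "none"
    return "full" if not right else "partial"


left, right = [1, 2], [3, 4, 5]
print(left_merge(left, right), left, right)  # partial [1, 2, 3, 4] [5]
```

In this model a full left merge frees the right page entirely, while a partial left merge only fills the spare room on the left page.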
The Defragmentation Manager is a new component in ESE that monitors which B-trees require defragmenting and which B-trees have already been defragmented. The Defragmentation Manager compiles a list of the B-trees in all mounted databases that should be defragmented. As fragmented B-trees are discovered, they're registered with the Defragmentation Manager, which then processes them.
All data inside the database is stored in B-trees, and the B-trees are divided into pages. The page size is the minimum size for reading and writing to the database; it's also the unit size used for database caching. Reading from the disk is slower than performing operations in memory. By increasing the page size to 32 KB (up from 8 KB in Exchange 2007), ESE retrieves and caches more data with each I/O, which reduces IOPS and increases performance.
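As a rough illustration of why a larger page size lowers the number of I/O operations, the following Python sketch counts the page reads needed to fetch a contiguous run of data. It takes 8 KB as the Exchange 2007 page size, and the 256 KB item size and the `page_reads` helper are hypothetical. The sketch models page size only, not the other store and ESE optimizations.

```python
import math


def page_reads(data_kb: int, page_kb: int) -> int:
    """Number of page-sized reads needed to fetch data_kb of contiguous data."""
    return math.ceil(data_kb / page_kb)


item_kb = 256  # hypothetical size of the data being read
print(page_reads(item_kb, 8))   # 32 reads with 8 KB pages
print(page_reads(item_kb, 32))  # 8 reads with 32 KB pages
```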
Another of the goals of ESE in Exchange 2010 is to reduce the capital and operational costs of deploying Exchange. This can be done by reducing storage costs and optimizing for commodity storage using JBOD and SATA class hard disks.
Disk subsystems are more efficient at handling fewer but larger I/Os. In Exchange 2007 or earlier, the page size is the minimum read/write size and the minimum size for database caching. Coalescing I/Os refers to the process of combining database page operations into a single I/O operation, thereby producing fewer and larger I/O operations.
Increasing the average database I/O sizes via coalescing I/Os has the following benefits:
- Increased disk use efficiency: Disks are more efficient at processing large I/Os. The more efficiently the disk is utilized, the more mailboxes can be hosted on that disk.
- Increased cache warming rate: Cache warming is a process that helps reduce the execution times by preloading the initial queries that were executed against a database the last time the database was started. After a server restart, failover, or switchover, the larger I/O allows ESE to increase the rate at which the cache is warmed.
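The idea of coalescing can be sketched as grouping adjacent page numbers into runs, where each run becomes one larger I/O. The following Python example is a simplified illustration under that assumption, not the algorithm ESE uses, and `coalesce` is a hypothetical helper name.

```python
from typing import Iterable, List, Tuple


def coalesce(page_numbers: Iterable[int]) -> List[Tuple[int, int]]:
    """Group page numbers into (start_page, page_count) runs of adjacent pages."""
    runs: List[Tuple[int, int]] = []
    for page in sorted(set(page_numbers)):
        if runs and page == runs[-1][0] + runs[-1][1]:
            start, count = runs[-1]
            runs[-1] = (start, count + 1)  # page extends the current run
        else:
            runs.append((page, 1))         # page starts a new run
    return runs


# Six single-page operations collapse into two larger I/Os.
print(coalesce([7, 3, 4, 5, 9, 8]))  # [(3, 3), (7, 3)]
```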
One of the goals of ESE in Exchange 2010 is to reduce the cost of maintaining and managing a database. Database maintenance consists of several tasks that manage and preserve the integrity of your mailbox database.
Database maintenance is divided into the following:
- Store mailbox maintenance
- ESE database maintenance
In Exchange 2007, ESE database maintenance was disk-intensive and usually took from six to eight hours per night to complete on large databases (2 GB quotas). In Exchange 2010, improvements have been made to increase performance. By comparison, on large or very heavy profile servers running Exchange 2010, the store mailbox maintenance task lasts only approximately 45 minutes.
In Exchange 2010, improvements have been made to support large mailboxes, JBOD storage, and storage without the use of RAID.
Note: All Exchange Store-focused online database maintenance functions, such as recovery item cleanup, are the same in Exchange 2010 as they are in Exchange 2007. Only the ESE functions (online defragmentation and database checksumming) have changed.
Defragmentation makes the internal pages of an Exchange database contiguous. Defragmentation can either be performed automatically by the system while the database is online (online defragmentation) or manually by an administrator when the database is offline (offline defragmentation).
In Exchange 2010, the architecture for online defragmentation has changed. Online defragmentation was moved out of the Mailbox database maintenance process. Online defragmentation now runs in the background 24×7. Because online defragmentation runs all the time, Exchange no longer posts events to the event log indicating the amount of white space in the database. During background database maintenance, items marked for removal from the database are removed, which frees up database pages. The percentage of white space is constantly changing due to the efforts of the continuous online defragmentation process.
You can estimate the amount of white space in the database by knowing the amount of mail sent and received by the users with mailboxes in the database. For example, if you have 100 2-GB mailboxes (total of 200 GB) in a database where users send and receive an average of 10 MB of mail per day, the amount of white space is approximately 1 GB (100 mailboxes × 10 MB per mailbox). The amount of white space can exceed this estimate if background database maintenance isn't able to complete a full pass.
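The estimate above is simple multiplication, as the following Python sketch shows. The function name `estimate_white_space_mb` is hypothetical, and the result is only the rough approximation described in the text, not a value reported by Exchange.

```python
def estimate_white_space_mb(mailboxes: int, daily_mail_mb: float) -> float:
    """Approximate database white space as one day of mail flow, in MB."""
    return mailboxes * daily_mail_mb


white_space_mb = estimate_white_space_mb(mailboxes=100, daily_mail_mb=10)
print(white_space_mb)         # 1000 MB
print(white_space_mb / 1000)  # 1.0, that is, approximately 1 GB
```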
You don't need to configure any settings for this feature. Exchange monitors the database as it's being used, and small changes are made over time to keep it defragged for space and contiguity. If the database analyzes a range of pages and finds that they aren't as sequential as they should be, it starts an async thread to defragment that section of the B-tree/table. Online defragmentation is also throttled so it doesn't have a negative impact on client performance.
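One way to picture the "not as sequential as they should be" check is as a contiguity score over the physical page numbers of a B-tree, taken in logical order. The following Python sketch is an illustration only: the scoring rule, the 0.8 threshold, and the names `contiguity` and `needs_defrag` are assumptions, not ESE's actual heuristic.

```python
from typing import Sequence


def contiguity(page_numbers: Sequence[int]) -> float:
    """Fraction of logically adjacent pages that are also physically adjacent."""
    if len(page_numbers) < 2:
        return 1.0
    pairs = zip(page_numbers, page_numbers[1:])
    sequential = sum(1 for a, b in pairs if b == a + 1)
    return sequential / (len(page_numbers) - 1)


def needs_defrag(page_numbers: Sequence[int], threshold: float = 0.8) -> bool:
    """Flag a page range whose contiguity falls below a hypothetical threshold."""
    return contiguity(page_numbers) < threshold


print(contiguity([10, 11, 12, 13]))        # 1.0
print(contiguity([10, 11, 50, 51, 52]))    # 0.75
print(needs_defrag([10, 11, 50, 51, 52]))  # True
```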
Use the ESE performance counter set MSExchange Database ==> Defragmentation Tasks to see the tasks that are performed. For more information, see How to Enable Extended ESE Performance Counters.
Offline defragmentation is a manual process that is performed by an administrator while the database is in a dismounted (offline) state. In this process, the ESEUTIL tool is used to read the database file and write a new database file using the contents in a contiguous fashion. The offline defragmentation process doesn’t copy the white space from the original database; therefore, the size of the newly created database file is smaller than the original database on disk (potentially much smaller, depending on the amount of white space in the database). Historically, the typical reasons for performing an offline defragmentation of a database included the following:
- To shrink the size of the database file on disk
- To reclaim white space in a database
- To avoid low free disk space
- To repair a damaged database (the second step in the repair following ESEUTIL /p)
Offline defragmentation has never been part of regular maintenance for Exchange databases and, for some time now, Microsoft has recommended against regular, proactive offline defragmentation of databases. This recommendation was made for a variety of reasons, including the following:
- It results in downtime because you have to take the database offline.
- In a replicated mailbox database environment, defragmenting an active copy offline means that all of its passive copies must be re-seeded, and defragmenting a passive copy offline means that the passive copy itself must be re-seeded. (Thus, you should never perform an offline defragmentation of a passive database copy.)
- It results in the creation of a new database with a new database signature, which eliminates the ability to restore log files from a backup of the database that was taken prior to offline defragmentation.
As an alternative to offline defragmentation, we recommend that customers create a new database and move the mailboxes to the newly created database. In an Exchange 2010 environment, the mailboxes are moved online with no interruption in service to end users. In addition, when you move all mailboxes from an existing database to a new database, the end result is the same: A defragmented database with pages written contiguously and with no appreciable white space in the database file. After that process is complete, you simply delete the old (now empty) database. This guidance only covers proactive offline defragmentation to reclaim white space. You should still perform defragmentation if directed to do so by Microsoft Customer Support Services.
Online database scanning (also known as database checksumming) has also changed. In Exchange 2007 Service Pack 1 (SP1), there was an option to use half of your online defragmentation time for this database scanning process (to ensure Exchange read every page from your database in a specific period of time to detect any corruptions).
In Exchange 2010, online database scanning checksums the database and performs post-crash operations for the Exchange 2010 Store. Space can be leaked due to crashes, and online database scanning finds and recovers lost space. Exchange 2010 is designed with the expectation that every database is fully scanned once every seven days. A warning event is logged if a database isn't completely scanned in this timeframe. In Exchange 2010, there are now two modes to run online database scanning on active database copies:
- Run as the last task in the scheduled Mailbox Database Maintenance process. You can configure how long it runs by changing the Mailbox Database Maintenance schedule. You can use this option for smaller databases that are less than 1 terabyte (TB) in size, which require less time to complete a full scan.
- Run in the background 24×7, the default behavior. This option works well for all database sizes, but it's recommended for large database sizes (1-2 TB in size). Exchange scans the database no more than once per day. This read I/O is 100 percent sequential (which makes it easy on the disk) and equates to a scanning rate of about 5 megabytes (MB)/sec on most systems.
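To relate the scanning rate to the seven-day expectation, the following Python sketch estimates how long one full pass takes at about 5 MB/sec. It assumes a constant rate and binary units (1 TB = 1,024 × 1,024 MB), and `full_scan_days` is a hypothetical helper; actual scan times depend on the system.

```python
SCAN_RATE_MB_PER_SEC = 5  # approximate rate quoted in this topic
SCAN_WINDOW_DAYS = 7      # every database is expected to be fully scanned in this window


def full_scan_days(db_size_tb: float, rate_mb_per_sec: float = SCAN_RATE_MB_PER_SEC) -> float:
    """Days needed for one full sequential pass over a database of db_size_tb."""
    size_mb = db_size_tb * 1024 * 1024
    seconds = size_mb / rate_mb_per_sec
    return seconds / 86400  # 86,400 seconds per day


for size_tb in (1, 2):
    days = full_scan_days(size_tb)
    print(f"{size_tb} TB: {days:.1f} days, within window: {days <= SCAN_WINDOW_DAYS}")
```

Under these assumptions, a 1 TB database takes roughly 2.4 days and a 2 TB database roughly 4.9 days, so both finish within the seven-day window.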
For more information about configuring database maintenance, see Maintain Mailbox Databases.