
Understanding Database and Log Performance Factors

Topic Last Modified: 2010-07-01

This topic discusses database and log I/O performance factors in Microsoft Exchange Server 2010. Understanding these factors is important to your Mailbox server storage design solution. For additional information about other key aspects of the design process, see Mailbox Server Storage Design.

Contents

Transactional I/O

Understanding IOPS

Non-Transactional I/O

Transactional I/O is generally defined as the I/O generated by user activity. Examples of user activity include receiving, sending, and deleting items; syncing a Windows Mobile client; or logging on via Microsoft Office Outlook Web App.

Transactional I/O is a critical piece of Exchange 2010 storage design because the I/O latency (how long it takes to execute the I/O operation) can directly affect the user experience of online clients such as Microsoft Outlook Online Mode and Outlook Web App. Cached Exchange Mode in Outlook can also be affected by high I/O latency when it's being used for tasks such as delegate access and configuring rules. All clients can be affected by e-mail delivery delays caused by high latency I/O. Transactional I/O can be divided into database volume I/O and log volume I/O.

The transactional I/O requirements in Exchange 2010 have been reduced from those in Exchange Server 2007. Not all I/O that occurs against the Mailbox database and log volumes is considered transactional. For more information, see Understanding the Exchange 2010 Store.


For all versions of Exchange, it's important to understand the amount of database I/O per second (IOPS) consumed by each user because it's one of the key transactional I/O metrics needed for adequately sizing storage. The following sections discuss factors that affect IOPS when designing your Mailbox server role storage.

A 64-bit edition of the Windows Server operating system running the 64-bit version of Exchange 2010 substantially increases the virtual address space and allows Exchange to increase its database cache, reduce database read I/O, and enable up to 100 databases per server.

The database read reduction depends on the amount of database cache available to the server and the user message profile. For guidance about memory and databases, see Understanding the Mailbox Database Cache. Following the guidance in that topic can result in up to a 90 percent transactional I/O reduction over Exchange Server 2003. The amount of database cache per user is a key factor in the actual I/O reduction.

The following table shows the increase in database cache per mailbox when comparing the default 900 megabytes (MB) of total database cache in Exchange 2003 with 6 MB of database cache per mailbox in Exchange 2010, for a user population that uses a 100 messages per day profile. The additional database cache in Exchange 2010 enables more read hits in cache, which reduces database reads at the disk level.

Database cache sizes based on mailbox count

| Mailbox count | Exchange 2003 database cache per mailbox (MB) | Exchange 2010 database cache per mailbox (MB) | Database cache increase over Exchange 2003 |
|---|---|---|---|
| 4000 | 0.225 | 6 | 27 times |
| 2000 | 0.45 | 6 | 13 times |
| 1000 | 0.9 | 6 | 7 times |
| 500 | 1.8 | 6 | 3 times |
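The per-mailbox figures in this table can be reproduced with simple division. The following Python sketch assumes the 900 MB figure is the total Exchange 2003 database cache on a server and that Exchange 2010 is sized at 6 MB per mailbox; the function and constant names are illustrative only.

```python
# Assumptions: 900 MB is the total default Exchange 2003 database cache per
# server, and Exchange 2010 is sized at 6 MB of database cache per mailbox.
EX2003_TOTAL_CACHE_MB = 900
EX2010_CACHE_PER_MAILBOX_MB = 6


def cache_increase(mailbox_count: int) -> tuple[float, float]:
    """Return (Exchange 2003 cache per mailbox in MB, Exchange 2010 increase factor)."""
    ex2003_per_mailbox = EX2003_TOTAL_CACHE_MB / mailbox_count
    factor = EX2010_CACHE_PER_MAILBOX_MB / ex2003_per_mailbox
    return ex2003_per_mailbox, factor


for count in (4000, 2000, 1000, 500):
    per_mailbox, factor = cache_increase(count)
    print(f"{count} mailboxes: {per_mailbox:.3f} MB per mailbox in Exchange 2003, "
          f"about {factor:.1f} times more in Exchange 2010")
```

The fewer mailboxes a server hosts, the smaller the relative gain, because the fixed Exchange 2003 cache was already shared among fewer users.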

The two most significant factors that can be used to predict Exchange 2010 database IOPS are the amount of database cache per user and the number of messages each user sends and receives per day. The following table is based on a standard worker who uses Outlook 2010 in Cached Exchange Mode. The information has been tested to be accurate within plus or minus 20 percent. Other client types and usage scenarios may yield inaccurate results. The predictions are only valid for user database cache sizes between 3 MB and 30 MB. The information hasn't been validated in a scenario where users send and receive over 500 messages per day. The average message size for validation was 75 KB, but message size isn't a primary factor for IOPS.

The table provides estimated values for IOPS per user that you can use to predict your baseline Exchange 2010 IOPS requirements and includes all database I/O (database, content indexing, and NTFS metadata). It doesn't include log volume I/O.

Database cache and estimated IOPS per mailbox based on message activity

| Messages sent/received per mailbox per day | Database cache per mailbox (MB) | Single database copy (stand-alone): estimated IOPS per mailbox | Multiple database copies (mailbox resiliency): estimated IOPS per mailbox |
|---|---|---|---|
| 50 | 3 | 0.060 | 0.050 |
| 100 | 6 | 0.120 | 0.100 |
| 150 | 9 | 0.180 | 0.150 |
| 200 | 12 | 0.240 | 0.200 |
| 250 | 15 | 0.300 | 0.250 |
| 300 | 18 | 0.360 | 0.300 |
| 350 | 21 | 0.420 | 0.350 |
| 400 | 24 | 0.480 | 0.400 |
| 450 | 27 | 0.540 | 0.450 |
| 500 | 30 | 0.600 | 0.500 |
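The rows of this table scale linearly with message count: 0.0012 IOPS per message per day for stand-alone databases and 0.0010 for databases in a mailbox resiliency configuration. The following Python sketch uses that observation to estimate baseline database IOPS for a server. Treating the values between table rows as linear is an assumption of this sketch, not something the table states, and the function name is illustrative.

```python
# Per-message rates derived from the table rows (for example, 0.120 / 100).
IOPS_PER_MESSAGE_STANDALONE = 0.0012
IOPS_PER_MESSAGE_RESILIENT = 0.0010


def estimate_db_iops(messages_per_day: int, mailboxes: int, resilient: bool) -> float:
    """Estimate baseline database IOPS (excluding log volume I/O) for a server."""
    # The table covers profiles of 50 to 500 messages per day only.
    if not 50 <= messages_per_day <= 500:
        raise ValueError("profile outside the validated 50-500 messages/day range")
    rate = IOPS_PER_MESSAGE_RESILIENT if resilient else IOPS_PER_MESSAGE_STANDALONE
    return messages_per_day * rate * mailboxes


# Hypothetical server: 4,000 mailboxes with a 100 messages per day profile.
print(round(estimate_db_iops(100, 4000, resilient=True), 1))
print(round(estimate_db_iops(100, 4000, resilient=False), 1))
```

Because the published values are stated to be accurate within plus or minus 20 percent, the result is best treated as a starting point rather than a precise requirement.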

Mailbox resiliency refers to a unified high availability and site resilience solution in Exchange 2010. For more information, see Understanding High Availability and Site Resilience.

Database volume I/O is I/O associated with database file (.edb) read/write activity, content indexing read/write activity, as well as NTFS metadata read/write activity.

In Exchange 2003, the database read/write ratio is typically 2:1 (about 66 percent reads). With Exchange 2010, the larger database cache decreases the number of reads to the database on disk, so reads shrink as a percentage of total I/O.

If you follow the recommended memory guidelines, you can expect to see the following I/O ratios for active database copies. For more information about the memory guidelines, see Understanding Memory Configurations and Exchange Performance. This measurement includes all database volume I/O (database, content indexing and NTFS metadata); it doesn't include log volume I/O.

Mailbox database I/O read/write ratios

| Messages sent/received per mailbox per day | Stand-alone databases | Databases participating in mailbox resiliency |
|---|---|---|
| 50 | 1:1 | 3:2 |
| 100 | 1:1 | 3:2 |
| 150 | 1:1 | 3:2 |
| 200 | 1:1 | 3:2 |
| 250 | 1:1 | 3:2 |
| 300 | 2:3 | 1:1 |
| 350 | 2:3 | 1:1 |
| 400 | 2:3 | 1:1 |
| 450 | 2:3 | 1:1 |
| 500 | 2:3 | 1:1 |

For example, if you deploy 24,000 mailboxes across Mailbox servers within a database availability group (DAG) that maintains three database copies, each database has a read/write ratio of 3:2 (for message profiles of up to 250 messages per day). In other words, 60 percent of all I/Os to the logical unit number (LUN) hosting the database are read I/Os.

Having more writes as a percentage of total I/O has specific implications when choosing a redundant array of independent disks (RAID) type that has significant costs associated with writes, such as RAID5 or RAID6. For more information about selecting the appropriate RAID solution for your servers, see Understanding Storage Configuration.
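To make the effect of the read/write ratio concrete, the following Python sketch converts a ratio into a read percentage and then estimates disk-level IOPS for a RAID type. The write penalty values (2 for RAID10, 4 for RAID5, 6 for RAID6) are commonly cited rules of thumb and are an assumption of this sketch; they don't come from this topic.

```python
# Assumption (not from this topic): commonly cited number of back-end disk
# I/Os generated by one host write for each RAID type.
ASSUMED_WRITE_PENALTY = {"RAID10": 2, "RAID5": 4, "RAID6": 6}


def read_fraction(reads: int, writes: int) -> float:
    """Convert a read:write ratio such as 3:2 into the fraction of I/Os that are reads."""
    return reads / (reads + writes)


def backend_iops(host_iops: float, reads: int, writes: int, raid: str) -> float:
    """Estimate disk-level IOPS for a given host IOPS and read:write ratio."""
    r = read_fraction(reads, writes)
    return host_iops * r + host_iops * (1 - r) * ASSUMED_WRITE_PENALTY[raid]


print(read_fraction(3, 2))  # 0.6, the 60 percent reads in the example above
# Same hypothetical 1,000 host IOPS on RAID5, with two different ratios:
print(round(backend_iops(1000, 3, 2, "RAID5")))
print(round(backend_iops(1000, 2, 3, "RAID5")))
```

Under these assumptions, moving from a 3:2 ratio to a 2:3 ratio raises the disk-level load on RAID5 even though the host IOPS is unchanged, which is why write-heavy profiles deserve attention when you choose a RAID type.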

Calculating IOPS per Mailbox server in Exchange 2010 requires more steps than in previous versions of Exchange because of the following:

  • You can now combine databases and logs on the same volume.
  • You can host both active and passive database copies on the same server.
  • Sequential I/O background tasks (for example, background database maintenance) have been added.

Pure sequential I/O operations aren't factored in the IOPS per Mailbox server calculation because storage subsystems can handle sequential I/O much more efficiently than random I/O. These operations include background database maintenance, log transactional I/O, and log replication I/O.

IOPS per Mailbox server is calculated slightly differently depending on how your storage is designed:

  • Database files and log files share a single volume.
  • Database files are stored on different disk volumes than the transaction log files.

For both storage designs, use Performance Monitor (perfmon.exe) to measure the peak two-hour period (at a 5-second sampling interval). This is the time of day when the system is under the most load generated by client activity (for example, 10 A.M. to 12 P.M.). The load during this period is often twice the 10-hour daily average (peak-to-average ratio of 2:1).
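One way to locate the peak period in a collected log is a sliding-window average over equally spaced samples. The following Python sketch assumes the counter values have already been exported to a list; the sample values and the one-value-per-hour spacing are hypothetical and chosen only to keep the example short. With a 5-second sampling interval, a two-hour window would be 1,440 samples.

```python
def peak_window_average(samples: list[float], window: int) -> tuple[int, float]:
    """Return (start index, average) of the busiest run of `window` consecutive samples."""
    if window <= 0 or window > len(samples):
        raise ValueError("window must be between 1 and the number of samples")
    current = sum(samples[:window])
    best_sum, best_start = current, 0
    for i in range(window, len(samples)):
        # Slide the window one sample to the right.
        current += samples[i] - samples[i - window]
        if current > best_sum:
            best_sum, best_start = current, i - window + 1
    return best_start, best_sum / window


# Hypothetical hourly averages of disk transfers/sec over a 10-hour day.
hourly = [40, 55, 120, 130, 60, 50, 45, 40, 35, 25]
start, peak = peak_window_average(hourly, window=2)
daily_average = sum(hourly) / len(hourly)
print(f"peak window starts at hour index {start}, "
      f"peak-to-average ratio about {peak / daily_average:.1f}:1")
```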

In this configuration, the database files and log files are stored on the same disk volume. This example assumes each database is on a different volume backed by a dedicated disk. Fill in the following table for all databases using data from the collected performance monitor log (described in the previous section).

| Database Name | Logical Disk -> Disk Reads/sec | Logical Disk -> Disk Writes/sec | MSExchange Database ==> Instances -> Database Maintenance IO Reads/sec | MSExchange Database ==> Instances -> I/O Database Reads (Recovery)/sec | MSExchange Database ==> Instances -> I/O Database Writes (Recovery)/sec | MSExchange Database ==> Instances -> IO Log Writes/sec |
|---|---|---|---|---|---|---|
| Database 1 | | | | | | |
| Database 2 | | | | | | |
| Database 3 | | | | | | |
| Database 4 | | | | | | |
| Additional databases as required | | | | | | |
| Total | | | | | | |

Add the totals from each column, and then perform the following calculation to determine IOPS per Mailbox server.

Calculation summary   (Sum of Logical Disk I/O - (sum of database maintenance I/O + recovery (log replay) I/O + log I/O)) divided by the number of mailboxes hosted per server during the performance monitor log measurement.

Calculation Detail   ((Logical Disk -> Disk Reads/sec + Logical Disk -> Disk Writes/sec) - (MSExchange Database ==> Instances -> Database Maintenance IO Reads/sec + MSExchange Database ==> Instances -> I/O Database Reads (Recovery)/sec + MSExchange Database ==> Instances -> I/O Database Writes (Recovery)/sec + MSExchange Database ==> Instances -> IO Log Writes/sec)) / Number of mailboxes hosted per server during the performance monitor log measurement = IOPS per Mailbox server.

In this configuration, the database files are stored on different disk volumes than the transaction log files. This example assumes each database is on a different volume backed by a dedicated disk. Fill in the following table for all databases using data from the collected performance monitor log (described in the previous section).

| Database Name | Logical Disk -> Disk Reads/sec | Logical Disk -> Disk Writes/sec | MSExchange Database ==> Instances -> Database Maintenance IO Reads/sec | MSExchange Database ==> Instances -> I/O Database Reads (Recovery)/sec | MSExchange Database ==> Instances -> I/O Database Writes (Recovery)/sec |
|---|---|---|---|---|---|
| Database 1 | | | | | |
| Database 2 | | | | | |
| Database 3 | | | | | |
| Database 4 | | | | | |
| Additional databases as required | | | | | |
| Total | | | | | |

Add the totals from each column, and then perform the following calculation to determine IOPS per Mailbox server.

Calculation summary   (Sum of Logical Disk I/O - (sum of database maintenance I/O + recovery (log replay) I/O)) divided by the number of mailboxes hosted per server during the performance monitor log measurement.

Calculation Detail   ((Logical Disk -> Disk Reads/sec + Logical Disk -> Disk Writes/sec) - (MSExchange Database ==> Instances -> Database Maintenance IO Reads/sec + MSExchange Database ==> Instances -> I/O Database Reads (Recovery)/sec + MSExchange Database ==> Instances -> I/O Database Writes (Recovery)/sec)) / Number of mailboxes hosted per server during the performance monitor log measurement = IOPS per Mailbox server.
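Both calculations can be expressed as one small routine that differs only in whether log writes are subtracted. The following Python sketch assumes the per-database counter averages for the peak period have already been read from the performance monitor log; the class, field names, and sample numbers are hypothetical. Because the total is divided by the mailbox count, the result is an IOPS figure per hosted mailbox on that server.

```python
from dataclasses import dataclass


@dataclass
class DatabaseCounters:
    disk_reads: float         # Logical Disk -> Disk Reads/sec
    disk_writes: float        # Logical Disk -> Disk Writes/sec
    maintenance_reads: float  # Database Maintenance IO Reads/sec
    recovery_reads: float     # I/O Database Reads (Recovery)/sec
    recovery_writes: float    # I/O Database Writes (Recovery)/sec
    log_writes: float = 0.0   # IO Log Writes/sec (used only when logs share the volume)


def transactional_iops(databases: list[DatabaseCounters],
                       mailbox_count: int,
                       shared_volume: bool) -> float:
    """Apply the calculation above: logical disk I/O minus sequential I/O, per mailbox."""
    total = 0.0
    for db in databases:
        logical = db.disk_reads + db.disk_writes
        excluded = db.maintenance_reads + db.recovery_reads + db.recovery_writes
        if shared_volume:
            # Log writes hit the same volume, so remove them from the logical disk total.
            excluded += db.log_writes
        total += logical - excluded
    return total / mailbox_count


# Hypothetical measurements for two databases hosting 1,000 mailboxes in total.
dbs = [DatabaseCounters(60, 50, 10, 0, 0, 20), DatabaseCounters(60, 50, 10, 0, 0, 20)]
print(round(transactional_iops(dbs, 1000, shared_volume=True), 3))
print(round(transactional_iops(dbs, 1000, shared_volume=False), 3))
```

In practice you would call the function with `shared_volume=False` only for counters collected from volumes that hold database files alone, since in that design the logical disk counters never included log writes.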

If you're using a previous version of Exchange and have calculated your baseline IOPS, keep in mind that Exchange 2010 will affect your baseline in the following ways:

  • The number of users on the server will affect the overall database cache per user.
  • The amount of RAM influences how large your database cache can grow, and a larger database cache results in more cache read hits, thereby reducing your database read I/O.

The key is that knowing your IOPS on a specific server isn't enough information to plan an entire enterprise because the amount of RAM, number of users, and number of databases will be different on each server. After you have your actual IOPS numbers, always apply a 20 percent I/O overhead factor to your calculations to add some reserve capacity. You don't want a poor user experience because activity is heavier than normal.
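As a worked example of the 20 percent overhead factor, the following Python sketch adds the reserve to a measured baseline. The mailbox count and per-mailbox IOPS value are hypothetical.

```python
OVERHEAD_FACTOR = 0.20  # 20 percent reserve recommended in the text above


def iops_with_reserve(measured_iops: float, overhead: float = OVERHEAD_FACTOR) -> float:
    """Return the IOPS target after adding reserve capacity to a measured baseline."""
    return measured_iops * (1 + overhead)


# Hypothetical server: 4,000 mailboxes measured at 0.12 IOPS per mailbox.
baseline = 4000 * 0.12
print(f"baseline about {baseline:.0f} IOPS, "
      f"design target about {iops_with_reserve(baseline):.0f} IOPS")
```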

Unlike Cached Exchange Mode clients, all Online Mode client operations occur against the database. Due to the changes in the store schema and Extensible Storage Engine (ESE), Outlook Online Mode clients now generate the same I/O profile as Outlook Cached Exchange Mode clients.

In terms of mailbox search capabilities, end users have two options:

  • They can use the built-in content index that's available on the Mailbox server.
  • They can install a desktop search engine client, which builds a local index of the mailbox data on the client computer and performs searches locally.

End users who use desktop search engine clients with Outlook Online Mode may incur additional read I/O operations against the database. Currently, the only known desktop search engine that doesn't incur additional read I/Os is Windows Desktop Search 4.0, which indexes the mailbox contents by using synchronization protocols similar to those used by Outlook in Cached Exchange Mode.

Therefore, use the following guidelines if you intend to deploy Outlook Online Mode clients with desktop search engines other than Windows Desktop Search 4.0:

  • Online Mode clients with 256 MB mailboxes increase database read operations by a factor of 1.5 when compared with Cached Exchange Mode clients. Below 256 MB, the impact is negligible.
  • As mailbox size doubles, the database read IOPS also doubles (assuming the item distribution between key folders remains the same).
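The two guidelines above can be combined into a rough multiplier for database reads. The following Python sketch is one possible reading of the guidance: it assumes the factor is 1.0 below 256 MB, 1.5 at 256 MB, and proportional to mailbox size beyond that. The function name and the proportional model are assumptions of this sketch, not a published formula.

```python
BASELINE_MAILBOX_MB = 256   # mailbox size at which the 1.5 factor applies
BASELINE_READ_FACTOR = 1.5  # read increase relative to Cached Exchange Mode


def online_mode_read_factor(mailbox_mb: float) -> float:
    """Estimate the database read multiplier relative to Cached Exchange Mode."""
    if mailbox_mb < BASELINE_MAILBOX_MB:
        return 1.0  # the text describes the impact as negligible below this size
    # Assumed model: reads scale in proportion to mailbox size (doubling doubles reads).
    return BASELINE_READ_FACTOR * (mailbox_mb / BASELINE_MAILBOX_MB)


for size_mb in (128, 256, 512, 1024):
    print(f"{size_mb} MB mailbox: read factor {online_mode_read_factor(size_mb)}")
```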

As a result of this data, we have two recommendations:

  • Deploy Cached Exchange Mode clients where appropriate. For more information, see the "Item Count per Folder" section later in this topic. Otherwise, replace the desktop search engine with Windows Desktop Search 4.0.
  • Consider the I/O requirements when you're designing the database storage.

For additional IOPS factors, such as third-party clients, see Optimizing Storage for Exchange Server 2003.

Log volume I/O is I/O associated with database logging read/write activity and NTFS metadata read/write activity. Log volume I/O is sequential in nature and, when using a battery-backed write caching array controller, the I/O overhead of log volume I/O is minimal and not a significant factor for Exchange storage sizing.

In Exchange 2003, a transaction log for a storage group requires approximately 10 percent as many I/Os as the databases in the storage group. For example, if the database LUN is using 1,000 I/Os, the log LUN would use approximately 100 I/Os.

With the reduction in database reads in Exchange 2010, combined with the smaller log file size and the ability to have more databases, the log-to-database write ratio is approximately 40 percent for stand-alone databases and 50 percent for databases participating in mailbox resiliency. For example, if a database that's participating in mailbox resiliency is consuming 12 write I/Os, the log LUN will consume approximately 6 write I/Os.

On Mailbox servers that are hosting databases that are participating in mailbox resiliency, there is overhead associated with using continuous replication. Closed transaction logs must be read and sent to the target database copies. This overhead is an additional 10 percent in log reads for each active database copy that's hosted on the Mailbox server. For example, if the Mailbox server is hosting 10 active database copies, and each transaction log stream is generating 6 write I/Os, you can expect an additional 0.6 read I/Os for each of those 10 active database copies (or a total of 6 read I/Os).

After measuring or predicting the transactional log I/O, apply a 20 percent I/O overhead factor to ensure adequate room for busier than normal periods.
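The log I/O percentages above can be chained into a single estimate. The following Python sketch reproduces the example in the text (12 database write I/Os per database, 10 active copies in a mailbox resiliency configuration) and then applies the 20 percent overhead factor. The function name and the returned dictionary keys are illustrative.

```python
LOG_TO_DB_WRITE_RATIO = {"stand-alone": 0.40, "resilient": 0.50}
REPLICATION_READ_RATIO = 0.10  # extra log reads per active copy with continuous replication
OVERHEAD_FACTOR = 0.20         # reserve for busier than normal periods


def log_iops(db_write_iops: float, active_copies: int, resilient: bool) -> dict[str, float]:
    """Estimate log volume I/O from database write I/O per database."""
    key = "resilient" if resilient else "stand-alone"
    writes_per_copy = db_write_iops * LOG_TO_DB_WRITE_RATIO[key]
    # Closed logs are read and shipped to other copies only in a resilient design.
    reads_per_copy = writes_per_copy * REPLICATION_READ_RATIO if resilient else 0.0
    total = (writes_per_copy + reads_per_copy) * active_copies
    return {
        "log_writes_per_copy": writes_per_copy,
        "log_reads_per_copy": reads_per_copy,
        "total_with_overhead": total * (1 + OVERHEAD_FACTOR),
    }


estimate = log_iops(db_write_iops=12, active_copies=10, resilient=True)
for name, value in estimate.items():
    print(f"{name}: {value:.1f}")
```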

One way to reduce server I/O is to use Outlook in Cached Exchange Mode. The initial mailbox synchronization is a disk intensive operation, but over time, as the mailbox size grows, the disk subsystem burden is shifted from the Exchange server to the Outlook client. With use of Cached Exchange Mode, having a large number of items in a user's Inbox or a user searching a mailbox will have little effect on the server. This approach also means that Cached Exchange Mode users with large mailboxes may need faster computers than those with small mailboxes (depending on the individual user threshold for acceptable performance).

When you deploy client computers that are running Outlook 2007 in Cached Exchange Mode, consider the following guidelines with respect to mailbox/.ost file sizes:

  • Up to 5 gigabytes (GB)   This size should provide a good user experience on most hardware.
  • Between 5 GB and 10 GB   This size is typically hardware dependent. If you have a fast hard disk and a lot of RAM, your experience will be better. However, slower hard drives, such as those typically found in laptops or early-generation solid-state drives (SSDs), may cause some application pauses while the drives respond.
  • More than 10 GB   This is the size at which short pauses begin to occur on most hardware.
  • Very large, such as 25 GB or larger   This size increases the frequency of the short pauses, especially while you're downloading new e-mail messages. Alternatively, you can use Send/Receive groups to manually synchronize your mail.

This guidance is based on the installation of a cumulative update for Outlook 2007 Service Pack 1 or later, as described in Microsoft Knowledge Base Article 961752, Description of the Outlook 2007 hotfix package (Outlook.msp): February 24, 2009.

If you experience performance-related issues in an Outlook 2007 Cached Exchange Mode deployment, see Knowledge Base Article 940226, How to troubleshoot performance issues in Outlook 2007. For more information about the improvements that are available, see Knowledge Base Article 968009, Outlook 2007 improvements in the February 2009 cumulative update.

Both Outlook Web App and Outlook in Online Mode store indexes on and search against the server's copy of the data. For moderately sized mailboxes, this results in approximately double the IOPS per mailbox of a comparably sized Cached Exchange Mode client. The IOPS per mailbox for large mailboxes is even higher. The first time you sort a view in a new way, an index is created, causing many read I/Os to the database LUN. Subsequent sorts on an active index are inexpensive.

A challenging scenario occurs when a user has exceeded the number of indexes that Exchange will store, which is 11 indexes in Exchange 2010. When the user chooses to sort a new way, thereby creating a twelfth index, it causes additional disk I/O. Because the index isn't stored, this disk I/O cost occurs every time that sort is done. Because of the high I/O that can be generated in this scenario, we strongly recommend storing no more than 100,000 items in core folders, such as the Inbox and Sent Items folders. Creating more top-level folders, or subfolders underneath the Inbox and Sent Items folders, greatly reduces the costs associated with this index creation, provided the number of items in any one folder doesn't exceed 100,000.

In Exchange 2010, messages are indexed as they're received, causing little database disk I/O overhead (because the message is still in the database cache when it's retrieved for indexing). However, write I/O is associated with updating the search catalog store. Due to the overall database I/O reductions in Exchange 2010, the percentage of search catalog I/O is now 10 percent to 15 percent of the database files I/O (depending upon profile). Search catalog read I/O occurs when clients issue search queries, and it's a rare enough occurrence to not be relevant to Exchange 2010 storage design.


Transactional I/O occurs in response to direct user action and usually has the highest priority, and therefore, it's the focus for storage design. Non-transactional I/O either occurs in the background and is tuned to have a minimal performance impact, or it occurs during a defined maintenance window.

The following sections discuss some of the non-transactional I/O that occurs in the background. Although non-transactional I/O isn't the focus of storage design, it can impact your storage design. For more information, see New Exchange Core Store Functionality.

Background database maintenance I/O is sequential database file I/O associated with checksumming both active and passive database copies. Background database maintenance has the following characteristics:

  • On active databases, it can be configured to run either 24 × 7 or during the online maintenance window. Background database maintenance (Checksum) runs against passive database copies 24 × 7. For more information, see "Online Database Scanning" in the New Exchange Core Store Functionality topic.
  • Reads approximately 5 MB per second for each actively scanning database (both active and passive copies). The I/O is 100 percent sequential, so the storage subsystem can process the I/Os efficiently.
  • Stops scanning the database if the checksum pass completes in less than 24 hours.
  • Issues a warning event if the scan doesn't complete within three days (not configurable).
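The 5 MB per second read rate makes it possible to estimate how long one checksum pass takes for a given database size. The following Python sketch assumes the rate is sustained for the whole pass and that 1 GB equals 1,024 MB; the database sizes are hypothetical.

```python
SCAN_RATE_MB_PER_SEC = 5    # approximate read rate per scanning database
WARNING_THRESHOLD_DAYS = 3  # a warning event is issued if a pass takes longer
SECONDS_PER_DAY = 86_400


def scan_days(database_gb: float) -> float:
    """Estimate the days needed for one background maintenance pass."""
    seconds = database_gb * 1024 / SCAN_RATE_MB_PER_SEC
    return seconds / SECONDS_PER_DAY


for size_gb in (500, 1024, 2048):
    days = scan_days(size_gb)
    if days > WARNING_THRESHOLD_DAYS:
        note = "exceeds the three-day warning threshold"
    else:
        note = "within three days"
    print(f"{size_gb} GB: about {days:.1f} days ({note})")
```

Under these assumptions, a database of roughly 1.2 TB or more would not complete a pass within three days, which is worth keeping in mind when you choose a maximum database size.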

Messaging records management (MRM) is the records management technology in Exchange 2010 that helps organizations reduce the legal risks associated with e-mail. MRM makes it easier to retain the messages that are needed to comply with company policy, government regulations, or legal needs, and to remove content that has no legal or business value.

These actions are accomplished through the use of retention policies or managed folders. The Managed Folder Assistant is a Microsoft Exchange Mailbox Assistant that applies message retention settings configured in retention policies or managed folder mailbox policies. The disk I/O required by the assistant depends on the number of mailbox items processed. We recommend that the assistant not run at the same time as either backup or online maintenance. For more information, see Schedule the Managed Folder Assistant.

You can use the Exchange Management Tools to set the maintenance schedule for a database or to allow 24 × 7 database maintenance. In Exchange 2010, online defragmentation no longer works as it did in previous versions of Exchange: it's performed continuously while the database is being read from and written to. For more information, see "Online Database Scanning" in the New Exchange Core Store Functionality topic.


© 2014 Microsoft