Understanding Database and Log Performance Factors

[This topic is in progress.]

Topic last modified: 2009-12-02

This topic discusses database and log I/O performance operations in Microsoft Exchange Server 2010.

Transactional I/O

Transactional I/O is generally defined as the I/O generated by user activity. Examples of user activity include receiving, sending, and deleting items; syncing a Windows Mobile client; or logging on via Outlook Web App. Transactional I/O is a critical piece of Exchange 2010 storage design because the I/O latency (how long it takes to execute the I/O operation) can directly impact the user experience of online clients such as Outlook in Online Mode and Outlook Web App. Outlook in Cached Exchange Mode can also be affected by high I/O latency when it's being used for tasks such as delegate access and configuring rules. All clients can be affected by e-mail delivery delays caused by high-latency I/O. Transactional I/O can be divided into database volume I/O and log volume I/O.

The transactional I/O requirements in Exchange 2010 have been reduced since Exchange Server 2007. Not all I/O that occurs against the Mailbox database and log volumes is considered transactional. For more information, see Understanding the Exchange 2010 Store.

Understanding IOPS

For all versions of Exchange, one of the key metrics needed for sizing storage is the amount of database I/O per second (IOPS) consumed by each user.

Database Cache

A 64-bit Windows Server operating system running the 64-bit version of Exchange 2010 substantially increases the virtual address space, and allows Exchange to increase its database cache, reduce database read I/O, and enable up to 100 databases per server.

The database read reduction depends on the amount of database cache available to the server and the user message profile. For guidance on memory and databases, see Understanding the Mailbox Database Cache. Following the guidance in that topic can result in up to a 90 percent transactional I/O reduction over Exchange 2003. The amount of database cache per user is a key factor in the actual I/O reduction.

The following table demonstrates the increase in actual database cache per mailbox when comparing the default 900 MB in Exchange 2003 versus 6 MB of database cache per mailbox in Exchange 2010 for a user population that leverages a 100 messages / day profile. It is this additional database cache that enables more read hits in cache, thus reducing database reads at the disk level.

Database cache sizes based on mailbox count

Mailbox count | Exchange 2003 database cache per mailbox (MB) | Exchange 2010 database cache per mailbox (MB) | Database cache increase over Exchange 2003
4000 | 0.225 | 6 | 27 times
2000 | 0.45 | 6 | 13 times
1000 | 0.9 | 6 | 7 times
500 | 1.8 | 6 | 3 times
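The per-mailbox figures come from dividing the fixed Exchange 2003 cache by the mailbox count and comparing the result with 6 MB. The following is a minimal sketch of that arithmetic; the function names are hypothetical, and it assumes the 900 MB cache is shared evenly across all mailboxes.

```python
# Illustrative sketch: function names are hypothetical; the constants are the
# 900 MB and 6 MB figures quoted in the text above.
EXCHANGE_2003_CACHE_MB = 900             # default Exchange 2003 database cache
EXCHANGE_2010_CACHE_PER_MAILBOX_MB = 6   # 100 messages/day profile


def cache_per_mailbox_2003(mailbox_count: int) -> float:
    """Exchange 2003 cache per mailbox, assuming an even split of 900 MB."""
    return EXCHANGE_2003_CACHE_MB / mailbox_count


def cache_increase(mailbox_count: int) -> float:
    """How many times larger the Exchange 2010 per-mailbox cache is."""
    return EXCHANGE_2010_CACHE_PER_MAILBOX_MB / cache_per_mailbox_2003(mailbox_count)


if __name__ == "__main__":
    for count in (4000, 2000, 1000, 500):
        # Print mailbox count, Exchange 2003 MB per mailbox, rounded increase.
        print(count, cache_per_mailbox_2003(count), round(cache_increase(count)))
```

The increase is rounded to the nearest whole number, so 6 / 0.225 (about 26.7) is reported as 27 times.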

Determining the Exchange 2010 Mailbox IOPS Profile

The two largest factors that can be used to predict Exchange 2010 database IOPS are the amount of database cache per user and the number of messages each user sends and receives per day. The following formula is based on a standard worker who uses Office Outlook 2010 in Cached Exchange Mode, and it has been tested to be accurate within plus or minus 20 percent. Other client types and usage scenarios may yield inaccurate results. The predictions are only valid for user database cache sizes between 3 MB and 30 MB. The formula has not been validated with users sending and receiving over 500 messages per day. The average message size for formula validation was 75 KB, but message size is not a primary factor for IOPS. The following table provides estimated values for IOPS per user that you can use to predict your baseline Exchange 2010 IOPS requirements and includes all database I/O (database, content indexing and NTFS metadata). It does not include log volume I/O.

Database cache and estimated IOPS per user based on user profile and message activity

Messages sent/received per mailbox per day (~75 KB average message size) | Database cache per mailbox (MB) | Single database copy (standalone): estimated IOPS per mailbox | Multiple database copies (mailbox resiliency): estimated IOPS per mailbox
50 | 3 | 0.060 | 0.050
100 | 6 | 0.120 | 0.100
150 | 9 | 0.180 | 0.150
200 | 12 | 0.240 | 0.200
250 | 15 | 0.300 | 0.250
300 | 18 | 0.360 | 0.300
350 | 21 | 0.420 | 0.350
400 | 24 | 0.480 | 0.400
450 | 27 | 0.540 | 0.450
500 | 30 | 0.600 | 0.500
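The table rows scale linearly with message count: for example, 100 messages per day corresponds to 6 MB of cache and 0.120 (standalone) or 0.100 (mailbox resiliency) IOPS per mailbox. The sketch below encodes that pattern. The per-message constants are inferred from the table rather than taken from a published formula, and the function names are hypothetical.

```python
# Illustrative sketch; constants are inferred from the table rows
# (100 messages/day -> 6 MB cache, 0.120 / 0.100 IOPS per mailbox).
from typing import Tuple

CACHE_MB_PER_MESSAGE = 0.06
IOPS_PER_MESSAGE_STANDALONE = 0.0012
IOPS_PER_MESSAGE_RESILIENT = 0.0010


def estimate_mailbox_profile(messages_per_day: int,
                             mailbox_resiliency: bool) -> Tuple[float, float]:
    """Return (database cache in MB, estimated database IOPS) per mailbox."""
    if not 50 <= messages_per_day <= 500:
        raise ValueError("the table only covers 50 to 500 messages per day")
    per_message = (IOPS_PER_MESSAGE_RESILIENT if mailbox_resiliency
                   else IOPS_PER_MESSAGE_STANDALONE)
    cache_mb = round(messages_per_day * CACHE_MB_PER_MESSAGE, 2)
    iops = round(messages_per_day * per_message, 3)
    return cache_mb, iops


def estimate_server_database_iops(mailboxes: int, messages_per_day: int,
                                  mailbox_resiliency: bool) -> float:
    """Baseline database IOPS for a mailbox population (no log volume I/O)."""
    _, iops = estimate_mailbox_profile(messages_per_day, mailbox_resiliency)
    return round(mailboxes * iops, 1)


if __name__ == "__main__":
    print(estimate_mailbox_profile(100, mailbox_resiliency=False))
    print(estimate_server_database_iops(4000, 100, mailbox_resiliency=False))
```

Because the estimate is stated to be accurate only within plus or minus 20 percent, treat the result as a baseline rather than an exact requirement.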

Database Volume I/O

Database volume I/O is I/O associated with database file (.edb) read/write activity, content indexing read/write activity, as well as NTFS metadata read/write activity.

In Exchange 2003, the database read/write ratio is typically 2:1, or 66 percent reads. With Exchange 2010, the larger database cache decreases the number of reads to the database on disk, so reads shrink as a percentage of total I/O.

If you follow the recommended memory guidelines, you can expect to see the following I/O ratios for active database copies. This measurement includes all database volume I/O (database, content indexing, and NTFS metadata); it does not include log volume I/O:

Mailbox database I/O read/write ratios

Messages sent/received per mailbox per day | Standalone databases | Databases participating in mailbox resiliency
50 | 1:1 | 3:2
100 | 1:1 | 3:2
150 | 1:1 | 3:2
200 | 1:1 | 3:2
250 | 1:1 | 3:2
300 | 2:3 | 1:1
350 | 2:3 | 1:1
400 | 2:3 | 1:1
450 | 2:3 | 1:1
500 | 2:3 | 1:1

For example, if you deploy 24,000 mailboxes across Mailbox servers within a database availability group that maintains three database copies, each database has a read-to-write ratio of 3:2 (for the profiles of up to 250 messages per day in the table above). In other words, 60 percent of all I/Os to the LUN hosting the database are reads.
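Converting a read:write ratio into a percentage, or splitting a database IOPS estimate into reads and writes, is straightforward arithmetic. A minimal sketch, with hypothetical function names:

```python
# Illustrative helpers for the read:write ratios in the table above.
from typing import Tuple


def read_percentage(reads: int, writes: int) -> float:
    """Percentage of database volume I/O that is reads for a reads:writes ratio."""
    return 100.0 * reads / (reads + writes)


def split_iops(total_iops: float, reads: int, writes: int) -> Tuple[float, float]:
    """Split a total database IOPS figure into (read IOPS, write IOPS)."""
    read_iops = total_iops * reads / (reads + writes)
    return read_iops, total_iops - read_iops


if __name__ == "__main__":
    print(read_percentage(3, 2))      # 3:2 ratio
    print(split_iops(100.0, 3, 2))    # 100 database IOPS at a 3:2 ratio
```

A 3:2 ratio yields 60 percent reads, a 1:1 ratio yields 50 percent, and a 2:3 ratio yields 40 percent.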

Having more writes as a percentage of total I/O has particular implications when choosing a redundant array of independent disks (RAID) type that has significant costs associated with writes, such as RAID5 or RAID6. For more information about selecting the appropriate RAID solution for your servers, see Understanding Storage Configuration.

Effect of Desktop Search Engines with Outlook Online Mode Clients

Unlike Cached Exchange Mode clients, all Online Mode client operations occur against the database. Due to the changes in the store schema and Extensible Storage Engine, Outlook Online Mode clients now generate the same I/O profile as Outlook Cached Mode Clients.

In terms of mailbox search capabilities, end users have two options:

  • They can leverage the built-in content index that is available on the mailbox server.
  • They can install a desktop search engine client and have a local index generated on the client of the mailbox’s data and perform local searches.

End users that leverage desktop search engine clients with Outlook Online Mode have the potential to incur additional read I/O operations against the database. At the time of this writing, the only known desktop search engine that does not incur additional read IOs is Windows Desktop Search 4.0 or later. Windows Desktop Search 4.0 or later utilizes synchronization protocols that are similar to how Outlook’s cached mode synchronization protocols index the mailbox contents.

Therefore, the following guidelines have been established if you intend to deploy Outlook Online Mode clients with desktop search engines other than Windows Desktop Search 4.0 or later:

  • Online Mode clients with 256 MB mailboxes will increase database read operations by a factor of 1.5 when compared with Cached Exchange Mode clients. Below 256 MB, the impact is negligible.
  • As mailbox size doubles, the database read IOPS will also double (assuming the item distribution between key folders remains the same).
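One way to read these two guidelines together is as a read multiplier that starts at 1.5 for a 256 MB mailbox and doubles each time the mailbox size doubles. The sketch below models that reading. The linear scaling between doublings and the function name are assumptions made for illustration, not part of the published guidance.

```python
# Illustrative model only: the 256 MB threshold and 1.5 factor come from the
# guidelines above; linear scaling between doublings is an assumption.
BASELINE_MAILBOX_MB = 256
BASELINE_READ_FACTOR = 1.5


def online_mode_read_factor(mailbox_size_mb: float) -> float:
    """Read IOPS multiplier relative to a Cached Exchange Mode client."""
    if mailbox_size_mb < BASELINE_MAILBOX_MB:
        return 1.0  # impact is described as negligible below 256 MB
    return BASELINE_READ_FACTOR * (mailbox_size_mb / BASELINE_MAILBOX_MB)


if __name__ == "__main__":
    for size_mb in (128, 256, 512, 1024):
        print(size_mb, online_mode_read_factor(size_mb))
```

Under this model a 512 MB mailbox carries a factor of 3.0 and a 1024 MB mailbox a factor of 6.0, which shows how quickly read I/O grows for large Online Mode mailboxes indexed by a third-party desktop search engine.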

As a result of this data, two recommendations can be made:

  1. Deploy cached mode clients where appropriate. See the "Item Count per Folder" section below for more information. Otherwise replace the desktop search engine with Windows Desktop Search 4.0 or later.
  2. Ensure that the I/O requirements are taken into consideration when designing the database storage.

For additional IOPS factors, such as third-party clients, see Optimizing Storage for Exchange Server 2003.

Log Volume I/O

Log volume I/O is I/O associated with database logging read/write activity and NTFS metadata read/write activity. Log volume I/O is sequential in nature, and when you use a battery-backed write-caching array controller, its overhead is minimal and not a significant factor for Exchange storage sizing.

In Exchange 2003, the transaction log for a storage group requires approximately 10 percent as many I/Os as the databases in the storage group. For example, if the database LUN is using 1,000 I/Os, the log LUN would use approximately 100 I/Os.

With the reduction in database reads in Exchange 2010, combined with the smaller log file size and the ability to have more databases, the log-to-database write ratio is 40 percent for standalone databases and 50 percent for databases participating in mailbox resiliency. For example, if a database that is participating in mailbox resiliency is consuming 12 write I/Os, the log LUN will consume approximately 6 write I/Os.
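The log write estimate is a fixed fraction of the database write I/O. A minimal sketch, with hypothetical names for the function and the deployment labels:

```python
# Illustrative sketch: ratios are the 40% / 50% figures from the text above;
# the dictionary keys and function name are hypothetical.
LOG_TO_DB_WRITE_RATIO = {
    "standalone": 0.40,
    "mailbox_resiliency": 0.50,
}


def estimate_log_write_ios(database_write_ios: float, deployment: str) -> float:
    """Estimate log volume write I/Os from database write I/Os."""
    return database_write_ios * LOG_TO_DB_WRITE_RATIO[deployment]


if __name__ == "__main__":
    # 12 database write I/Os on a database participating in mailbox resiliency.
    print(estimate_log_write_ios(12, "mailbox_resiliency"))
```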

On Mailbox servers that are hosting databases that are participating in mailbox resiliency, there is overhead associated with using continuous replication: closed transaction logs must be read and sent to the target database copies. This overhead is an additional 10 percent in log reads for each active database copy that is hosted on the Mailbox server. For example, if the Mailbox server is hosting 10 active database copies, and each transaction log stream is generating 6 write I/Os, you can expect an additional 0.6 read I/Os for each of those 10 active database copies (or a total of 6 read I/Os).
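The replication overhead can be sketched as 10 percent of each active copy's log write I/O, summed over the active copies on the server. The function name is hypothetical, and the sketch assumes every active copy generates the same log write rate.

```python
# Illustrative sketch of the continuous replication read overhead.
REPLICATION_READ_OVERHEAD = 0.10  # 10% additional log reads per active copy


def replication_log_read_ios(active_copies: int,
                             log_write_ios_per_copy: float) -> float:
    """Total additional log read I/Os on a server hosting active copies."""
    per_copy_reads = log_write_ios_per_copy * REPLICATION_READ_OVERHEAD
    return active_copies * per_copy_reads


if __name__ == "__main__":
    # 10 active copies, each log stream generating 6 write I/Os.
    print(round(replication_log_read_ios(10, 6), 2))
```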

After measuring or predicting the transactional log I/O, apply a 20 percent I/O overhead factor to ensure adequate room for busier than normal periods.
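Applying the overhead factor is a single multiplication. The sketch below uses a hypothetical function name and, as an example input, a server with 60 log write I/Os and 6 replication read I/Os.

```python
# Illustrative sketch: adds the 20% headroom recommended above.
PEAK_OVERHEAD_FACTOR = 0.20


def with_peak_headroom(log_ios: float) -> float:
    """Measured or predicted log I/O plus 20 percent for busy periods."""
    return log_ios * (1 + PEAK_OVERHEAD_FACTOR)


if __name__ == "__main__":
    predicted_log_ios = 60 + 6  # example: log writes plus replication reads
    print(round(with_peak_headroom(predicted_log_ios), 1))
```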

Item Count per Folder

One way to reduce server I/O is to use Outlook in Cached Exchange Mode. The initial mailbox synchronization is an expensive operation, but over time, as the mailbox size grows, the disk subsystem burden is shifted from the Exchange server to the Outlook client. This means that having a large number of items in a user's Inbox, or a user searching a mailbox, will have little effect on the server. This also means that Cached Exchange Mode users with large mailboxes may need faster computers than those with small mailboxes (depending on the individual user threshold for acceptable performance). When you deploy client computers that are running Outlook 2007 in Cached Exchange Mode, consider the following with respect to mailbox and .ost file sizes:

  • Up to 5 gigabytes (GB): This size should provide a good user experience on most hardware.
  • Between 5 GB and 10 GB: This size is typically hardware dependent. If you have a fast hard disk and a large amount of RAM, your experience will be better. However, slower hard drives, such as drives that are typically found on portable computers or early generation solid state drives (SSDs), experience some application pauses when the drives respond.
  • More than 10 GB: This is the size at which short pauses begin to occur on most hardware.
  • Very large, such as 25 GB or larger: This size increases the frequency of the short pauses, especially while you are downloading new e-mail. Alternatively, you can use Send/Receive groups to manually sync your mail.

Note

This guidance is based on the installation of a cumulative update for Outlook 2007 Service Pack 1 or later, as described in Microsoft Knowledge Base Article 961752, Description of the Outlook 2007 hotfix package (Outlook.msp): February 24, 2009.

If you experience performance-related issues with your Outlook 2007 in Cached Exchange Mode deployment, see Knowledge Base Article 940226, How to troubleshoot performance issues in Outlook 2007.

Both Outlook Web App and Outlook in Online Mode store indexes on and search against the server's copy of the data. For moderately sized mailboxes, this results in approximately double the IOPS per mailbox of a comparably sized Cached Exchange Mode client. The IOPS per mailbox for large mailboxes is even higher. The first time you sort a view in a new way, an index is created, causing many read I/Os to the database LUN. Subsequent sorts on an active index are inexpensive.

A challenging scenario is when a user has gone beyond the number of indexes that Exchange will store, which for Exchange 2010 is 11 indexes. When the user chooses to sort a new way, thereby creating a twelfth index, it causes additional disk I/O. Because the index is not stored, this disk I/O cost occurs every time that sort is done. Because of the high I/O that can be generated in this scenario, we strongly recommend storing no more than 100,000 items in core folders, such as the Inbox and Sent Items folders. Creating more top-level folders, or subfolders underneath the Inbox and Sent Items folders, greatly reduces the costs associated with this index creation, so long as the number of items in any one folder does not exceed 100,000.

For more information about the improvements that are available, see Knowledge Base article 968009, Outlook 2007 improvements in the February 2009 cumulative update.

Content Index I/O

In Exchange 2010, messages are indexed as they are received, causing little database disk I/O overhead (since the message is still in the database cache when it is retrieved for indexing). There is, however, write I/O associated with updating the search catalog store. Due to the overall Database I/O reductions in Exchange 2010, the percentage of search catalog I/O is now 10-15% of the database files I/O (depending upon profile). Search catalog read I/O occurs when clients issue search queries and is a rare enough occurrence to not be interesting for Exchange 2010 storage design.

Non-Transactional I/O

Transactional I/O occurs in response to direct user action, usually has the highest priority, and is the focus for storage design. Non-transactional I/O either occurs in the background (where it is tuned to have a minimal performance impact) or occurs during a defined maintenance window.

Background Database Maintenance (Check Summing)

Background database maintenance I/O is sequential database file I/O associated with checksumming both active and passive database copies. Background database maintenance has the following characteristics:

  • On active databases, it can be configured to run either 24x7 or during the online maintenance (OLM) window. Background database maintenance (Checksum) runs against passive database copies 24x7. For more information, see the Online Database Scanning section in the New Exchange Core Store Functionality topic.
  • Reads approximately 5 MB/sec for each actively scanning database (both active and passive copies). The I/O is 100 percent sequential, so the storage subsystem can process the I/Os very efficiently.
  • Will stop scanning the database if the checksum pass completes in less than 24 hours.
  • Will fire a warning event if the scan does not complete within 3 days (not configurable).
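The scan rate and the warning threshold together imply a rough upper bound on how long a checksum pass takes for a given database size. The sketch below estimates that duration. It assumes a sustained, uninterrupted 5 MB/sec scan and 1 GB = 1,024 MB, and the function names are hypothetical.

```python
# Illustrative sketch: estimates checksum pass duration from the ~5 MB/sec
# scan rate and compares it with the 3-day warning threshold.
SCAN_RATE_MB_PER_SEC = 5
WARNING_THRESHOLD_DAYS = 3


def scan_duration_hours(database_size_gb: float) -> float:
    """Hours needed to read the whole database file at the scan rate."""
    seconds = database_size_gb * 1024 / SCAN_RATE_MB_PER_SEC
    return seconds / 3600


def scan_triggers_warning(database_size_gb: float) -> bool:
    """True if a full pass would take longer than the warning threshold."""
    return scan_duration_hours(database_size_gb) > WARNING_THRESHOLD_DAYS * 24


if __name__ == "__main__":
    for size_gb in (1024, 2048):
        print(size_gb, round(scan_duration_hours(size_gb), 1),
              scan_triggers_warning(size_gb))
```

Under these assumptions a 1,024 GB database takes a little over 58 hours to scan, while a 2,048 GB database takes about 116.5 hours and would exceed the 3-day threshold.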

High Availability

  • High Availability Log Replication I/O: Sequential log file I/O associated with replicating database log files from active copies to passive copies.
  • High Availability Log Replay I/O: Sequential/Random I/O associated with inspecting and replaying replicated logs into passive database copies.

Messaging Records Management

Messaging Records Management (MRM) is the records management technology in Exchange 2010 that helps organizations reduce the legal risks associated with e-mail. MRM makes it easier to keep the messages that are needed to comply with company policy, government regulations, or legal needs, and to remove content that has no legal or business value. This is accomplished through the use of retention policies or managed folders. The Managed Folder Assistant (MFA) is a mailbox assistant that runs against a mailbox database. The disk I/O required by the assistant depends on the number of mailbox items processed. We recommend that the assistant not run at the same time as either backup or online maintenance.

Online Maintenance

You can use the Exchange Management Tools to set the maintenance schedule for a database or allow 24x7 database maintenance. Online defragmentation no longer works as it did in previous versions of Exchange. It is now performed continuously while the database is being read from and written to. For more information, see the Online Database Scanning section in the New Exchange Core Store Functionality topic.