What Causes Exchange Disk I/O

 

Every time data is read from or written to Exchange, disk I/O is generated. Understanding the sources of Exchange disk I/O helps you plan and configure your disk subsystem in a way that maximizes performance. When considering the sources of Exchange disk I/O, focus most of your attention on the I/O behavior that is generated during log file and database file access.

Exchange Data Components

All Exchange data is stored in the Exchange store, which is composed of three major components. The following table lists the three major components of the Exchange store and their effect on disk I/O.

Exchange store components and corresponding effect on disk I/O

Component Why it affects disk I/O

Jet database (.edb file)

The Jet database is used to store all data submitted from MAPI clients. All client activity generated by a MAPI client causes updates to the Jet database.

Stores incoming SMTP mail that contains MAPI information.

Streaming database (.stm file)

Stores attachments and data submitted from IMAP4, NNTP, Microsoft Outlook® Web Access, or SMTP. Pointers are saved in the Jet database so the data can be delivered to MAPI clients upon request.

Stores incoming SMTP mail that does not contain MAPI information.

All Internet protocol client activity causes updates to the streaming database.

Transaction log files (.log files)

All changes made to the database are first committed to transaction log files. This means that any time a user sends or reads a message, and any time a user modifies data stored in their mailbox, that change is written to the transaction log file. The change is immediately committed to the in-RAM database cache, and then copied back to disk when the system’s load permits. Transactions are also read back when a database is mounted.

Because each Exchange store component is written to differently, you will experience better performance if you place the .edb files and corresponding .stm files for one storage group on one volume, and place your transaction log files on a separate volume. The following table lists the disk read and write pattern for each Exchange store component.

Disk read/writes for each Exchange store component

Component I/O Pattern

Jet database (.edb file)

  • Random reads and writes

  • 4 KB page size

Streaming database (.stm file)

  • Normally sequential reads and writes

  • Variable page size that averages 8 KB in production

Note

There are significant numbers of seek operations, so the I/O pattern is neither entirely random nor entirely sequential.

Transaction log files (.log files)

  • 100 percent sequential writes during normal operations

  • 100 percent sequential reads during recovery operations

  • Writes vary in size from 512 bytes to the log buffer size

Note

If public folders reside on the server, additional I/O loads are incurred. However, if no replicas of folder content exist on the server, the I/O generated from having a public folder database is inconsequential relative to the I/O generated by user mailbox access.

Other Activities That Affect I/O

In addition to database file access, there are other activities that result in disk I/O. The following table lists these additional activities and their effect on disk I/O.

Additional activities that affect disk I/O

Activity Why it affects disk I/O

Zero out deleted database pages

If you configure your server to zero out deleted database pages, every time you delete an item from the database, multiple pages are deleted. Exchange then overwrites the deleted pages with zeros. This feature runs only during an online streaming backup, and causes more physical disk I/O during backup.

Content indexing

Excessive paging occurs as the databases are scanned for index updates. This also results in excessive writes to the content indexing file, generating a spike in disk I/O on the disk that houses the content indexing file.

SMTP mail transmission

Inbound and outbound SMTP traffic is written to disk while transiting the system. If this disk is shared with user database or log files, SMTP mail that is being written to or read from disk will contend with the database and/or log files for I/O.

As SMTP messages are spooled from the queue to the first Exchange database, they are converted from MIME to IMail. This generates additional I/O on the log file and database volumes being used by the database.

Paging

When Exchange requires more memory than is physically available, Windows pages information to the page file located on your hard disk. Excessive paging results in excessive disk I/O, which reduces the performance of your Exchange server. For information about troubleshooting paging and kernel memory depletion issues, see Ruling out Memory-Bound Problems (https://go.microsoft.com/fwlink/?LinkId=62575).

MTA message handling

In Exchange 2000 and Exchange 2003, messages that are received during move mailbox operations, or from Exchange 5.5 servers, X.400-based servers, or across EDK connectors, are written to the transaction log files. This adds disk I/O load on your storage system.

Number of Items in a Folder

As the number of items in the core Exchange 2003 folders increases, the physical disk cost to perform some tasks also increases for users of Outlook in online mode. (When Outlook runs in cached mode, indexes and searches are performed on the client.) Sorting your Inbox by size for the first time requires the creation of a new index, which requires many disk I/Os. Later sorts of the Inbox by size are very inexpensive. The number of indexes that you can have is fixed, so users who often sort their folders in many different ways could exceed this limit and cause additional disk I/O.

Number of Mailboxes

In Exchange, database reads can be reduced somewhat by the database cache, which has a fixed size. As you add more mailboxes to an Exchange server, more mailboxes compete for that cache. Every 1,000 users added to an Exchange server over a baseline of 1,000 will increase the database IOPS by 25 percent.
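
As a rough illustration of this rule of thumb, the following sketch turns the 25 percent figure into a multiplier. It assumes the increase is linear (not compounded) and applies pro rata to partial blocks of 1,000 users; the text states neither, so treat both as assumptions. The function name and its parameters are hypothetical.

```python
def cache_contention_factor(total_users: int,
                            baseline_users: int = 1000,
                            increase_per_block: float = 0.25,
                            block_size: int = 1000) -> float:
    """Return a multiplier for database IOPS based on mailbox count.

    Assumption: each block of `block_size` users above the baseline adds
    `increase_per_block` to the multiplier, linearly and pro rata.
    """
    # Servers at or below the baseline get no adjustment.
    extra_users = max(0, total_users - baseline_users)
    return 1.0 + increase_per_block * (extra_users / block_size)


if __name__ == "__main__":
    for users in (1000, 2000, 3000):
        print(users, cache_contention_factor(users))
```

Under these assumptions, a server with 3,000 mailboxes would be planned at 1.5 times the database IOPS of the 1,000-mailbox baseline.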

Exchange Version

When migrating from Exchange 5.5 to Exchange 2000 SP3, you can expect to see a 5 percent increase in database IOPS, all other factors being equal.

When migrating from Exchange 5.5 to Exchange 2003 SP1, you can expect to see a 20 percent decrease in database IOPS, all other factors being equal.
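
The two migration figures can be expressed as multipliers relative to an Exchange 5.5 measurement. This is only a sketch: the dictionary and function names are hypothetical, and the input is assumed to be a database IOPS figure you measured on Exchange 5.5.

```python
# Multipliers relative to Exchange 5.5, taken from the percentages in the text.
VERSION_IOPS_FACTOR = {
    "Exchange 5.5": 1.00,
    "Exchange 2000 SP3": 1.05,  # 5 percent increase
    "Exchange 2003 SP1": 0.80,  # 20 percent decrease
}


def migrated_iops(exchange55_iops: float, target_version: str) -> float:
    """Estimate database IOPS after migration, all other factors being equal."""
    return exchange55_iops * VERSION_IOPS_FACTOR[target_version]


if __name__ == "__main__":
    for version in VERSION_IOPS_FACTOR:
        print(version, migrated_iops(1000.0, version))
```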

Single Instance Storage

In Exchange 5.5, there is one database on the server. Mail sent to multiple mailboxes on that server is stored only once, with pointers delivered to each recipient. In Exchange 2000 and Exchange 2003, you can have up to 20 databases, and a message is stored once in every database that holds at least one of its recipients. Each additional database adds an additional 2 percent to the database IOPS. How well Exchange utilizes single instance storage depends on how often messages are sent to recipients on the same database, and on the average message size. Larger messages benefit more from single instance storage.
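
The per-database overhead can be sketched the same way. The example below assumes that the first database is the baseline and that each further database adds 2 percent additively (not compounded); the text does not say which. The function name is hypothetical.

```python
def database_count_factor(database_count: int,
                          increase_per_database: float = 0.02) -> float:
    """Return a database IOPS multiplier for the number of databases.

    Assumption: one database is the baseline; every additional database
    adds `increase_per_database` to the multiplier.
    """
    if database_count < 1:
        raise ValueError("database_count must be at least 1")
    return 1.0 + increase_per_database * (database_count - 1)


if __name__ == "__main__":
    for count in (1, 5, 20):
        print(count, round(database_count_factor(count), 2))
```

With the maximum of 20 databases, this reading gives 19 additional databases and therefore a multiplier of about 1.38.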

BlackBerry

In Exchange 2000 and Exchange 2003, users who have BlackBerry devices place additional demands on the server. In the field, many customers see a two- to fourfold increase in database disk I/O. For more information, see the RIM whitepaper.

BlackBerry users cause additional overhead that affects the database IOPS of a server. When RIM tested 1,000 BlackBerry-enabled MMB2 users with BlackBerry Enterprise Server 4, they saw database IOPS increase by a factor of 3.64 over the standard MMB2 user without BlackBerry. This factor could be significantly smaller or larger depending on how BlackBerry devices are used in the environment. The BlackBerry test included: 10 synchronization commands; two memo adds, one modify, one delete; and four task adds. Actual BlackBerry device use will not be this constant, so the effect on actual IOPS may be smaller or greater.

For a mail system consisting of 2,000 heavily used mailboxes, of which 500 are BlackBerry enabled, a total of 3,820 IOPS is projected on the database volume. The formula to calculate this is:

Estimated BlackBerry IOPS per User for User Type × Number of Users

In this example, 1.0 IOPS × 2,000 mailboxes = 2,000 IOPS. If 500 of those users have BlackBerry devices, then those 500 users add 500 mailboxes × 3.64 IOPS = 1,820 IOPS, for a total of 3,820 IOPS.

Using a conservative ratio of two reads for every write (roughly 66 percent reads to 33 percent writes), you would plan for 2,546 read I/O requests and 1,273 write I/O requests per second for your database volume. Every write request is first written to the transaction log file and then written to the database. Approximately 10 percent of the total 3,820 IOPS seen on the database volume will be seen on the transaction log volume (10 percent of 3,820 is 382 IOPS); 1,273 write I/O requests will be written to the database. See the Performance and Scalability guide for in-depth strategies for properly calculating your server size.
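
The arithmetic in this example can be reproduced with a short sketch. It follows the worked example as written, which adds 3.64 IOPS for each BlackBerry user on top of the 1.0 IOPS baseline per mailbox. The function name, parameter names, and dictionary keys are hypothetical, and the defaults are the figures from the example rather than values to rely on for your own environment.

```python
def plan_database_iops(total_mailboxes: int,
                       blackberry_mailboxes: int,
                       base_iops_per_user: float = 1.0,
                       blackberry_extra_iops_per_user: float = 3.64,
                       read_fraction: float = 2 / 3,
                       log_fraction: float = 0.10) -> dict:
    """Project database and log volume IOPS for a mixed user population."""
    # Every mailbox contributes the baseline; BlackBerry users add extra load.
    total = (total_mailboxes * base_iops_per_user
             + blackberry_mailboxes * blackberry_extra_iops_per_user)
    return {
        "total": total,
        "reads": total * read_fraction,          # two reads per write
        "writes": total * (1 - read_fraction),
        "log": total * log_fraction,             # about 10 percent of total
    }


if __name__ == "__main__":
    plan = plan_database_iops(total_mailboxes=2000, blackberry_mailboxes=500)
    # The example in the text truncates the read and write figures.
    print("total :", round(plan["total"]))
    print("reads :", int(plan["reads"]))
    print("writes:", int(plan["writes"]))
    print("log   :", round(plan["log"]))
```

Keeping the ratios as parameters makes it easy to rerun the projection with a measured read/write mix instead of the conservative two-to-one assumption.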

Best Practices for Optimizing Exchange Disk I/O

After familiarizing yourself with the Exchange activities that generate disk I/O, you should organize your storage system in a way that maximizes performance. The following table lists best practices for placement of each of your data files.

Best practices for optimizing disk I/O

Source of Exchange I/O Best Practice to maximize performance

Database files

You should place all database files (.edb and .stm) within a storage group on a single volume that is dedicated to these databases. Disks that hold database files should have fast random access speeds.

Content indexing files

Never place content indexing files on the same disk as the page file (although that is the default location). Because the content indexing file is a random-access file, it can be placed on the same volume as the databases, provided that the disk subsystem can handle the load.

Single Instance Storage

To maximize the benefit of single instance storage, mailboxes that belong to the same workgroups and distribution lists should be housed on the same database.

Transaction log files

Because all transactions are first written to the transaction logs, transaction logs should be on a storage device that has the lowest possible write latency. Generally, the lowest write latency can be obtained by dedicating a RAID-1 or RAID-1+0 set to the logs of a single storage group. This avoids mixing the log data stream with other I/O and keeps write I/O 100 percent sequential, which gives the highest disk throughput and lowest latency. Storage arrays with effective mirrored, battery-backed write-back caching may not exhibit any performance improvement from this isolation, because all write I/O is written first to cache, coalesced, and then written to disk. Write latency is low in that case because the write is returned to the operating system as successful once it reaches the storage cache. Jetstress can be used to measure whether this type of I/O isolation provides better performance for a given storage platform or configuration.

For optimal recoverability, place the transaction logs and databases for a given storage group on different LUNs (logical unit numbers), each configured as RAID-1 or RAID-1+0. In addition, the separate LUNs hosting the logs and database files for the same storage group should not share physical disks within the array. This increases recoverability, allowing you to recover from multiple disk failure scenarios with minimal data loss, because either the disks backing the logs or the disks backing the databases are lost, but not both. If you lose the log LUN, you still have a nearly up-to-date database; if you lose the database LUN, you can restore from backup and play the logs forward to bring the database up to date. If you place both logs and databases from the same storage group on the same LUN, or on different LUNs that share the same physical disks, you can lose both the logs and the database during a multiple disk failure. In that case you lose all changes since the last successful backup. Separating logs and databases within a given storage group onto different physical disks is less important when the Exchange databases and logs are replicated through synchronous replication to an alternate storage array, because the database and logs can be recovered from the alternate storage array after a multiple disk failure.

If your hardware RAID controller has a mirrored, battery-backed, write-back cache, and it allows you to tune the read/write cache ratio, set the ratio to 75 percent write, 25 percent read.

SMTP queue

You should use a RAID-1+0 array with multiple disk spindles for the SMTP queue volume. The number of disk spindles and the size of the write cache should be based on the expected SMTP message throughput of the server.

The SMTP queue should never be on any spindle that performs another function (such as transaction logs, database files, page files, or system files).

Whether a message is destined for a mailbox on the same server or on a remote server has little effect on SMTP queue-related disk I/O.

Page file

For optimal performance, you should place your page file on separate spindles, and it should be located on at least a RAID-1 device. If you lose the disk with the page file, the server will experience a stop error.

MTA queue

The message transfer agent (MTA) queue should never reside on a log or database volume. If your server handles a significant amount of SMTP and/or MTA traffic, you should provide a separate set of spindles for the SMTP and MTA queues.

For example, if you have a computer running Exchange 2003 that contains one storage group with five databases, you should configure the following separate, physical RAID arrays:

  • C:\ - System volume, operating system, Exchange system files - RAID-1 (direct-attached storage, not SAN)

  • D:\ - Page file - RAID-1 (direct-attached storage, not SAN)

  • E:\ - SMTP and MTA queues - RAID-1+0 (SAN)

  • F:\ - Log files from storage group 1 - RAID-1 (SAN)

  • G:\ - Databases from storage group 1 - RAID-1+0 (SAN)
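
The placement rules in the best practices table can be checked mechanically against a layout like the one above. The following is a minimal sketch, not an Exchange tool: the role names, the `Volume` type, and `find_conflicts` are all hypothetical, and it assumes each drive letter maps to its own physical RAID array, as in the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    drive: str
    roles: frozenset  # what the volume holds, using hypothetical role names
    raid: str


# The example layout from the list above.
LAYOUT = [
    Volume("C:", frozenset({"system"}), "RAID-1"),
    Volume("D:", frozenset({"page_file"}), "RAID-1"),
    Volume("E:", frozenset({"smtp_queue", "mta_queue"}), "RAID-1+0"),
    Volume("F:", frozenset({"logs"}), "RAID-1"),
    Volume("G:", frozenset({"databases"}), "RAID-1+0"),
]

# Role pairs that the best practices say should not share spindles.
CONFLICTS = [
    ("logs", "databases"),
    ("smtp_queue", "logs"),
    ("smtp_queue", "databases"),
    ("smtp_queue", "page_file"),
    ("smtp_queue", "system"),
    ("mta_queue", "logs"),
    ("mta_queue", "databases"),
]


def find_conflicts(layout):
    """Return (drive, role_a, role_b) for every rule a volume violates."""
    problems = []
    for volume in layout:
        for role_a, role_b in CONFLICTS:
            if role_a in volume.roles and role_b in volume.roles:
                problems.append((volume.drive, role_a, role_b))
    return problems


if __name__ == "__main__":
    print(find_conflicts(LAYOUT))  # the example layout violates no rule
    shared = [Volume("F:", frozenset({"logs", "databases"}), "RAID-1")]
    print(find_conflicts(shared))
```

A check like this only compares roles per volume. It cannot tell whether two LUNs share physical disks inside a SAN array, which still has to be verified with the storage administrator.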