
Monitoring Mailbox Servers

 

Applies to: Exchange Server 2007 SP3, Exchange Server 2007 SP2, Exchange Server 2007 SP1

Topic Last Modified: 2009-04-13

This topic provides guidance about the most useful performance counters to monitor on servers running Microsoft Exchange Server 2007 with the Mailbox server role installed. When monitoring Exchange 2007 servers, you should know which performance aspects are most important. The counters and threshold values detailed in this topic can be used to proactively identify potential issues and help identify the root cause of issues when troubleshooting.

Although Exchange 2007 relies less upon disk input/output (I/O) than prior versions of Exchange, disk response time is still important for many critical system functions. Disk issues have historically been the primary cause of Exchange performance issues.

Use the counters listed in the following table to determine whether there are any database disk-related issues.

Performance counters for database disks

 

Counter Expected values

LogicalDisk(*)\Avg. Disk sec/Read

PhysicalDisk(*)\Avg. Disk sec/Read

Shows the average time, in seconds, of a read of data from the disk.

Note:
When looking at disks using Perfmon.exe, an understanding of the underlying disk subsystem is key to determining which counters (physical disk or logical disk) to look at. Windows Clustering can use volume mount points to overcome the 26-drive limitation of the operating system, so drives may show up as numbers indicating physical disks rather than having drive letters. For more information about volume mount points, see Volume Mount Points and File Systems.

Should be below 20 milliseconds (ms) at all times on average.

For servers with more than 1,000 users, 20-ms disk times may not be fast enough to return responses to the client to accommodate user load. Check remote procedure call (RPC) averaged latencies to ensure these are within recommended values and adjust the disk subsystem for increased I/Os.

LogicalDisk(*)\Avg. Disk sec/Write

PhysicalDisk(*)\Avg. Disk sec/Write

Shows the average time, in seconds, of a write of data to the disk.

Note:
When looking at disks using Perfmon.exe, an understanding of the underlying disk subsystem is key to determining which counters (physical disk or logical disk) to look at. The Cluster service can use volume mount points to overcome the 26-drive limitation of the operating system, so drives may show up as numbers indicating physical disks rather than having drive letters. For more information about volume mount points, see Volume Mount Points and File Systems.

Should be below 100 ms at all times on average.

If disk write latencies are high, read latencies may also be affected, because read times correlate directly with high write times.
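Perfmon reports the Avg. Disk sec/Read and Avg. Disk sec/Write counters in seconds, while the thresholds in the table above are stated in milliseconds. The following Python sketch shows one way to convert a set of samples and compare the averages against the database disk thresholds. The function names and the sample values are hypothetical and serve only as an illustration.

```python
from statistics import mean

# Thresholds from the database disk table above, in milliseconds.
DB_READ_AVG_MS = 20.0
DB_WRITE_AVG_MS = 100.0


def average_latency_ms(samples_sec):
    """Convert 'Avg. Disk sec/...' samples (in seconds) to a mean in milliseconds."""
    return mean(samples_sec) * 1000.0


def database_disk_ok(read_samples_sec, write_samples_sec):
    """Return (reads_ok, writes_ok) against the database disk thresholds."""
    read_ms = average_latency_ms(read_samples_sec)
    write_ms = average_latency_ms(write_samples_sec)
    return read_ms < DB_READ_AVG_MS, write_ms < DB_WRITE_AVG_MS


# Hypothetical samples, in seconds, as Perfmon would log them.
reads = [0.012, 0.018, 0.025]
writes = [0.040, 0.120, 0.080]
print(database_disk_ok(reads, writes))
```

A single slow sample does not fail the check here because the thresholds in the table apply to the average.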

Performance counters for log disks

 

Counter Expected values

LogicalDisk(*)\Avg. Disk sec/Read

Shows the average time, in seconds, of a read of data from the disk.

Should be below 20 ms on average.

LogicalDisk(*)\Avg. Disk sec/Write

Shows the average time, in seconds, of a write of data to the disk.

Note:
Processes such as sync replication can increase latencies for this counter.

Should be below 10 ms on average.

Performance counters for TEMP/TMP and page file disks

 

Counter Expected values

LogicalDisk(*)\Avg. Disk sec/Read

Shows the average time, in seconds, of a read of data from the disk.

Should be below 10 ms on average.

Spikes (maximum values) should not be higher than 50 ms.

LogicalDisk(*)\Avg. Disk sec/Write

Shows the average time, in seconds, of a write of data to the disk.

Should be below 10 ms on average.

Spikes (maximum values) should not be higher than 50 ms.

Miscellaneous disk counters

 

Counter Expected values

LogicalDisk(*)\Avg. Disk sec/Transfer

For healthy disks, this counter shows approximately 20 ms. Counter values larger than 20 ms, or with large spikes, indicate a possible disk issue (for example, failure or slow speed).

Should be below 20 ms on average.

Spikes (maximum values) should not be higher than 50 ms.
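Several of the disk tables above pair an average threshold with a spike (maximum value) threshold. As a minimal sketch, assuming latency samples have already been converted to milliseconds, both conditions can be checked together. The function name and sample values are hypothetical.

```python
def within_limits(samples_ms, avg_limit_ms, spike_limit_ms):
    """True when the mean is below avg_limit_ms and no sample exceeds spike_limit_ms."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    average = sum(samples_ms) / len(samples_ms)
    return average < avg_limit_ms and max(samples_ms) <= spike_limit_ms


# Avg. Disk sec/Transfer: below 20 ms on average, spikes no higher than 50 ms.
transfer_samples_ms = [5, 8, 45]
print(within_limits(transfer_samples_ms, avg_limit_ms=20, spike_limit_ms=50))

# The same samples fail the stricter TEMP/TMP and page file average of 10 ms.
print(within_limits(transfer_samples_ms, avg_limit_ms=10, spike_limit_ms=50))
```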

When you use Microsoft Office Outlook in MAPI mode, Outlook executes client operations as RPCs between the client and the server. If the user is running in online mode, these RPCs occur synchronously. Any delay by the server in fulfilling these synchronous requests directly affects the user experience and the responsiveness of Outlook. In contrast, most operations that are performed when you run in cached mode occur against the user's local copy of the mailbox or are issued to the server in the form of asynchronous (background) RPCs. Generally, asynchronous RPCs do not affect the responsiveness or overall experience of the Outlook client itself. For more information about slow RPC request processing, see Troubleshooting Slow RPC Request Processing Issues.

Use the counters listed in the following table to determine whether there are any information store RPC processing-related issues.

 

Counter Expected values

MSExchangeIS\RPC Requests

Indicates the overall RPC requests that are currently executing within the information store process.

The maximum value in Exchange 2007 is 500 RPC requests that can execute at any designated time before the information store starts rejecting any new connections from clients.

Should be below 70 at all times.

MSExchangeIS\RPC Averaged Latency

Indicates the RPC latency, in milliseconds, averaged for all operations in the last 1,024 packets.

For information about how clients are affected when overall server RPC averaged latencies increase, see RPC Client Throttling.

Should not be higher than 25 ms on average.

To determine whether certain protocols are causing high overall RPC latencies, monitor MSExchangeIS Client(*)\RPC Average Latency to separate latencies based on client protocol.

Cross-reference MSExchangeIS\RPC Client Backoff/sec to ensure higher latencies are not causing client throttling.

MSExchangeIS\RPC Operations/sec

Indicates the current number of RPC operations that are occurring per second.

Should closely correspond to historical baselines. Values much higher than expected indicate that the workload has changed, while values much lower than expected indicate a bottleneck preventing client requests from reaching the server.

For online mode clients, between 0.75 and 1 IOPS per mailbox indicates a moderate user profile. For more information about how to calculate this value, see the Mailbox Server Storage Design information in the "Understanding IOPS" section of the How to Measure IOPS per Mailbox topic.

Note:
Cached Exchange Mode clients have a slightly higher rate due to other sync-related functions.

MSExchangeIS\RPC Num. of Slow Packets

Shows the number of RPC packets in the past 1,024 packets that have latencies longer than 2 seconds.

Should be less than 1 on average, and should be less than 3 at all times.

MSExchangeIS Client(*)\RPC Average Latency

Shows a server RPC latency, in milliseconds, averaged for the past 1,024 packets for a particular client protocol.

The following is a list of client protocols that can be gathered:

Exchange Administrator

Exchange ActiveSync

Exchange Mailbox Assistants

Exchange Outlook Web Access

Exchange POP-IMAP

Exchange Transport

Exchange Other Clients

Exchange Outlook Anywhere

Exchange Content Indexing

Exchange Availability Service

Exchange Managed Custom Folder Creation

Exchange Management Task

Exchange Monitoring Task

Exchange Unified Messaging

Should be less than 50 ms on average.

Wide disparities between different client types, such as IMAP4, Outlook Anywhere, or Other Clients (MAPI), can help direct troubleshooting to appropriate subcomponents.

MSExchangeIS Client(*)\RPC Operations/sec

Shows which client protocol is performing an excessive amount of RPC Operations/sec.

High IMAP4, POP3, or Outlook Anywhere latency can indicate problems with Client Access servers rather than Mailbox servers. This is especially true when Other Clients (which includes MAPI) latency is lower in comparison.

In some instances, high IMAP latencies could indicate a bottleneck on the Mailbox server in addition to the latencies that the Client Access server is experiencing.

Not applicable.
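The RPC Averaged Latency and RPC Num. of Slow Packets counters in the table above both describe the most recent 1,024 packets. The following Python sketch illustrates how such windowed values behave as old packets fall out of the window. The RpcWindow class is hypothetical and is not how the information store itself computes these counters.

```python
from collections import deque

WINDOW = 1024      # packets considered, per the counter descriptions above
SLOW_MS = 2000.0   # "latencies longer than 2 seconds"


class RpcWindow:
    """Illustrative sliding window over the most recent RPC packet latencies."""

    def __init__(self, size=WINDOW):
        # A bounded deque drops the oldest latency once the window is full.
        self._latencies = deque(maxlen=size)

    def record(self, latency_ms):
        self._latencies.append(latency_ms)

    def averaged_latency(self):
        if not self._latencies:
            return 0.0
        return sum(self._latencies) / len(self._latencies)

    def slow_packets(self):
        return sum(1 for ms in self._latencies if ms > SLOW_MS)


window = RpcWindow()
for latency in [10, 12, 3000, 15]:   # hypothetical latencies, in ms
    window.record(latency)
print(window.averaged_latency(), window.slow_packets())
```

Because the window is bounded, one very slow packet keeps inflating the averaged latency until 1,024 newer packets have been processed.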

As a server application, Exchange responds to client requests and attempts to fulfill them as quickly and efficiently as possible. The following counters reveal the number and character of user requests to aid administrators in determining whether client activity is a major factor in Exchange performance issues.

 

Counter Expected values

MSExchangeIS Mailbox(_Total)\Messages Delivered/sec

Shows the rate that messages are delivered to all recipients.

Indicates current message delivery rate to the store.

Not applicable.

MSExchangeIS Mailbox(_Total)\Messages Sent/sec

Shows the rate that messages are sent to transport.

Used to determine current messages sent to transport.

MSExchangeIS Mailbox(_Total)\Messages Submitted/sec

Shows the rate that messages are submitted by clients.

Used to determine current rate that messages are being submitted by clients.

MSExchangeIS Client(*)\JET Log Records/sec

Shows the rate that database log records are generated while processing requests for the client.

Used to determine current load.

MSExchangeIS Client(*)\JET Pages Read/sec

Shows the rate that database pages are read from disk while processing requests for the client.

Used to determine current load.

MSExchangeIS Client(*)\Directory Access: LDAP Reads/sec

Shows the current rate that the Lightweight Directory Access Protocol (LDAP) reads occur while processing requests for the client.

Used to determine the current LDAP read rate per protocol.

MSExchangeIS Client(*)\Directory Access: LDAP Searches/sec

Shows the current rate that the LDAP searches occur while processing requests for the client.

Used to determine the current LDAP search rate per protocol.

Exchange 2007 introduces a new feature to throttle RPC clients to prevent individual clients from overusing server resources. For details about RPC client backoff, see Understanding Client Throttling.

Use the counters listed in the following table to determine whether there are any RPC client throttling-related issues.

 

Counter Expected values

MSExchangeIS\RPC Client Backoff/sec

Shows the rate that the server notifies the client to back off.

Indicates the rate at which client backoffs are occurring.

Higher values may indicate that the server may be incurring a higher load resulting in an increase in overall averaged RPC latencies, causing client throttling to occur.

This can also occur when certain client user actions are being performed. Depending on what the client is doing and the rate at which RPC operations are occurring, it may be normal to see backoffs occurring.

Not applicable.

MSExchangeIS\Client: RPCs Failed:Server Too Busy/sec

Shows the client-reported rate of failed RPCs (since the store was started) due to the Server Too Busy RPC error.

Should be 0 at all times.

Higher values may indicate RPC threads are exhausted or client throttling is occurring for clients running versions of Outlook earlier than Microsoft Office Outlook 2007.

MSExchangeIS\Client: RPCs Failed:Server Too Busy

Shows the client-reported number of failed RPCs (since the store was started) due to the Server Too Busy RPC error.

Should be 0 at all times.

Because Exchange 2007 Mailbox servers depend on Hub Transport servers for message delivery, these counters are crucial for determining issues in the transport layer.

 

Counter Expected values

MSExchangeIS Mailbox(_Total)\Messages Queued for Submission

Shows the current number of submitted messages that are not yet processed by the transport layer.

Should be below 50 at all times.

Should not be sustained for more than 15 minutes.

This may indicate that there are connectivity issues to the transport servers or that backpressure is occurring.

MSExchangeIS Public(_Total)\Messages Queued for Submission

Shows the current number of submitted messages that are not yet processed by the transport layer.

Should be less than 20 at all times.
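The mailbox queue guidance above combines a value threshold (50) with a duration (15 minutes). One way to read this is that a queue at or above the threshold for longer than 15 minutes needs attention. The following Python sketch applies that interpretation to timestamped samples. The function name, the interpretation, and the sample data are assumptions made for illustration.

```python
from datetime import datetime, timedelta

QUEUE_LIMIT = 50                       # Messages Queued for Submission (mailbox)
MAX_SUSTAINED = timedelta(minutes=15)


def sustained_breach(samples, limit=QUEUE_LIMIT, max_sustained=MAX_SUSTAINED):
    """samples: (timestamp, value) pairs in time order.

    Return True if the value stays at or above limit for longer than max_sustained.
    """
    breach_start = None
    for timestamp, value in samples:
        if value >= limit:
            if breach_start is None:
                breach_start = timestamp
            if timestamp - breach_start > max_sustained:
                return True
        else:
            # The queue drained below the limit, so the breach timer resets.
            breach_start = None
    return False


start = datetime(2009, 4, 13, 9, 0)
five_minutes = timedelta(minutes=5)
backlog = [(start + i * five_minutes, 60) for i in range(5)]   # 20 minutes at 60
print(sustained_breach(backlog))
```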

Exchange is essentially a database application, relying upon transaction logs and database files for data integrity and storage. These counters will indicate issues at the database layer, whether in writing to the database itself, writing to the transaction logs, or in the interaction between the database components themselves.

 

Counter Expected values

MSExchange Database ==> Instances(*)\Log Generation Checkpoint Depth

Represents the amount of work in the log file count that will need to be redone or undone to the database files if the process fails.

Should be below 500 at all times for the Mailbox server role. A healthy server should indicate between 20 and 30 for each storage group instance.

If checkpoint depth increases continually for a sustained period, this is an indicator of either a long-running transaction (which will impact the version store) or of a bottleneck involving the database disks.

Should be below 1,000 at all times for the Edge Transport server role.

MSExchange Database(Information Store)\Database Page Fault Stalls/sec

Shows the rate at which database file page requests require the database cache manager to allocate a new page from the database cache.

This should be 0 at all times.

If this value is non-zero, this indicates that the database is not able to flush dirty pages to the database file fast enough to make pages free for new page allocations.

MSExchange Database(Information Store)\Log Record Stalls/sec

Shows the number of log records that cannot be added to the log buffers per second because the log buffers are full. If this counter is non-zero most of the time, the log buffer size may be a bottleneck.

If I/O log write latencies are high, check for RAID5 or sync replication on log devices.

The average value should be below 10 per second.

Spikes (maximum values) should not be higher than 100 per second.

MSExchange Database(Information Store)\Log Threads Waiting

Shows the number of threads waiting for their data to be written to the log to complete an update of the database. If this number is too high, the log may be a bottleneck.

Should be less than 10 on average.

Regular spikes concurrent with log record stall spikes indicate that the transaction log disks are a bottleneck.

If the value for log threads waiting is more than the spindles available for the logs, there is a bottleneck on the log disks.

MSExchange Database(Information Store)\Version buckets allocated

Shows the total number of version buckets allocated.

The default maximum number of version buckets is 16,384. If version buckets reach 70 percent of the maximum, the server is at risk of running out of version store.

Should be less than 12,000 at all times.

MSExchange Database Instances(*)\I/O Database Reads Average Latency

Shows the average length of time, in milliseconds, per database read operation.

Should be below 20 ms on average.

Spikes (maximum values) should not be higher than 50 ms.

MSExchange Database Instances(*)\I/O Database Writes Average Latency

Shows the average length of time, in milliseconds, per database write operation.

Should be below 50 ms on average.

Spikes of up to 100 ms are acceptable if not accompanied by database page fault stalls.

MSExchange Database(Information Store)\Database Cache Size (MB)

Shows the amount of system memory, in megabytes, used by the database cache manager to hold commonly used information from the database files to prevent file operations. If the database cache size seems too small for optimal performance and there is little available memory on the system (check the value of Memory\Available Bytes), adding more memory to the system may increase performance. If there is ample memory on the system and the database cache size is not growing beyond a certain point, the database cache size may be capped at an artificially low limit. Increasing this limit may increase performance.

Maximum value is RAM minus 2 GB (RAM minus 3 GB for servers with sync replication enabled). This counter and Database Cache % Hit are extremely useful for gauging whether a server's performance problems might be resolved by adding more physical memory.

Use this counter along with store private bytes to determine if there are store memory leaks.

MSExchange Database(Information Store)\Database Cache % Hit

Shows the percentage of database file page requests that were fulfilled by the database cache without causing a file operation. If this percentage is too low, the database cache size may be too small.

Should be over 90% for companies with majority online mode clients.

Should be over 99% for companies with majority cached mode clients.

If the hit ratio is less than these numbers, the database cache may be insufficient.

MSExchange Database\Log Bytes Write/sec

Shows the rate bytes are written to the log.

Should be less than 10,000,000 at all times.

With each log file being 1,000,000 bytes in size, 10,000,000 bytes/sec would yield 10 logs/sec. This may indicate a large message being sent or a looping message.
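Three of the rows in the table above rest on simple arithmetic: the 70 percent version bucket level, the database cache ceiling, and the conversion from log bytes to log files per second. The following Python sketch works through each one using the figures stated in the table. The function names are hypothetical, and the 1,000,000-byte log size is the round figure used in the text.

```python
MAX_VERSION_BUCKETS = 16_384     # default maximum stated above
VERSION_RISK_FRACTION = 0.70     # "70 percent of maximum"
LOG_FILE_BYTES = 1_000_000       # round figure used in the text


def version_bucket_risk_level(maximum=MAX_VERSION_BUCKETS):
    """Number of allocated version buckets at which the server is at risk."""
    return maximum * VERSION_RISK_FRACTION


def logs_per_second(log_bytes_per_sec, log_file_bytes=LOG_FILE_BYTES):
    """Convert Log Bytes Write/sec into transaction log files per second."""
    return log_bytes_per_sec / log_file_bytes


def max_database_cache_gb(ram_gb, sync_replication=False):
    """Database cache ceiling: RAM minus 2 GB, or minus 3 GB with sync replication."""
    reserve_gb = 3 if sync_replication else 2
    return max(ram_gb - reserve_gb, 0)


print(round(version_bucket_risk_level()))
print(logs_per_second(10_000_000))
print(max_database_cache_gb(16), max_database_cache_gb(16, sync_replication=True))
```

The 70 percent level works out to roughly 11,469 buckets, which is close to the 12,000 threshold given for Version buckets allocated.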

The following counters help determine the user load on the server as well as which protocols are in use.

 

Counter Expected values

MSExchangeIS Client(*)\RPC Operations/sec

Shows what client protocol is performing an excessive amount of RPC Operations/sec.

High IMAP4, POP3, or Outlook Anywhere latency can indicate problems with Client Access servers rather than Mailbox servers. This is especially true when Other Clients (which includes MAPI) latency is lower in comparison.

In some instances, high IMAP latencies could indicate a bottleneck on the Mailbox server in addition to the latencies that the Client Access server is experiencing.

Not applicable.

MSExchangeIS Client(*)\RPC Average Latency

Should be less than 50 ms on average.

Wide disparities between different client types, such as IMAP4, Outlook Anywhere, or Other Clients (MAPI), can help direct troubleshooting to appropriate subcomponents.

MSExchangeIS Client(*)\JET Log Records/sec

Shows the rate that database log records are generated while processing requests for the client.

Used to determine current load.

Not applicable.

MSExchangeIS Client(*)\JET Pages Read/sec

Shows the rate that database pages are read from disk while processing requests for the client.

Used to determine current load.

Not applicable.

MSExchangeIS Client(*)\Directory Access: LDAP Reads/sec

Shows the current rate that the LDAP reads occur while processing requests for the client.

Used to determine the current LDAP read rate per protocol.

Not applicable.

MSExchangeIS Client(*)\Directory Access: LDAP Searches/sec

Shows the current rate that the LDAP searches occur while processing requests for the client.

Used to determine the current LDAP search rate per protocol.

Not applicable.

MSExchangeIS Mailbox(_Total)\Messages Delivered/sec

Shows the rate that messages are delivered to all recipients.

Indicates current message delivery rate to the store.

Not applicable.

MSExchangeIS Mailbox(_Total)\Messages Sent/sec

Shows the rate that messages are sent to transport.

Used to determine current messages sent to transport.

Not applicable.

MSExchangeIS Mailbox(_Total)\Messages Submitted/sec

Shows the rate that messages are submitted by clients.

Used to determine current rate that messages are being submitted by clients.

Not applicable.

MSExchangeIS\User Count

Shows the number of users connected to the information store.

Used to determine current user load.

Not applicable.

MSExchangeIS Public(_Total)\Replication Receive Queue Size

Shows the number of replication messages waiting to be processed.

Should be less than 100 at all times.

This value should return to a minimum value between replication intervals.

These counters indicate issues involving client-initiated operations against Exchange mailboxes or public folders.

 

Counter Expected values

MSExchangeIS Mailbox(*)\Slow Findrow Rate

Shows the rate at which the slower FindRow needs to be used in the mailbox store.

Should be no more than 10 for any specific mailbox store.

Higher values indicate applications are crawling or searching mailboxes, which is affecting server performance. These include desktop search engines, customer relationship management (CRM), or other third-party applications.

MSExchangeIS Mailbox(*)\Search Task Rate

Shows the number of search tasks created per second.

Should be less than 10 at all times.

MSExchangeIS\Slow QP Threads

Shows the number of query processor threads currently running queries that are not optimized.

Should be less than 10 at all times.

MSExchangeIS\Slow Search Threads

Shows the number of search threads currently running queries that are not optimized.

Should be less than 10 at all times.

MSExchangeIS Mailbox(*)\Categorization Count

Shows the number of categorizations that exist in the mailbox store. Categorizations are created when a user creates a filtered view or performs a search. When the information store must maintain an excessive number of categorizations, performance can be affected.

Indicates an overall number of restricted search folders and regular search folders in the system. Sharp increases, especially after implementing any third-party application that takes advantage of MAPI interfaces, should be checked.

Not applicable.

As average mailbox sizes increase, Exchange 2007 must devote more resources to indexing, searching, and retrieving data. These counters will help you to determine whether the server has sufficient resources to perform these queries.

 

Counter Expected values

Process(Microsoft.Exchange.Search.ExSearch)\% Processor Time

Shows the amount of processor time that is currently being consumed by the Exchange Search service.

Should be less than 1% of overall CPU typically and not sustained above 5%.

Process(msftefd*)\% Processor Time

Shows the amount of processor time that is being consumed to update content indexing within the store process.

Should be less than 10% of the processor time used by the store process during steady state.

Full crawls will increase overall processing time, but should never exceed overall store CPU capacity. Check throttling counters to determine if throttling is occurring due to server performance bottlenecks.

MSExchange Search Indices(*)\Recent Average Latency of RPCs Used to Obtain Content

Shows the average latency, in milliseconds, of the most recent RPCs to the Microsoft Exchange Information Store service. These RPCs are used to get content for the filter daemon for the specified database.

Should coincide with the latencies that Outlook clients are experiencing.

MSExchange Search Indices(*)\Throttling Delay Value

Shows the total time, in milliseconds, a worker thread sleeps before it retrieves a document from the Microsoft Exchange Information Store service. This is set by the throttling monitor thread.

Indicates the current throttling delay value. If this value is non-zero, this indicates a potential server bottleneck causing delay values to be introduced to throttle the rate at which indexing is occurring.

Not applicable.

MSExchange Search Indices(*)\Average Document Indexing Time

Shows the average, in milliseconds, of how long it takes to index documents.

Should be less than 30 seconds (30,000 ms) at all times.

MSExchange Search Indices(*)\Full Crawl Mode Status

Indicates whether this .mdb file is going through a full crawl (value=1) or not (value=0).

Used to determine if a full crawl is occurring for any specified database.

If CPU resources are high, it is possible content indexing is occurring for a database or set of databases.

Not applicable.

This section applies to Calendar Attendant, Resource Booking Attendant, Out of Office Assistant, and the managed folder assistant.

Use the counters listed in the following table to determine whether there are any mailbox assistant-related issues.

 

Counter Expected values

Process(MSExchangeMailboxAssistants)\% Processor Time

Shows the amount of processor time that is being consumed by mailbox assistants.

Should be less than 5% of overall CPU capacity.

MSExchange Assistants(*)\Mailboxes Processed/sec

Shows the rate of mailboxes processed by time-based assistants per second.

Used to determine the current load on the time-based assistants.

Not applicable.

MSExchange Assistants(*)\Events Polled/sec

Shows the number of events polled per second.

Used to determine the current event polling load.

Not applicable.

MSExchange Assistants(*)\Events in queue

Shows the number of events in the in-memory queue waiting to be processed by the assistants.

Should be a low value at all times. High values may indicate a performance bottleneck.

MSExchange Assistants(*)\Average Event Processing Time in Seconds

Shows the average processing time of the events chosen.

Should be less than 2 at all times.

The following counters help determine the resource booking load on the server.

 

Counter Expected values

MSExchange Resource Booking\Average ResourceBooking Processing Time

Shows the average time to process an event in the Resource Booking Attendant.

Should be a low value at all times. High values may indicate a performance bottleneck.

MSExchange Resource Booking\Requests Failed

Shows the total number of failures that occurred while the Resource Booking Attendant was processing events.

Should be 0 at all times.

The following counters help determine the Calendar Attendant load on the server.

 

Counter Expected values

MSExchange Calendar Attendant\Average Calendar Attendant Processing time

Shows the average time to process an event in the Calendar Attendant.

Should be a low value at all times. High values may indicate a performance bottleneck.

MSExchange Calendar Attendant\Requests Failed

Shows the total number of failures that occurred while the Calendar Attendant was processing events.

Should be 0 at all times.

These counters are useful in isolating and determining issues involving the interface between the Microsoft Exchange Information Store service on the Mailbox server and Hub Transport servers. Unlike Exchange Server 2003, Exchange 2007 communicates with Hub Transport servers via RPC, not Simple Mail Transfer Protocol (SMTP), and therefore latency and queuing are a greater concern.

 

Counter Expected values

MSExchange Store Interface(_Total)\RPC Latency average (msec)

Shows the average latency, in milliseconds, of RPC requests. The average is calculated over all RPCs since exrpc32 was loaded.

Should be less than 100 ms at all times.

MSExchange Store Interface(_Total)\RPC Requests outstanding

Shows the current number of outstanding RPC requests.

Should be 0 at all times.

MSExchange Store Interface(*)\ROP Requests outstanding

Shows the total number of outstanding remote operations (ROP) requests.

Used for determining current load.

Not applicable.

MSExchange Store Interface(*)\RPC Requests Outstanding

Shows the total number of outstanding RPC requests.

Used for determining current load.

Not applicable.

MSExchange Store Interface(*)\RPC Requests failed (%)

Shows the percentage of failed requests in the total number of RPC requests. Here, failed means the sum of failed with error code plus failed with exception.

Should be 0 at all times.

MSExchange Store Interface(*)\RPC Requests Sent/sec

Shows the current rate of initiated RPC requests per second.

Used for determining current load.

Not applicable.

MSExchange Store Interface(*)\RPC Slow Requests (%)

Shows the percentage of slow RPC requests among all RPC requests.

A slow RPC request is one that has taken more than 500 ms.

Should be less than 1 at all times.

MSExchange Store Interface(*)\RPC Slow Requests latency average (msec)

Shows the average latency, in milliseconds, of slow requests.

Used for determining the average latencies of RPC slow requests.

Not applicable.

MSExchangeMailSubmission(*)\Hub Servers In Retry

Shows the number of Hub Transport servers in retry mode.

Should be 0 at all times.

MSExchangeMailSubmission(*)\Successful Submissions Per Second

Determines current mail submission rate.

Not applicable.

MSExchangeMailSubmission(*)\Failed Submissions Per Second

Should be 0 at all times.

MSExchangeMailSubmission(*)\Temporary Submission Failures/sec

Shows the number of temporary submission failures per second.

Should be 0 at all times.
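The RPC Slow Requests (%) counter in the table above is defined as the share of RPC requests that took more than 500 ms, with an expected value below 1. As a minimal sketch, assuming a list of per-request latencies is available, the same percentage can be reproduced as follows. The function name and sample latencies are hypothetical.

```python
SLOW_RPC_MS = 500.0   # a slow RPC request takes more than 500 ms


def slow_request_percent(latencies_ms):
    """Percentage of requests whose latency exceeds SLOW_RPC_MS."""
    if not latencies_ms:
        return 0.0
    slow = sum(1 for ms in latencies_ms if ms > SLOW_RPC_MS)
    return 100.0 * slow / len(latencies_ms)


# Hypothetical workload: 199 fast requests and one slow request.
latencies = [100.0] * 199 + [600.0]
percent = slow_request_percent(latencies)
print(percent, percent < 1)
```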

These counters clearly indicate issues involving the replication engine and replication partners. These issues can be local or remote.

 

Counter Expected values

MSExchange Replication(*)\CopyQueueLength

Shows the number of transaction log files waiting to be copied to the passive copy log file folder. A copy is not considered complete until it has been checked for corruption.

Note:
Monitor this counter on both nodes of a cluster continuous replication (CCR) cluster, because either node can be the passive node.

Should be less than 10 at all times for CCR.

Should be less than 1 at all times for local continuous replication (LCR).

MSExchange Replication(*)\ReplayQueueLength

Shows the number of transaction log files waiting to be replayed into the passive copy.

Note:
Monitor this counter on both nodes of a CCR cluster, because either node can be the passive node.

Indicates the current replay queue length. Higher values cause longer store mount times when a handoff, failover, or activation is performed.

MSExchange Replica Seeder(*)\Seeding Finished %

Shows the finished percentage of seeding. Its value is from 0 to 100 percent.

Used to determine if seeding is occurring for a particular database, which is possibly affecting overall server performance or current network bandwidth.

Not applicable.
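The CopyQueueLength thresholds in the table above differ by replication type: less than 10 for CCR and less than 1 for LCR. The following Python sketch checks readings gathered from each node against the matching limit. The function names, node names, and readings are hypothetical.

```python
# CopyQueueLength limits from the table above ("less than" each value).
COPY_QUEUE_LIMITS = {"CCR": 10, "LCR": 1}


def copy_queue_ok(copy_queue_length, replication_type):
    """True when the copy queue is below the limit for the replication type."""
    limit = COPY_QUEUE_LIMITS[replication_type.upper()]
    return copy_queue_length < limit


def nodes_over_limit(readings, replication_type="CCR"):
    """readings: mapping of node name to CopyQueueLength. Returns offending nodes."""
    return sorted(
        node
        for node, length in readings.items()
        if not copy_queue_ok(length, replication_type)
    )


# Both CCR nodes are checked because either one can hold the passive copy.
print(nodes_over_limit({"NODE1": 3, "NODE2": 25}))
```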

 