Understanding Exchange Search

 

Applies to: Exchange Server 2010 SP3, Exchange Server 2010 SP2

With increasing mailbox sizes and increasing amounts of data being stored in mailboxes in the form of messages and attachments, it's crucial for users to be able to quickly search and locate the messages they need. In Microsoft Exchange Server 2010, you can provision personal archives for your users, helping you reduce or eliminate the use of .pst files. This results in more mailbox data being stored by a user, and it makes searching across the user's primary and archive mailboxes an important productivity tool.

In Exchange 2010, authorized users can use Multi-Mailbox Search to perform mailbox searches across the entire Exchange 2010 organization for complying with electronic discovery (eDiscovery) requests, regulatory audits, or internal investigations. Multi-Mailbox Search also uses the content indexes created by Exchange Search.

Exchange Search is different from full-text indexing available in Exchange Server 2003. Improvements were made to performance, content indexing, and search. New items are indexed almost immediately after they're created or delivered to the mailbox, providing users with a fast, stable, and more reliable way of searching mailbox data. In Exchange 2010 and Exchange Server 2007, content indexing is enabled by default on all mailbox databases, and there's no initial setup or configuration required.

Note

Exchange Search doesn't index public folder databases.

Contents

Content Indexing

Exchange Search Performance

Exchange Search Clients

Advanced Query Syntax

Exchange Search and Attachments

Improvements Over Exchange Server 2003 Content Indexing

Difference Between Exchange Search and Exchange Store Search

Exchange Search and Localization

Exchange Search and Database Availability Groups

Content Indexing

When Exchange Search services are started, Exchange Search determines the search status of all mailbox databases on the Mailbox server. If a mailbox database is mounted and enabled for search, Exchange Search assigns it one of the following status values:

  • New   When the status of a mailbox database is New, Exchange Search creates a content index catalog for the database. After it's created, Exchange Search changes the status of the database to Crawling.

  • Crawling   When the status of a mailbox database is Crawling, Exchange Search indexes the mailboxes in the database. The status remains Crawling until all mailboxes in the database have been indexed. After that, Exchange Search changes the status of the database to Notification.

  • Notification   After the initial crawl has occurred on a database, Exchange Search is notified by the Exchange store of new events such as message creation, delivery, or deletion. Events are added to the notification queue to be indexed. This happens quickly, so the content index is never more than a few minutes out of date. New messages delivered to a mailbox are indexed within a few seconds of delivery.
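The three status values above form a simple one-way progression. The following Python sketch models it under stated assumptions: the names `IndexStatus` and `next_status` and the two input parameters are hypothetical and exist only for illustration. They are not part of any Exchange API.

```python
from enum import Enum


class IndexStatus(Enum):
    NEW = "New"
    CRAWLING = "Crawling"
    NOTIFICATION = "Notification"


def next_status(status, catalog_created=False, mailboxes_remaining=0):
    """Return the status that follows `status` under the rules described above."""
    if status is IndexStatus.NEW:
        # The content index catalog must exist before crawling can begin.
        return IndexStatus.CRAWLING if catalog_created else IndexStatus.NEW
    if status is IndexStatus.CRAWLING:
        # Crawling continues until every mailbox in the database is indexed.
        if mailboxes_remaining == 0:
            return IndexStatus.NOTIFICATION
        return IndexStatus.CRAWLING
    # Notification is the steady state: store events are queued and indexed.
    return IndexStatus.NOTIFICATION


status = IndexStatus.NEW
status = next_status(status, catalog_created=True)
status = next_status(status, mailboxes_remaining=0)
print(status.value)  # Notification
```

The sketch has no transition back from Notification because the text above describes none.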

Exchange Search Performance

Exchange Search offers significant performance improvements compared to full-text indexing in Exchange 2003. The search model has changed from a crawl mode to an always up-to-date mode. Several improvements were made to optimize the use of system resources such as CPU, memory, disk input/output (I/O), and the disk space required for indexes. Together, these changes improve indexing speed by more than 35x. Although a full crawl of the database is much faster, it can still use a significant amount of resources on a Mailbox server, depending on the size of the mailbox database. During more intensive phases, this can disrupt mail flow. Because delivery of mail should take precedence over content indexing, a new throttling feature in Exchange Search automatically throttles indexing for a particular mailbox database or set of databases, reducing disk I/O and CPU utilization.

Exchange Search Clients

Exchange Search provides a service that's consumed by search clients. These clients include Microsoft Outlook, Microsoft Office Outlook Web App, Windows Mobile, the Multi-Mailbox Search feature in Exchange 2010, and Exchange Web Services.

In Outlook 2010 and Office Outlook 2007, Outlook profiles used to configure Outlook features on users' computers can be configured to use Cached Exchange Mode. When Outlook is connected to Exchange in an online mode and accesses the Exchange mailbox, changes such as creation of mailbox items, new mail delivery, and deletion of mailbox items take place on the Mailbox server. In Cached Exchange Mode, Outlook creates a local replica of the Exchange mailbox on the user's computer. This replica is stored in an .ost file in the user's profile. Changes to mailbox items happen in the local replica, which is then synchronized with the Exchange mailbox. For details about Cached Exchange Mode, see About Cached Exchange Mode.

In Cached Exchange Mode, Outlook uses Windows Search, a component built into Windows 7 and Windows Vista. Windows Search performs content indexing and provides search functionality to Outlook. Interfacing with a local content indexing and search service gives Outlook users running in Cached Exchange Mode a more efficient way to search their mailbox. In addition to indexing e-mail in the offline store, Windows Search also indexes other data residing in the file system. For details about Windows Search, see Windows Search.

Outlook 2010 and Outlook 2007 provide your users an easily accessible Instant Search box located on top of the message list pane, so that users can quickly search mailbox content. Additionally, using the Advanced Find feature, users can create more complex search queries using a number of fields and parameters.

Advanced Query Syntax

With the increasing number of e-mail messages received by users, larger mailboxes, and the resulting information overload, the ability to quickly search messages enhances user productivity and boosts satisfaction with e-mail. Using Advanced Query Syntax (AQS), users can quickly create advanced search queries and find the messages they need. AQS search queries can be entered directly in the Instant Search box in Outlook.

For example, to search messages sent by user April Stewart that have attachments and contain the word Contoso in the subject field, a user can use the following search query: From:"April Stewart" HasAttachments:true Subject:Contoso. To further narrow it to unread messages, the user can add the following keyword and value: unread:true. To further narrow it to messages sent by April last month, the user can add the following keyword and value: Sent:lastmonth.
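The query in the preceding example is a space-separated series of keyword:value pairs. As a minimal sketch of how such a string could be assembled programmatically, the following Python helpers join pairs in order and quote any value that contains a space. The function names `aqs_term` and `build_aqs_query` are hypothetical, and the quoting rule is an assumption based on the From:"April Stewart" example. It is not a complete AQS implementation.

```python
def aqs_term(keyword, value):
    """Format one keyword:value pair, quoting values that contain spaces."""
    text = str(value)
    if " " in text:
        # Assumption: multi-word values are wrapped in double quotes,
        # as in From:"April Stewart".
        text = '"' + text + '"'
    return f"{keyword}:{text}"


def build_aqs_query(terms):
    """Join (keyword, value) pairs with spaces, in the order given."""
    return " ".join(aqs_term(keyword, value) for keyword, value in terms)


query = build_aqs_query([
    ("From", "April Stewart"),
    ("HasAttachments", "true"),
    ("Subject", "Contoso"),
])
print(query)  # From:"April Stewart" HasAttachments:true Subject:Contoso
```

Narrowing the search, as described above, amounts to appending further pairs such as ("unread", "true") or ("Sent", "lastmonth") to the list.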

AQS is supported by both Exchange Search on the server and by Windows Search on the desktop. Search queries using AQS work in Outlook 2010 and Outlook 2007 in online and cached modes. In Exchange 2010, users can also use AQS queries in Outlook Web App and Windows Mobile. Exchange Search clients such as Multi-Mailbox Search also support AQS search queries.

Outlook 2010 and Outlook 2007 support a large number of AQS keywords. Exchange Search also supports the keywords shown in the following table.

Exchange Search keywords

Property | Example | Search results
Attachments | attachment:annualreport.pptx | Messages that have an attachment named annualreport.pptx. The use of attachment:annualreport or attachment:annual* returns the same results as using the full name of the attachment.
Cc | cc:paul shen; cc:pauls; cc:pauls@contoso.com | Messages with Paul Shen in the Cc field.
From | from:bharat suneja; from:bsuneja; from:bsuneja@contoso.com | Messages sent by Bharat Suneja.
Keywords in retention policy | retentionpolicy:business critical | Messages that have the Business Critical retention tag applied.
Date when messages expire according to policy | expires:4/1/2010 | Messages that expire on April 1, 2010.
Sent | sent:yesterday | All messages sent yesterday.
Subject | Subject:"patent filing" | All messages where the phrase "patent filing" appears in the Subject field.
To | to:"ben smith"; to:bsmith; to:besmith@contoso.com | Messages that have Ben Smith in the To field.

Exchange Search and Attachments

Exchange Search indexes text content contained in e-mail attachments. Support for different file formats is provided using search filters. Exchange Setup installs a number of search filters by default, providing support for indexing many popular file formats, including Microsoft Office files. For a list of search filters installed by Exchange Setup, see Default Filters for Exchange Search. You can install additional search filters for file formats that you want Exchange Search to index. Search filters for different file formats are available from many partners and third parties. The following applies to indexing:

  • Unsearchable items   When Exchange Search can't index a file because a search filter for the file format isn't installed on the Mailbox server, the item is treated as an unsearchable item. An item may also be marked as unsearchable for other reasons. You can retrieve a list of unsearchable items per mailbox, mailbox database, or Mailbox server by using the Get-FailedContentIndexDocuments cmdlet. For details, see Diagnose Exchange Search Issues. You can also include unsearchable items when you perform a discovery search using Multi-Mailbox Search.

  • Safe list   Certain file types are considered to have no content that can be indexed by Exchange Search. These file types are added to a safe list by creating a null filter value in the registry. Exchange Setup creates a null filter registry value for several file types. Mailbox items containing these file types aren't returned in the list of unsearchable items. For a list of default search filters and default null filter entries, see Default Filters for Exchange Search.

  • Encrypted items   Messages encrypted using S/MIME aren't indexed by Exchange Search. Encrypted messages are returned as unsearchable items if you use the Get-FailedContentIndexDocuments cmdlet.

  • IRM-protected items   Messages protected using Information Rights Management (IRM) are indexed by Exchange Search and included in search results. Messages must be protected by using an Active Directory Rights Management Services (AD RMS) server in the same Active Directory forest as the Exchange 2010 Mailbox server. For details, see Information Rights Management.

Note

In Cached Exchange Mode, attachments are also indexed by Windows Search. Windows Search uses search filters installed on the user's computer.
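The server-side indexing rules above can be summarized as a small decision function. The following Python sketch is illustrative only: the function name `classify_item`, the returned labels, the order of the checks, and the sample file extensions are all assumptions. They are not the filter lists installed by Exchange Setup, and the sketch does not model IRM-protected items.

```python
def classify_item(extension, installed_filters, null_filters, smime_encrypted=False):
    """Return 'indexed', 'skipped', or 'unsearchable' for one attachment."""
    if smime_encrypted:
        # S/MIME-encrypted messages are not indexed and are reported
        # as unsearchable items.
        return "unsearchable"
    ext = extension.lower()
    if ext in null_filters:
        # Safe-list types have no indexable content and are not reported
        # in the list of unsearchable items.
        return "skipped"
    if ext in installed_filters:
        return "indexed"
    # No search filter for this format is installed on the Mailbox server.
    return "unsearchable"


# Illustrative values only, not the defaults installed by Exchange Setup.
INSTALLED = {".docx", ".txt"}
NULL_FILTERS = {".jpg"}

print(classify_item(".DOCX", INSTALLED, NULL_FILTERS))  # indexed
print(classify_item(".jpg", INSTALLED, NULL_FILTERS))   # skipped
print(classify_item(".xyz", INSTALLED, NULL_FILTERS))   # unsearchable
```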

Improvements Over Exchange Server 2003 Content Indexing

The search functionality in Exchange 2003 (content indexing) is replaced with Exchange Search in Exchange 2010. Exchange Search provides the following feature and functionality improvements over content indexing:

  • Utilization of system resources such as CPU, memory, disk I/O, and disk space required for its indexes is improved, which significantly increases overall performance.

  • New messages are typically indexed within 10 seconds of arrival, and query results are returned within seconds.

  • Exchange Search is automatically enabled upon installation and doesn't require any configuration.

  • Attachments can now be indexed. Several attachment types are supported, including Microsoft Office documents, text attachments, and HTML attachments.

  • Indexing is automatically withheld for a specific mailbox database, which reduces the disk I/O load. Also, indexing is automatically withheld for the entire Mailbox server, which reduces both disk I/O and CPU utilization for Exchange Search.

  • There is an easily accessible search bar in Outlook Web App and query builder support in Outlook 2010 and Outlook 2007.

Difference Between Exchange Search and Exchange Store Search

Exchange Search allows you to quickly search text in messages through the use of pre-built indexes. Exchange store search is based on a sequential scan of all the messages in the search scope instead of using the pre-built indexes. The following table compares some of the differences between Exchange Search and Exchange store search.

Exchange Search | Exchange store search
Faster | Slower
Searches the content index created by crawling the mailbox database | Searches the store
Indexes new items within seconds of creation or delivery to a mailbox | May not return newer items
Uses words, phrases, and sentences, ignores punctuation and spaces, not case-sensitive | Searches stream of bytes, finds only exact matches
Supports only prefix searches, doesn't support substring matches | Supports substring matches
Searches attachments using available search filters | Doesn't search within attachments
Can search messages in different languages | Not language-aware
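The difference between word-based prefix matching and substring matching can be seen in a short Python sketch. Both functions are hypothetical simplifications, not the actual Exchange implementations: `index_search` splits text into words and ignores case and punctuation, while `store_search` treats "exact match" as a character-for-character comparison, which is an assumption made for illustration.

```python
import re


def index_search(text, term):
    """Word-based, case-insensitive prefix match (Exchange Search column)."""
    words = re.findall(r"\w+", text.lower())
    return any(word.startswith(term.lower()) for word in words)


def store_search(text, term):
    """Exact substring match over the raw text (Exchange store search column).

    Assumption: 'exact' is modeled as a case-sensitive comparison.
    """
    return term in text


body = "Annual report: Contoso patent filing."

print(index_search(body, "pat"))   # True: "patent" starts with "pat"
print(index_search(body, "tent"))  # False: no word starts with "tent"
print(store_search(body, "tent"))  # True: "tent" occurs inside "patent"
```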

Exchange Search and Localization

Localization support for Exchange Search is limited to scenarios in which the client locale matches the message locale (which must also match the language used in the message body). Exchange Search doesn't support instances where a single message has multiple languages embedded in the body or where the client locale is different from the message locale.

To get consistent results for localized searches, the following must be true:

  • An e-mail message must be written in a single language and that language must match the locale of the message.

  • The search expression must be in a single language.

  • The language must match the locale of the client computer, as identified by the connection to the server.
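The three conditions above can be expressed as a single check. The following Python sketch assumes that languages and locales are represented by the same kind of tag (for example "de-DE"), which is a simplification. The function name and its parameters are hypothetical.

```python
def localized_search_is_consistent(message_locale, body_languages,
                                   query_languages, client_locale):
    """Return True only when all three conditions listed above hold."""
    body = set(body_languages)
    query = set(query_languages)
    # The message body and the search expression must each use one language.
    if len(body) != 1 or len(query) != 1:
        return False
    # The single body language must match the message locale, and the
    # language in use must match the locale of the client computer.
    return (body == {message_locale}
            and query == {client_locale}
            and message_locale == client_locale)


print(localized_search_is_consistent("de-DE", ["de-DE"], ["de-DE"], "de-DE"))  # True
print(localized_search_is_consistent("de-DE", ["de-DE", "en-US"], ["de-DE"], "de-DE"))  # False
```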

Exchange Search and Database Availability Groups

In organizations that have a database availability group (DAG), during the seeding process, DAG members with a passive mailbox database copy replicate the content index catalog from the DAG member that has the active mailbox database copy. The content index is typically 10 percent of the size of the mailbox database. After initial seeding, the server with the passive database copy gets message data from the server with the active database and performs content indexing locally. The bandwidth used for copying message content for indexing is in addition to the bandwidth used for replication of transaction logs. When planning a high availability deployment, you must consider the bandwidth used by Exchange Search.
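As a rough planning aid, the typical 10 percent ratio mentioned above can be turned into a quick estimate of how much content index data is copied during seeding. The following Python sketch is a back-of-the-envelope calculation only. The function name is hypothetical, the ratio is the typical figure rather than a guarantee, and the sketch doesn't replace the calculator described below.

```python
def estimate_content_index_gb(database_size_gb, index_percent=10):
    """Estimate the content index size from the mailbox database size."""
    return database_size_gb * index_percent / 100


# A 500 GB mailbox database at the typical 10 percent ratio.
print(estimate_content_index_gb(500))  # 50.0
```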

The Exchange 2010 Mailbox Server Role Requirements Calculator accounts for content indexing when calculating the bandwidth required in a DAG. For more information about the calculator, including a link to download it, see the Exchange Server Team Blog article Exchange 2010 Mailbox Server Role Requirements Calculator.

To learn more about DAGs, see Understanding Database Availability Groups.

 © 2010 Microsoft Corporation. All rights reserved.