Making Searches Effective in Custom Applications

 

The following sections recommend techniques to use when you create custom client applications. The recommendations cover which SQL structures to use, which to avoid, and how to make searches requested through HTTP/DAV more efficient.

For more detailed information about using search requests in custom applications, see the topic "Search" in the Exchange Server 2003 SDK.

Avoid Using SELECT * SQL Statements

Whenever possible, use SQL expressions such as SELECT <required attribute> instead of SELECT *. When Exchange Server 2003 executes a search that uses SELECT *, it checks the schema of the target item to identify the set of attributes to return.

Checking the schema increases the processing cost of the request and accounts for most of the cost required to process the search request.

By using a SELECT <required attribute> statement, you avoid the schema check, and the request returns only the attributes required for the application. This allows later processing to be performed more efficiently.
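As an illustration, the following Python sketch assembles a WebDAV SEARCH request body whose SELECT list names each required property. The helper name build_search_body, the folder path, and the chosen properties are assumptions made for this example; they are not taken from the Exchange SDK.

```python
from xml.sax.saxutils import escape


def build_search_body(folder_url, properties):
    """Return a WebDAV SEARCH body whose SELECT list names each property."""
    if not properties:
        # Refuse to fall back to SELECT *, which would trigger the schema check.
        raise ValueError("list the required properties explicitly")
    columns = ", ".join('"{}"'.format(name) for name in properties)
    sql = "SELECT {} FROM SCOPE('shallow traversal of \"{}\"')".format(
        columns, folder_url
    )
    # escape() protects &, < and > so the SQL text stays well-formed XML.
    return (
        '<?xml version="1.0"?>\n'
        '<D:searchrequest xmlns:D="DAV:">\n'
        "  <D:sql>{}</D:sql>\n"
        "</D:searchrequest>"
    ).format(escape(sql))


body = build_search_body(
    "/exchange/user1/Inbox",  # hypothetical folder path
    ["DAV:displayname", "urn:schemas:httpmail:subject"],
)
print(body)
```

Rejecting an empty property list keeps the helper from silently producing a SELECT * query.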

Avoid Using a WHERE Clause in SQL Statements

Instead of including a WHERE clause in an SQL expression, use a different technique to narrow the range of the search. Removing the WHERE clause reduces the processing cost of the search request and limits the growth of the transaction logs. For example, if the WHERE clause is used to obtain only unread messages, you can improve processing efficiency by first retrieving both read and unread messages and then discarding the read messages in the application.

However, use this method with care. It is less effective for searches in which the WHERE clause greatly reduces the size of the result set, because the application must then retrieve and discard many more items.
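The following Python sketch shows the idea in its simplest form: the query omits the WHERE clause but selects the read flag, and the application discards read messages itself. The folder path, the row layout, and the keep_unread helper are hypothetical and stand in for whatever result parsing your application already performs.

```python
# Query without a WHERE clause; the read flag is selected so the
# client can filter on it after the results arrive.
QUERY_WITHOUT_WHERE = (
    'SELECT "urn:schemas:httpmail:subject", "urn:schemas:httpmail:read" '
    "FROM SCOPE('shallow traversal of \"/exchange/user1/Inbox\"')"
)


def keep_unread(rows):
    """Drop read messages on the client, replacing a server-side WHERE clause."""
    return [row for row in rows if not row["read"]]


# Hypothetical rows, standing in for the parsed result of the query above.
rows = [
    {"subject": "Budget", "read": False},
    {"subject": "Lunch", "read": True},
    {"subject": "Agenda", "read": False},
]
print([row["subject"] for row in keep_unread(rows)])
```

If most rows would be discarded this way, the extra data transferred may outweigh the savings on the server, which is the situation the caution above describes.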

Use PropFind for HTTP/DAV Searches

If you search using HTTP/DAV Search statements or ADO SQL statements, you may be able to use the HTTP/DAV PropFind method to conduct the searches instead. When you use PropFind, you can move some of the search processing load off the Exchange Server 2003 mailbox or public folder server, and reduce the impact of searches on the size of the transaction logs.

If you decide to use PropFind, especially if you are modifying an existing application, test this solution thoroughly to be sure that the results and performance impact are what you expect.

The PropFind method has the following advantages:

  • Growth in the size of the transaction logs can be suppressed.

  • Load can be distributed easily between the Exchange servers and other application servers. The load on the Exchange servers is limited to the PropFind request itself, while the application servers perform related tasks such as authenticating users, creating views, and processing XML result sets.

However, the PropFind method also has the following disadvantages:

  • There is no equivalent of the SQL WHERE clause. You will need to implement an equivalent function when processing the XML result set. For details, see "Example XML Code for a PropFind Search" later in this chapter.

  • The network load may increase. If a large amount of data flows between the Exchange server and the application server, you may need to make sure that these servers are connected by a high-capacity network.

  • This method is difficult to retrofit into an application that has already been built.
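To make the request side concrete, the following Python sketch builds, but does not send, a PropFind request for the items in a folder. The server name, folder URL, and build_propfind helper are hypothetical, authentication is omitted, and the Depth value of 1 (the folder plus its immediate children) is the standard WebDAV header rather than anything specific to this guidance.

```python
import urllib.request

# Minimal request body: ask only for the properties the application needs.
PROPFIND_BODY = (
    "<?xml version='1.0'?>"
    "<d:propfind xmlns:d='DAV:' xmlns:h='urn:schemas:httpmail:'>"
    "<d:prop><d:displayname/><h:subject/><h:read/></d:prop>"
    "</d:propfind>"
)


def build_propfind(folder_url):
    """Build (but do not send) a PROPFIND request for a folder's direct children."""
    return urllib.request.Request(
        folder_url,
        data=PROPFIND_BODY.encode("utf-8"),
        headers={
            "Content-Type": "text/xml",
            "Depth": "1",  # the folder itself plus its immediate children
        },
        method="PROPFIND",
    )


request = build_propfind("http://exchange.example.com/exchange/user1/Inbox/")
print(request.get_method(), request.full_url)
```

An application server would pass the request to urllib.request.urlopen after adding whatever credentials its environment requires, and would then process the XML result set itself.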

The following example XML code shows how you can use PropFind. Rather than using a WHERE clause, you can extract unread messages from the result set using XSL.

Note

This example code does not include all DAV headers.

<?xml version='1.0'?>
<d:propfind xmlns:d='DAV:'
            xmlns:h='urn:schemas:httpmail:'
            xmlns:e='http://schemas.microsoft.com/exchange/'>
   <d:prop>
      <d:href />
      <d:displayname />
      <h:subject />
      <h:sendername />
      <d:creationdate />
      <h:read />
   </d:prop>
</d:propfind>
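The text above suggests XSL for extracting unread messages from the result set. As an alternative sketch of the same filtering step, the following Python code uses the standard library's ElementTree instead. The sample response is hand-written for illustration, and the assumption that the read property arrives as 0 or 1 should be checked against real responses from your server.

```python
import xml.etree.ElementTree as ET

# Prefixes used in the search paths below; only the namespace URIs matter.
NS = {"d": "DAV:", "h": "urn:schemas:httpmail:"}

# Hand-written sample of a multistatus response (illustrative only).
SAMPLE_RESPONSE = """<a:multistatus xmlns:a="DAV:" xmlns:m="urn:schemas:httpmail:">
  <a:response>
    <a:href>http://exchange.example.com/exchange/user1/Inbox/Budget.EML</a:href>
    <a:propstat>
      <a:status>HTTP/1.1 200 OK</a:status>
      <a:prop><m:subject>Budget</m:subject><m:read>0</m:read></a:prop>
    </a:propstat>
  </a:response>
  <a:response>
    <a:href>http://exchange.example.com/exchange/user1/Inbox/Lunch.EML</a:href>
    <a:propstat>
      <a:status>HTTP/1.1 200 OK</a:status>
      <a:prop><m:subject>Lunch</m:subject><m:read>1</m:read></a:prop>
    </a:propstat>
  </a:response>
</a:multistatus>"""


def unread_messages(multistatus_xml):
    """Return href and subject for each response whose read flag is unset."""
    root = ET.fromstring(multistatus_xml)
    unread = []
    for response in root.findall("d:response", NS):
        read = response.findtext("d:propstat/d:prop/h:read", default="", namespaces=NS)
        # Assumption: unread items carry "0" (or "false"); items without
        # a read property, such as folders, are skipped.
        if read.strip() in ("0", "false"):
            unread.append({
                "href": response.findtext("d:href", default="", namespaces=NS),
                "subject": response.findtext(
                    "d:propstat/d:prop/h:subject", default="", namespaces=NS
                ),
            })
    return unread


result = unread_messages(SAMPLE_RESPONSE)
print(result)
```

This is the client-side equivalent of the WHERE clause that PropFind lacks; it runs on the application server, so the Exchange server handles only the PropFind request itself.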

Performance Impact of the Search Scope

Searches can have either a shallow scope (also referred to as shallow traversal) or a deep scope (also referred to as deep traversal). Shallow refers to a search whose scope is limited to a single folder or a set of specific folders. Deep refers to a search whose scope includes a folder and all its subfolders.

The depth of your folder hierarchy and the size of the folders determine whether a deep traversal search performs better than multiple shallow traversal searches. When you run a deep traversal search, Exchange Server 2003 must lock the folder hierarchy to prevent it from changing during the search. This lock may affect other operations that need access to the folders. However, multiple shallow traversal searches take slightly longer to process than a deep traversal search because Exchange Server 2003 uses Active Server Pages (ASP) to create sequential SQL statements.

To optimize search performance, especially if you are building a custom application, use a test topology to simulate the complexity of your folder hierarchy and the usage load that you anticipate. Use the test results to determine which type of search works best for your situation.

For both deep traversal searches and shallow traversal searches, processing time increases with the number of folders searched.

Note

You can use a third type of scope, a hierarchical traversal, instead of a deep traversal if you need only the attributes of a folder and its subfolders. Because it returns only attributes, a hierarchical traversal search completes faster than an equivalent deep traversal search.
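For comparison testing, a small helper can generate the same query with each scope. The following Python sketch assumes the SCOPE('... traversal of "url"') form of the FROM clause; the scoped_query helper and the folder path are hypothetical.

```python
SCOPES = ("shallow", "deep", "hierarchical")


def scoped_query(columns, folder_url, scope="shallow"):
    """Build a SELECT statement with an explicit traversal scope."""
    if scope not in SCOPES:
        raise ValueError("scope must be one of: " + ", ".join(SCOPES))
    column_list = ", ".join('"{}"'.format(name) for name in columns)
    return "SELECT {} FROM SCOPE('{} traversal of \"{}\"')".format(
        column_list, scope, folder_url
    )


# Generate one query per scope against a hypothetical folder so that the
# three variants can be timed against a test topology.
for scope in SCOPES:
    print(scoped_query(["DAV:displayname"], "/public/docs", scope))
```

Running the generated variants against a test topology, as recommended above, indicates which scope performs best for a particular folder hierarchy.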

You can also improve search performance if you use your custom application to explicitly create search folders in addition to specifying the search scope. If an explicitly created folder exists for a search request, Exchange Server 2003 does not have to create the folder while processing the search. This approach can be especially effective for narrowing the search result set for a complex search request.