Estimate performance and capacity requirements for Microsoft Search Server 2010 Express

 

Applies to: Search Server 2010

Topic Last Modified: 2011-08-05

This article contains capacity planning information for Microsoft Search Server 2010 Express.

For general information about capacity planning for Microsoft SharePoint Server 2010, see Capacity management and sizing for SharePoint Server 2010.

In this article:

  • Introduction

  • Search life cycle

  • Test farm characteristics

  • Health and performance data

  • Test results

  • Recommendations

  • Optimizations

  • Troubleshooting performance and scalability

Introduction

This article describes capacity planning information for a collaboration environment deployment of Search Server 2010 Express and includes the following specifications that were used during the performance and capacity testing of Search Server 2010 Express:

  • Test environment specifications, such as hardware, farm topology, and configuration

  • The workload used for data generation that includes the number and class of users, and farm usage characteristics

  • Test farm dataset, which includes database contents, search indexes and external data sources

  • Health and performance data that is specific to the tested environment

This article also contains common test data and recommendations for how to determine the hardware, topology and configuration that you need to deploy a similar environment, and how to optimize the environment for appropriate capacity and performance characteristics.

Search Server 2010 Express contains a richer set of features than earlier versions. Before you use this architecture to deliver more powerful features and functionality to users, carefully consider the effect of the architecture on the farm’s capacity and performance.

This article describes how to do the following:

  • Define performance and capacity targets for the environment.

  • Plan the hardware that is required to support the number of users, and the features that you intend to deploy.

  • Design the physical and logical topology for optimal reliability and efficiency.

  • Test, validate, and scale up or scale out your environment to achieve performance and capacity targets.

  • Monitor the environment for key indicators.

Before you read this article, we recommend that you read the following articles:

Search life cycle

This article helps you estimate capacity at an early stage of the farm. Farms move through the following stages of the search life cycle as content is crawled:

Index acquisition   This is the first stage of data population. The duration of this stage depends on the size of the content sources. It is characterized by the following:

  • Full crawls (possibly concurrent) of content.

  • Close monitoring of the crawl system to ensure that hosts that are being crawled are not a bottleneck for the crawl.

  • Frequent master merges that, for each query component, are triggered when a portion of the index has changed.

Index maintenance   This is the most common stage of a farm. It is characterized by the following:

  • There are incremental crawls of all content.

  • For SharePoint Server content crawls, most of the changes that are encountered during the crawl are security changes.

  • There are infrequent master merges that, for each query component, are triggered when a portion of the index has changed.

Index cleanup   This stage occurs when a content change moves the farm out of the index maintenance stage — for example, when a content database or site is moved from one Search service application to another. This stage is triggered when either of the following events occurs:

  • A content source or start address is deleted from a Search service application.

  • A host that is supplying content is not found by the search crawler for a long time.

Test farm characteristics

This section describes the configuration that was used for the test. It also includes the workload, dataset, performance data, and test data for the environment.

This configuration uses a typical search deployment, with the following characteristics:

  • Search is provided for a small amount of content. The initial 340,000 items in the index are typical for a small corpus. The second set of tests in the Test Results section was run at 500,000 items. Adding more items to this farm was not possible because of the 4 GB database size limit imposed by SQL Server Express. To add more items to this farm, up to the recommended maximum of 10 million items, the SQL Server edition would have to be upgraded.

  • Search can be run on only one server in the farm at a time. Scaling out is not supported.

  • The large majority of the user queries can be found in the same 33 percent of the index. This means that most users query for the same terms.

  • The default metadata settings are used. This makes sure that the property databases do not grow at a large rate.

  • Measurements taken on these farms may vary because of network and environmental conditions. We can expect up to a 10 percent margin of error.

  • The farm is not run as an evaluation. Because the server has 16 gigabytes (GB) of RAM, this configuration is meant for production use. Consequently, the PerformanceLevel of the farm is set to PartlyReduced by using the following Windows PowerShell command:

    Get-SPEnterpriseSearchService | Set-SPEnterpriseSearchService -PerformanceLevel "PartlyReduced"
    

Dataset

This section describes the test farm dataset. It includes database contents and sizes, search indexes, and external data sources.

Object Value

Search index size (number of items)

334,467

Size of crawl database

1.97 GB

Size of crawl database log file

0.47 GB

Size of property database

2.54 GB

Size of property database log file

0.149 GB

Size of search administration database

0.076 GB

Size of active index partitions

1.98 GB

Total number of search databases

4

Other databases

SharePoint Config; SharePoint_AdminContent; State_Service; Bdc_Service_db; WSS_UsageApplication; WSS_Content

Hardware settings and topology

This section provides detailed information about the hardware, software, topology, and configuration of the test environment.

Lab hardware and topology

This section describes the hardware that was used for testing.

Important

Because the farm was running pre-release versions of Search Server 2010 Express, the hardware that was used has more capacity than is required under more typical circumstances.

Application servers

Server Combined

Processor

1px4c @ 2.5 GHz (one processor, four cores)

RAM

16 GB

Operating system

Windows Server 2008 R2, 64-bit

Storage

2 x 147 GB 15K SAS: RAID1:OS

2 x 147 GB 15K SAS: RAID1:SQL Data, Index

2 x 147 GB 15K SAS: RAID1:SQL Logs

Number of NICs

1

NIC speed

1 gigabit

Authentication

NTLM

Load balancer type

none

Software version

Search Server 2010 Express (pre-release version)

Services running locally

All services. This includes SQL Server Express

Topology

The test farm was a single server farm, with one server acting as the Web and application server and the database server.

Test topology for Search Server 2010 Express

Workload

This section describes the workload that was used for data generation including the number of users and farm usage characteristics.

Workload characteristics Value

High-level description of workload

Search farms

Average queries per minute

6

Average concurrent users

1

Total number of queries per day

8,640

Timer job load

Search Health Monitoring – Trace Events; Crawl Log Report; Health Analysis Job; Search and Process

Health and performance data

This section contains health and performance data that is specific to the test environment.

Query performance data

The following measurements were taken with 340,000 items in the index. The columns give the measurements that were taken during the specific test, and the results are at the bottom of the following table.

Scorecard metric Query testing

CPU metrics

Average front-end Web server, query server, indexer CPU

99.065%

Reliability

Failure rate

Not performed

Crashes

Not performed

Perf counters

Cache hit ratio (SQL Server)

99.97%

SQL Server locks: average wait time (ms)

0

SQL Server locks: lock wait time (ms)

0

SQL Server locks: deadlocks/sec

0

SQL Server latches: average wait time (ms)

2.983

SQL Server compilations/sec

64.192

SQL Server Statistics: SQL Server recompilations/sec

0.032

Average disk queue length (indexer, front-end Web server, query server)

0.23

Disk Queue Length: Writes (indexer, front-end Web server, query server)

0.013

Disk Reads/sec (indexer, front-end Web server, query server)

12.93

Disk Writes/sec (indexer, front-end Web server, query server)

108.7

Average memory used (indexer, front-end Web server, query server)

22.182%

Max memory used (indexer, front-end Web server, query server)

22.325%

Test results

Number of Errors

0

Query UI latency (75th percentile)

0.256 sec

Query UI latency (95th percentile)

0.303 sec

Query throughput

14.743 requests/sec

Crawl performance data

The following measurements were taken during full crawls of the given content source. The content source size is given in number of items. The columns give the measurements that were taken during the specific crawl, and the results are at the bottom of the table.

Scorecard metric SharePoint Foundation crawl test to reach 140,000 index size

CPU metrics

Average front-end Web server, query server, indexer CPU

56.91%

Reliability

Failure rate

Not performed

Crashes

Not performed

Perf counters

Cache Hit Ratio (SQL Server)

98.57%

SQL Server locks: average wait time (ms)

146.75

SQL Server locks: lock wait time (ms)

201.85

SQL Server locks: deadlocks/sec

0.002

SQL Server latches: average wait time (ms)

13.747

SQL Server compilations/sec

8.161

SQL Server Statistics: SQL re-compilations/sec

0.253

Average disk queue length (indexer, front-end Web server, query server)

9.752

Disk queue length: Writes (indexer, front-end Web server, query server)

3.458

Disk reads/sec (indexer, front-end Web server, query server)

362.94

Disk writes/sec (indexer, front-end Web server, query server)

184.37

Average memory used (indexer, front-end Web server, query server)

27.37%

Max memory used (indexer, front-end Web server, query server)

28.62%

Scorecard metric                  File share         HTTP               SharePoint Foundation
                                  (100,000 items)    (100,000 items)    (140,000 items)

Test results

Number of successes               100,105            100,000            138,595

Number of errors                  79                 0                  12

Portal crawl speed (items/sec)    65.557             53.419             88.165

Anchor crawl speed (items/sec)    770.031            769.231            1108.76

Total crawl speed (items/sec)     60.413             49.95              81.671
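The total crawl speeds in the table are numerically consistent with treating the portal crawl and the anchor crawl as two sequential passes over the same items, so that the per-item times add. The article does not state this relationship. The following Python sketch is an assumption that happens to reproduce the reported totals, and the function name is hypothetical.

```python
def combined_crawl_speed(portal_items_per_sec: float,
                         anchor_items_per_sec: float) -> float:
    """Combine two sequential crawl phases into one overall rate.

    Assumption (not stated in the article): every item passes through
    both phases, so the per-item times (1 / rate) add.
    """
    if portal_items_per_sec <= 0 or anchor_items_per_sec <= 0:
        raise ValueError("crawl speeds must be positive")
    seconds_per_item = 1.0 / portal_items_per_sec + 1.0 / anchor_items_per_sec
    return 1.0 / seconds_per_item


# Portal and anchor crawl speeds (items/sec) taken from the table.
measured = {
    "File share": (65.557, 770.031),
    "HTTP": (53.419, 769.231),
    "SharePoint Foundation": (88.165, 1108.76),
}

for source, (portal, anchor) in measured.items():
    total = combined_crawl_speed(portal, anchor)
    print(f"{source}: {total:.3f} items/sec")
```

The computed values agree with the reported totals (60.413, 49.95, and 81.671 items/sec) to within about 0.01 items/sec, which suggests that the slower portal crawl dominates the overall rate.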

Test data

This section contains test data showing how the farm performed under load at different index sizes.

Query latency

The following graph displays the query latency percentiles for this farm as user load increases. A query percentile of 95 percent means that 95 percent of the query latencies measured were lower than that value.

Query latency for farm as user load increases

From this graph, you can see that there is little difference at light loads between 340,000 items and 500,000 items in the index.
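The percentile definition used for the latency graph can be illustrated with a small calculation. The following Python sketch uses the nearest-rank method, which is one common convention and not necessarily the one used by the test tooling. The sample latencies are hypothetical and are not measurements from this farm.

```python
import math


def latency_percentile(latencies_sec: list[float], percent: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    `percent` percent of all samples are at or below it."""
    if not latencies_sec:
        raise ValueError("need at least one latency sample")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    ordered = sorted(latencies_sec)
    rank = math.ceil(percent / 100.0 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


# Hypothetical query latencies in seconds, for illustration only.
samples = [0.21, 0.19, 0.25, 0.30, 0.22, 0.24, 0.28, 0.20, 0.23, 0.26]

print("75th percentile:", latency_percentile(samples, 75))
print("95th percentile:", latency_percentile(samples, 95))
```

Reporting the 75th and 95th percentiles rather than the mean keeps a few slow queries from being hidden by many fast ones.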

Query throughput

The following graph shows the query throughput for this farm as user load increases (collected during the query throughput test).

Query throughput for farm as user load increases

This graph shows that when user load is added beyond approximately eight concurrent users, this farm will achieve no additional throughput, and latency will increase.

Crawl rate

The following graph displays the crawl rate for this farm during the index acquisition stage of the search life cycle at different existing index sizes. The values represent crawl rates from full crawls of the same content sources, in items crawled per second. The two measurements are taken with the same farm. The only difference is that for the second run the PerformanceLevel was changed from Reduced to PartlyReduced by using the following Windows PowerShell cmdlets:

    Get-SPEnterpriseSearchService | Set-SPEnterpriseSearchService -PerformanceLevel "PartlyReduced"

Crawl rate during the index acquisition stage

By default, PerformanceLevel is set to Reduced in Search Server 2010 Express to suit the evaluation scenario, in which the product is likely running on a server that has minimal resources available. This graph shows that setting the PerformanceLevel to PartlyReduced resulted in a large performance improvement because the server had the recommended resources available.

Test results summary

During the query tests, the CPU was near capacity on the server. Adding more CPU cores to the server is the first step in improving performance. Also, the property database hit the 4 GB database limit of SQL Server Express at over 500,000 items in the index. Moving to another edition of SQL Server that does not have the 4 GB limit would be the first step in scaling up the farm to the eventual 10 million item limit. Subsequent steps would include scaling up the hardware (for example, more processors, more RAM, and faster storage).
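The relationship between the 4 GB database limit and the approximate 500,000 item ceiling can be checked with a rough extrapolation. The following Python sketch assumes that the property database grows linearly with the number of indexed items and ignores log files. Both are simplifying assumptions, and the function name is hypothetical.

```python
SQL_EXPRESS_DB_LIMIT_GB = 4.0  # database size limit cited in this article


def estimate_max_items(db_size_gb: float, item_count: int,
                       limit_gb: float = SQL_EXPRESS_DB_LIMIT_GB) -> int:
    """Linearly extrapolate how many items fit before the database
    reaches limit_gb (assumes a constant size per item)."""
    if db_size_gb <= 0 or item_count <= 0:
        raise ValueError("database size and item count must be positive")
    gb_per_item = db_size_gb / item_count
    return int(limit_gb / gb_per_item)


# Property database from the Dataset section: 2.54 GB at 334,467 items.
print(estimate_max_items(2.54, 334_467))
```

With the dataset values, the estimate is a little over 500,000 items, in line with the point at which the property database reached the limit during testing.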

Recommendations

This section contains recommendations for how to determine the hardware, topology, and configuration that you need to deploy an environment that resembles this scenario, and how to optimize the environment for appropriate capacity and performance characteristics.

Hardware recommendations

For specific information about overall minimum and recommended system requirements, see Hardware and software requirements (SharePoint Foundation 2010). Note that requirements for servers that are used for search supersede the overall system requirements. Use the following recommended guidelines for RAM, processor, and IOPS to meet performance goals.

Search sizing

This section explains the search system, including sizing requirements and guidelines, per component.

Search query system

This section describes the components of the search query system for a given Search service application. The sizing requirements for each component appear in the Scaling details table later in this section.

Components of the search query system

The following list defines the search query system objects that are in the previous diagram:

  • Search proxy   This is the Search service application proxy that is installed on any farm that consumes search from this Search service application. It runs in the context of the Web applications that are associated with the Search service application proxy.

  • Search Query and Site Settings Service   This is also known as the query processor. Receiving the query from a Search service application proxy connection, a query processor does the following:

    • Sends the query to one active query component for each partition (or to the property database or both, depending on the query).

    • Retrieves Best Bets and removes duplicates to get the results set.

    • Security-trims the results based on security descriptors in the search administration database.

    • Retrieves metadata of the final results set from the property database.

    • Sends the query results back to the proxy.

  • Index partition   A logical concept that represents the full-text index. The query component contains the index.

  • Search query component   A query component contains the full-text index. When queried by a query processor, the query component determines the best results from its index and returns those items.

  • Search administration database   Created at the same time as the Search service application, the search administration database contains the Search service application-wide data that is used for queries like Best Bets and security descriptors in addition to application settings that are used for administration.

  • Property database   A property database contains the metadata (title, author, related fields) for the items in the index. The property database is used for property-based queries in addition to retrieving metadata that is needed for displaying the final results.

Scaling details

Object Scale considerations RAM IOPS (read/write)

Search proxy

This scales with the front-end Web servers with which it is associated.

N/A

N/A

Search Query and Site Settings Service

This service, which is installed in the Services on Server page in Central Administration, should be started on the server. If a custom security trimmer is used, it might affect CPU and RAM resources. For more information, see Using a Custom Security Trimmer for Search Results (https://go.microsoft.com/fwlink/p/?LinkId=223861).

This uses RAM (process cache) for caching security descriptors for the index.

N/A

Index partition

N/A

N/A

N/A

Query component

The query component on a server consumes memory when serving queries. It consumes IO during a crawl.

A query component should have 33 percent of its index in RAM (OS cache).

1 K needed to do the following:

  • Load the index into RAM for queries.

  • Write index fragments received from each crawl component.

  • Merge index fragments into the query component index, such as during a master merge.

Search administration database

For each query, Best Bets and security descriptors are loaded from the search administration database. Ensure the database server has enough RAM to serve Best Bets and security descriptors from cache.

Ensure that the database server has enough RAM to keep the important table (MSSSecurityDescriptors) in RAM.

700

Property database

For each query, metadata is retrieved from the property database for the document results. Adding RAM to the database server can therefore improve performance.

Ensure that the database server has enough RAM to keep 33% of the important tables (MSSDocSDIDs + MSSDocProps + MSSDocresults) in cache.

2 K

30% read, 70% write
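The 33 percent guidelines in the scaling details can be turned into a rough RAM estimate. The following Python sketch applies that fraction to the sizes reported in the Dataset section. Using the whole property database size as an upper bound for the important tables is an assumption, because the article does not give the individual table sizes, and the function name is hypothetical.

```python
def cache_ram_gb(data_size_gb: float, cached_fraction: float = 0.33) -> float:
    """RAM needed to keep `cached_fraction` of a data set in cache."""
    if data_size_gb < 0:
        raise ValueError("data size must not be negative")
    if not 0 <= cached_fraction <= 1:
        raise ValueError("cached_fraction must be between 0 and 1")
    return data_size_gb * cached_fraction


# Active index partitions: 1.98 GB (Dataset section).
index_ram_gb = cache_ram_gb(1.98)

# Property database: 2.54 GB (Dataset section). The important tables are
# smaller than the whole database, so this is an upper bound (assumption).
property_ram_upper_gb = cache_ram_gb(2.54)

print(f"Query component OS cache: {index_ram_gb:.2f} GB")
print(f"Property database cache (upper bound): {property_ram_upper_gb:.2f} GB")
```

Both figures are well below the 16 GB of RAM on the test server, which matches the low memory use recorded in the health and performance data.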

Search crawl system

This section describes the components of the search crawl system and shows the sizing requirements of each component.

Components of the search crawl system

The following list defines the search crawl system objects in the previous diagram:

  • Administration component   An administration component is used when a crawl is started, in addition to when an administration task is performed on the crawl system.

  • Crawl component   A crawl component processes crawls, propagates the resulting index fragment files to query components, and adds information about the location and crawl schedule for content sources to its associated crawl database.

  • Search Administration database   The search administration database, which is created at the same time as the Search service application, stores the security descriptors that are discovered during the crawl, in addition to the application settings that are used for administration.

  • Crawl database   A crawl database contains data that is related to the location of content sources, crawl schedules, and other information that is specific to crawl operations. Crawl databases can be dedicated to specific hosts by creating host distribution rules. A crawl database only stores data. The crawl component that is associated with the given crawl database does the crawling.

  • Search query system   Returns results for a query. For more information, see Search query system.

Scaling details

Object Scale considerations RAM IOPS (read/write)

Administration component

The single administration component is not scalable.

Minimal

Minimal

Crawl component

Crawl components use CPU bandwidth aggressively. Optimally, a given crawl component can use four CPU cores. RAM is not as important.

Moderate

Note

When crawling East Asian documents, RAM requirements increase because of the word breakers.

300-400

Search administration database

For each query, Best Bets and security descriptors are loaded from the search administration database. Ensure the database server has enough RAM to serve Best Bets and security descriptors from cache.

Ensure the database server has enough RAM to keep the important table (MSSSecurityDescriptors) in RAM.

700

Crawl database

A crawl database aggressively uses IO bandwidth. RAM is not as important. A crawl database needs 3.5 K IOPS for crawling activities; it will consume as much as 6 K IOPS, based on the available bandwidth.

Moderate

3.5 K – 6 K

73% read, 27% write
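The crawl database figures can be broken down into read and write IOPS when sizing storage. The following Python sketch applies the 73 percent read share from the scaling details to the 3.5 K and 6 K totals. The function name is hypothetical, and results are rounded to whole operations.

```python
def split_iops(total_iops: int, read_fraction: float) -> tuple[int, int]:
    """Split a total IOPS figure into (reads, writes) by read share."""
    if total_iops < 0:
        raise ValueError("total_iops must not be negative")
    if not 0 <= read_fraction <= 1:
        raise ValueError("read_fraction must be between 0 and 1")
    reads = round(total_iops * read_fraction)
    return reads, total_iops - reads


# Crawl database: 3.5 K to 6 K IOPS, 73% read and 27% write.
for total in (3500, 6000):
    reads, writes = split_iops(total, 0.73)
    print(f"{total} IOPS -> {reads} reads/sec, {writes} writes/sec")
```

The same function can be applied to the property database figures (2 K IOPS, 30 percent read) from the query system scaling details.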

Software limits

The following table shows software boundaries that are imposed to support an acceptable search experience.

Object Limit Notes

SharePoint Search Service Application

Does not apply to Search Server 2010 Express configuration.

N/A

Indexed documents

Using SQL Server Express, this is limited to approximately 500,000 items. Otherwise, the overall recommended maximum is 10 million items.

SQL Server Express imposes a 4 GB database size limit, which translates to about 500,000 items in the search corpus. Moving to a different SQL Server edition would eliminate this limitation.

Index partition

Does not apply to Search Server 2010 Express.

Property database

Does not apply to Search Server 2010 Express.

Crawl databases

Does not apply to Search Server 2010 Express.

Crawl components

Does not apply to Search Server 2010 Express.

Query components

Does not apply to Search Server 2010 Express.

Concurrent crawls

Recommended limit is 2 per Search service application.

The number of crawls underway at the same time. Crawls are very expensive search tasks which can affect database and other application load; exceeding 2 concurrent crawls may cause the overall crawl rate to decrease.

Content sources

Recommended limit of 50 content sources.

The recommended limit can be exceeded up to the hard limit of 500 per Search service application. However, fewer start addresses should be used, and the concurrent crawl limit must be followed.

Start addresses

Recommended limit of 100 start addresses per content source.

The recommended limit can be exceeded up to the hard limit of 500 start addresses per content source. However, fewer content sources should be used. A better approach when you have many start addresses is to add them as links on an HTML page, and then have the HTTP crawler crawl the page and follow the links.

Crawl rules

Recommended limit of 100

The recommendation can be exceeded. However, display of the crawl rules in search administration degrades.

Crawl logs

Recommended limit of 10 million per application

The number of individual log entries in the crawl log. It will follow the indexed documents limit.

Metadata properties recognized per item

The hard limit is 10,000.

The number of metadata properties that, when an item is crawled, can be determined, and potentially mapped and used for queries.

Crawled properties

500,000

Properties that are discovered during a crawl.

Managed properties

100,000

Properties used by the search system in queries. Crawled properties are mapped to managed properties. We recommend a maximum of 100 mappings per managed property. Exceeding this limit may decrease crawl speed and query performance.

Scopes

Recommended limit of 200 per site

This is a recommended limit per site. Exceeding this limit may decrease the crawl efficiency and affect end-user browser latency if the scopes are added to the display group. Also, display of the scopes in search administration degrades as the number of scopes increases past the recommended limit.

Display groups

25 per site

These are used for a grouped display of scopes through the user interface. Exceeding this limit starts to degrade the search administration scope experience.

Scope Rules

Recommended limit is 100 scope rules per scope, and a total of 600 per Search service application.

Exceeding this limit will decrease crawl freshness and delay potential results from scoped queries.

Keywords

Recommended limit of 200 per site collection.

The recommended limit can be exceeded up to the maximum (ASP.NET-imposed) limit of 5,000 per site collection with five Best Bets per keyword. Display of keywords on the site administration user interface will degrade. The ASP.NET-imposed limit can be modified by editing the Web.Config and Client.Config files (MaxItemsInObjectGraph).

Authoritative pages

Recommended limit of one top-level authoritative page, and as few as possible second and third level pages to achieve desired relevance.

The hard limit is 200 per relevance level per search service application. However, adding additional pages may not achieve the desired relevance. Add the key site to the first relevance level. Add later key sites to either second or third relevance levels, one at a time, evaluating relevance after each addition to ensure that the desired relevance effect is achieved.

Alerts

Recommended limit of 1,000,000 alerts per Search service application

Results removal

100 URLs in one operation

This is the maximum recommended number of URLs that should be removed from the system in one operation.
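A planned configuration can be compared against the limits in the table before deployment. The following Python sketch encodes a subset of the limits listed in the table. The dictionary keys and the function name are hypothetical labels chosen for illustration and are not product settings.

```python
# Limits taken from the software limits table (subset).
# The key names are hypothetical labels, not product settings.
RECOMMENDED_LIMITS = {
    "concurrent_crawls": 2,
    "content_sources": 50,
    "start_addresses_per_content_source": 100,
    "crawl_rules": 100,
    "scopes_per_site": 200,
    "display_groups_per_site": 25,
    "scope_rules_per_scope": 100,
    "keywords_per_site_collection": 200,
    "urls_removed_per_operation": 100,
}


def exceeded_limits(planned: dict[str, int]) -> dict[str, tuple[int, int]]:
    """Return {name: (planned value, limit)} for every planned value
    that is above its limit. Unknown names are ignored."""
    return {
        name: (value, RECOMMENDED_LIMITS[name])
        for name, value in planned.items()
        if name in RECOMMENDED_LIMITS and value > RECOMMENDED_LIMITS[name]
    }


# Hypothetical plan, for illustration only.
plan = {"concurrent_crawls": 3, "content_sources": 40, "crawl_rules": 150}
for name, (value, limit) in exceeded_limits(plan).items():
    print(f"{name}: planned {value}, limit {limit}")
```

Several of these limits are recommendations rather than hard limits, so a reported value is a prompt to review the notes in the table, not necessarily an error.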

Optimizations

This section describes methods for improving farm performance.

Search crawl system optimizations

In general, you perform search crawl optimizations when users complain about results that should be there but are not, or are there but are stale.

When you try to crawl the content source start address within freshness goals, you can experience the following crawl performance issues:

  • Crawl rate is low because of IOPS bottlenecks in the search crawl subsystem.

  • Crawl rate is low because of a lack of CPU threads in the search crawl subsystem.

  • Crawl rate is low because of slow repository responsiveness.

Each of these issues assumes that the crawl rate is low. Use the search administration reports to establish a baseline for the typical crawl rate for the system over time, taking the current search life cycle stage into account. For more information, see Use search administration reports (SharePoint Server 2010). The following sections describe ways of addressing crawl performance issues when this baseline regresses.
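A baseline comparison can be as simple as checking whether the current crawl rate has fallen below the baseline by more than normal measurement noise. The following Python sketch reuses the 10 percent margin of error mentioned for the test farm as the tolerance. That choice is an assumption, as are the function name and the sample rates.

```python
def crawl_rate_regressed(baseline_items_per_sec: float,
                         current_items_per_sec: float,
                         margin: float = 0.10) -> bool:
    """True when the current crawl rate is below the baseline by more
    than `margin` (a fraction of the baseline)."""
    if baseline_items_per_sec <= 0:
        raise ValueError("baseline must be positive")
    if not 0 <= margin < 1:
        raise ValueError("margin must be in [0, 1)")
    return current_items_per_sec < baseline_items_per_sec * (1.0 - margin)


# Baseline: total crawl speed measured for the file share crawl.
baseline = 60.413

# Hypothetical later measurements, for illustration only.
print(crawl_rate_regressed(baseline, 58.0))  # within the margin
print(crawl_rate_regressed(baseline, 50.0))  # below the margin
```

A separate baseline for each life cycle stage is advisable, because full crawls during index acquisition and incremental crawls during index maintenance have different typical rates.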

Crawl IOPS bottleneck

After you verify that a crawl or property database is a bottleneck, you have to scale out the crawl system to address it using the appropriate resolutions. Always check the crawl database to make sure that it is not the bottleneck. If crawl database IOPS are already bottlenecked, increasing the number of threads does not help.

Crawl CPU thread bottleneck

If you have many hosts and have no other crawl bottlenecks, you have to scale up or scale out the crawl system by using the appropriate resolutions. The crawler can accommodate a maximum of 256 threads per Search service application. We recommend having a quad-core processor to realize the full benefit of the maximum number of threads. When it is conclusively determined that the repository is serving data fast enough (see Crawl bottleneck on repository), the crawl throughput can be increased by requesting data faster from the repository by increasing the number of crawler threads. This can be achieved in the following two ways:

  • Change the indexer performance level to PartlyReduced or Maximum by using the following Windows PowerShell command. Use the Maximum value if you are using a processor with fewer than four cores.

    Get-SPEnterpriseSearchService | Set-SPEnterpriseSearchService -PerformanceLevel "Maximum"
    
  • Use crawler impact rules to increase the number of threads per host. This should consider that the system supports a maximum of 256 threads, and assigning many threads to few hosts might result in slower data retrieval from other repositories.

Crawl bottleneck on repository

Sometimes, when a SharePoint Web application that has many nested site collections or remote file shares is being crawled, the search crawler might be bottlenecked on the repository. A repository bottleneck can be identified if the following two conditions are true:

  • There is a low (less than 20 percent) CPU usage on the server that is running Search Server Express.

  • Many threads (in the worst case, most of them) are waiting for the network.

    To identify this condition, check the OSS Search Gatherer/Threads Accessing Network performance counter.

In this situation, the threads are blocked while they wait for data from the repository. In an environment that has multiple content sources, it might be useful to determine the host whose responsiveness is slow by pausing all other crawls, and then performing a crawl using the content source that has the suspected host as one of its start addresses.
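The two repository bottleneck conditions can be combined into a single check on sampled counter values. The following Python sketch is illustrative only. The 20 percent CPU threshold comes from this section, while interpreting "many threads" as more than half of the crawler threads is an assumption. The function and parameter names are hypothetical.

```python
def repository_is_bottleneck(cpu_percent: float,
                             threads_accessing_network: int,
                             total_threads: int,
                             cpu_threshold: float = 20.0,
                             network_share_threshold: float = 0.5) -> bool:
    """True when CPU usage is low and a large share of crawler threads
    is waiting for the network.

    network_share_threshold is an assumed cutoff for "many threads".
    """
    if total_threads <= 0:
        raise ValueError("total_threads must be positive")
    low_cpu = cpu_percent < cpu_threshold
    network_share = threads_accessing_network / total_threads
    return low_cpu and network_share > network_share_threshold


# Hypothetical counter samples, for illustration only.
print(repository_is_bottleneck(12.0, 40, 48))  # low CPU, most threads waiting
print(repository_is_bottleneck(57.0, 10, 48))  # CPU busy, few threads waiting
```

Averaging the counters over several minutes before applying the check reduces the chance of reacting to a short spike.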

When a problematic host is identified, you must investigate the cause of the slow response times. For SharePoint 2010 Products content in particular, see Divisional portal environment lab study (SharePoint Server 2010), especially information about how to add front-end Web servers which search can use for crawling.

The crawl throughput can be significantly improved by performance tuning the crawled data repositories.

Troubleshooting performance and scalability

This section contains information about how to analyze and troubleshoot query and crawl performance issues, and describes the cause and resolution of common bottlenecks.

Troubleshooting query performance issues

SharePoint Server has an instrumented query pipeline and associated administration reports that can help you troubleshoot server-based query performance issues. For more information, see Use search administration reports (SharePoint Server 2010). This section shows these reports and uses them to explain how to troubleshoot issues on the server. This section also describes tools and guidance that are available to help address client-based (browser) performance issues.

Server-based query issues

Server-based query performance issues can be segregated into the following two levels:

  • Search front-end performance issues

  • Search back-end performance issues

The following two subsections give the details for troubleshooting each of them. Note that these are high-level guidelines.

Front-end performance issues

The first step in troubleshooting front-end query performance is to review the Overall Query Latency search administration report. The following is an example report:

Search admin report of overall query latency

In this report, front-end performance is represented by the following data series:

Server Rendering

This value represents, for the given minute, the average time that is spent per query in the various search Web Parts in the front-end Web server.

Object Model

This value represents, for the given minute, the average time that is spent in communication between the front-end Web server and the search back-end.

Troubleshooting server rendering issues

Server rendering issues can be affected by anything that occurs on the front-end Web server that serves the Search Center results page. In general, you want to understand how much time is required to retrieve the various Web Parts in order to find where the extra latency is being added. Enable the Developer Dashboard on the search results page for detailed latency reporting. For more information, see SharePoint 2010 Logging Improvements – Part 2 (Introducing Developer Dashboard) (https://go.microsoft.com/fwlink/p/?LinkId=223862). Common issues that manifest as excess server rendering latency include the following:

  • Platform issues such as the following:

    • Slow Active Directory Domain Services (AD DS) lookup

    • Slow SQL Server times

    • Slow requests for fetching the user preferences

    • Slow calls to get the user token from secure token service

  • Code-behind issues such as modified search results pages (such as results.aspx) that are checked in but not published.

Troubleshooting object model issues

The object model can be affected by the following issues:

  • Issues with the Windows Communication Foundation (WCF) layer: timeouts and ThreadAbortException errors in WCF calls in the deployment.

  • Issues with communication between the content and service farms (if it is configured).

Back-end performance issues

The first step in troubleshooting back-end query performance is to review the SharePoint Backend Query Latency search administration report. The following is an example report:

Admin report of database server query latency

In this report, back-end performance is represented by the following data series (each is average time that is spent per query, in the given minute), grouped by functional component.

Back-end component | Data series | Description

Query component | Full-text Query | The average time to query the full-text index for results.

Property database | Multiple Results Retrieval | The average time to retrieve document metadata, such as title or author, to appear in the query results.

Property database | Property Store Query | The average time to query the property database for property-based queries.

Search Administration database | Best Bets | The average time to determine whether there are Best Bets available for the query terms.

Search Administration database | High Confidence Results | The average time to retrieve high confidence results for queries.

Query processor | Security Trimming | The average time to remove items the user does not have access to.

Query processor | Duplicate Removal | The average time to remove duplicates.

Query processor | Results Population | The average time to create the in-memory table to be passed back to the object model.
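
When reading this report, it can help to rank the data series by their share of total back-end time so that the matching troubleshooting section is consulted first. The following Python sketch is illustrative only: the function name rank_latency_series and the sample values are hypothetical, and the per-series averages are assumed to have been read manually from the report.

```python
# Map each data series in the report to its back-end component (from the table above).
SERIES_TO_COMPONENT = {
    "Full-text Query": "Query component",
    "Multiple Results Retrieval": "Property database",
    "Property Store Query": "Property database",
    "Best Bets": "Search Administration database",
    "High Confidence Results": "Search Administration database",
    "Security Trimming": "Query processor",
    "Duplicate Removal": "Query processor",
    "Results Population": "Query processor",
}


def rank_latency_series(avg_ms_by_series):
    """Return (series, component, share_of_total) tuples, largest share first."""
    total = sum(avg_ms_by_series.values())
    if total <= 0:
        return []
    ranked = sorted(avg_ms_by_series.items(), key=lambda item: item[1], reverse=True)
    return [(name, SERIES_TO_COMPONENT.get(name, "Unknown"), ms / total)
            for name, ms in ranked]


if __name__ == "__main__":
    # Hypothetical per-query averages (ms) for one minute of the report.
    sample = {"Full-text Query": 120.0, "Security Trimming": 45.0, "Best Bets": 5.0}
    for name, component, share in rank_latency_series(sample):
        print(f"{name} ({component}): {share:.0%} of back-end time")
```

The component with the largest share indicates which of the following troubleshooting sections to start with.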

Troubleshooting query component performance issues

Query components are resource intensive, especially when the component is “active”; that is, when the component responds to query requests. Troubleshooting query component performance is one of the more complex search areas. The following are general areas to consider:

  • The most resource-intensive query component event is the master merge, where shadow indexes are merged with the master index. This event occurs independently for each query component, although usually at the same time across all query components. An example of the effect of the master merge is shown in the SharePoint Backend Query Latency report, at times before 1:30 PM. If this event is affecting end-user query latency, you might define “blackout” periods where a master merge event is avoided unless the percentage of change exceeds the defined limit.

  • Sustained high values for the environment mean that you should probably do the following:

    • Examine the index size for each component on the server. Ensure that the server has enough RAM to allow approximately 33 percent of the sum of index sizes to be cached.

    • Examine the query component IO channel on the server. Ensure you are not experiencing an IO bottleneck.
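
The RAM guideline above is simple arithmetic, sketched here in Python. The function names and the sample index sizes are hypothetical, and the 0.33 default is the approximate fraction stated in this article, not a measured value for any particular farm.

```python
def ram_needed_for_index_cache_gb(index_sizes_gb, cache_fraction=0.33):
    """Return the RAM (GB) needed to cache the given fraction of the summed index sizes."""
    if not 0 < cache_fraction <= 1:
        raise ValueError("cache_fraction must be in the range (0, 1]")
    return sum(index_sizes_gb) * cache_fraction


def has_enough_ram(index_sizes_gb, ram_available_gb, cache_fraction=0.33):
    """Return True if the RAM available for caching covers the guideline."""
    return ram_available_gb >= ram_needed_for_index_cache_gb(index_sizes_gb, cache_fraction)


if __name__ == "__main__":
    # Hypothetical index sizes (GB) for two query components on one server.
    sizes = [40.0, 60.0]
    needed = ram_needed_for_index_cache_gb(sizes)
    print(f"Approximately {needed:.1f} GB of RAM is needed for index caching")
```

The result covers index caching only. Memory for the operating system and other services on the server must be budgeted separately.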

Troubleshooting property database issues

Examine SQL Server health by using concepts in the following article: Storage and SQL Server capacity planning and configuration (SharePoint Server 2010). If you are executing custom queries, you may have to set a QueryHint parameter on the query, to guide the correct query plan.

Troubleshooting search administration database issues

Examine SQL Server health by using concepts in the following article: Storage and SQL Server capacity planning and configuration (SharePoint Server 2010).

Troubleshooting query processor issues

Troubleshoot query processor issues by examining the area of the query processor that is exhibiting the increased query latency:

  • Security trimming

    • For Windows claims, examine the Active Directory connection from the server hosting the query processor.

    • For all cases, the cache size that is used by the query processor can be increased if the latency correlates with a large number of round trips to SQL Server (determined by using SQL Server Profiler). No more than 25 percent of queries should need a SQL Server call to retrieve security descriptors from the search administration database. If more do, adjust the query processor cache size.

  • Duplicate removal:

    • Look at whether you are crawling the same content in multiple locations. Disable duplicate detection in the Search Center.

  • Multiple results retrieval:

Browser-based query issues

Users can be either delighted or exasperated by the speed of search results. Page load time (PLT) is one of the important factors in users’ satisfaction with the search experience. Most of the focus around page-load time is on the server-side, specifically the time that is required for the server to return results. However, client-side rendering can make up a significant part of page-load time and is important to consider.

The search user experience provides sub-second responses for total page-load time. Of that time, client rendering typically takes less than 280 milliseconds, depending on the browser and rendering measurement. This experience gives very fast results.

Customizations to the results experience can easily decrease rendering performance. Search administrators and developers must be vigilant in measuring the rendering time after each modification to ensure performance has not regressed significantly. Every addition to the page, such as a new Web Part or a new cascading style sheets (CSS) style, will increase rendering time on the browser and delay results for the users. However, the amount of delay can vary greatly based on whether you follow best practices when you customize the page.

The following are general guidelines for maintaining browser performance:

  • Basic branding and style customizations to the page should not add more than approximately 25 ms to page-load time. Measure page-load time before and after you implement customizations to observe the change.

  • Users typically notice a change (faster or slower) in an increment of 20 percent. Consider this when you make changes; 20 percent of the standard rendering time is only 50 ms. (Source: Designing and Engineering Time (https://go.microsoft.com/fwlink/p/?LinkId=223864).)

  • Cascading style sheets and JScript are the most common and largest causes of slow rendering performance. If you must have customized cascading style sheets and JScript, you should ensure that they are consolidated into one file each.

  • JScript can load on-demand after the page renders to give the user visible results sooner.

  • The more customizations that are added to the page, the slower it will load. Consider whether the added functionality and style is worth the additional delay in returning of results for users.

In addition to these guidelines, there is a great deal of information on the Internet about how to reduce page-load time and about the effect of slow pages on the user experience.
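
The before-and-after measurement described in the guidelines can be evaluated with a small helper such as the following Python sketch. The function name assess_customization and the sample timings are hypothetical. The 25 ms budget and the 20 percent noticeability threshold are the figures given in this section, and the timings are assumed to come from whichever browser measurement tool you already use.

```python
BRANDING_BUDGET_MS = 25.0   # guideline for basic branding and style customizations
NOTICEABLE_CHANGE = 0.20    # users typically notice a change of about 20 percent


def assess_customization(before_ms, after_ms,
                         budget_ms=BRANDING_BUDGET_MS,
                         noticeable=NOTICEABLE_CHANGE):
    """Compare rendering time before and after a customization."""
    if before_ms <= 0:
        raise ValueError("before_ms must be positive")
    added_ms = after_ms - before_ms
    return {
        "added_ms": added_ms,
        "within_branding_budget": added_ms <= budget_ms,
        "noticeable_to_users": abs(added_ms) / before_ms >= noticeable,
    }


if __name__ == "__main__":
    # Hypothetical measurements: 250 ms before and 310 ms after a customization.
    print(assess_customization(250.0, 310.0))
```

Averaging several measurements for each value before comparing them reduces the effect of run-to-run variation in the browser.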

Troubleshooting crawl performance issues

Search Server 2010 Express can experience bottlenecks in the crawl sub-system as the system moves through the index acquisition, maintenance, and deletion phases. To effectively troubleshoot crawl performance issues, you should use the Search Health Monitoring Reports together with the Common bottlenecks and their causes section to isolate crawl issues.

Troubleshooting during the index acquisition phase

The first place to discover crawl issues is the Crawl Rate Per Content Source health report. In general, the full crawl rate should be more than 35 documents/sec for all types of content sources. When a content source with a suboptimal crawl rate is identified, we recommend the following steps:

  1. Pause all other crawls except the content source under investigation. Did the crawl rate improve beyond the specified 15 to 35 docs/sec goal?

  2. If pausing all other crawls does not help, ensure that the repository that is being crawled is responsive enough and is not the cause of slow crawl. For more information, see Crawl bottleneck on repository.

  3. If the repository is not the bottleneck, determine the bottleneck in the indexer or database servers and optimize around them. For more information, see the sections Crawl IOPS Bottleneck and Crawl CPU Thread Bottleneck.
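
The crawl rate check that starts this procedure can be expressed as a short calculation. The following Python sketch is illustrative: the function names and sample figures are hypothetical, the document count and elapsed time are assumed to be taken from the crawl log or health report, and the 35 docs/sec default is the full crawl guideline stated above.

```python
def crawl_rate_docs_per_sec(documents_crawled, elapsed_seconds):
    """Return the average crawl rate in documents per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return documents_crawled / elapsed_seconds


def is_crawl_rate_suboptimal(documents_crawled, elapsed_seconds, target_docs_per_sec=35.0):
    """Return True when the rate does not exceed the target (the guideline says 'more than')."""
    return crawl_rate_docs_per_sec(documents_crawled, elapsed_seconds) <= target_docs_per_sec


if __name__ == "__main__":
    # Hypothetical content source: 36,000 documents crawled in one hour.
    rate = crawl_rate_docs_per_sec(36_000, 3_600)
    print(f"{rate:.1f} docs/sec, suboptimal: {is_crawl_rate_suboptimal(36_000, 3_600)}")
```

Running the same calculation before and after pausing the other crawls shows whether contention between crawls is the cause.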

Troubleshooting during the content maintenance phase

The primary goal during the content maintenance phase is to keep the search index as fresh as possible. Two of the key indicators for this are index freshness and incremental crawl speed.

  1. Index freshness: Are the crawls finishing in their budgeted time and in compliance with the IT guidelines for index freshness?

  2. Incremental crawl speed: If the index freshness goal is not met, investigate whether the incremental crawl speeds reach 25 docs/sec for the content sources. If the incremental crawl speeds are suboptimal, a bottleneck analysis should be performed on the crawled repository and the crawl subsystem. For more information, see Crawl bottleneck on repository.
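
The relationship between incremental crawl speed and the freshness budget can be estimated as follows. This Python sketch rests on assumptions: the function names, the number of changed items, and the budget are hypothetical, and the estimate treats the crawl rate as constant for the whole crawl.

```python
def incremental_crawl_hours(changed_items, docs_per_sec):
    """Estimate how long an incremental crawl takes at a constant rate."""
    if docs_per_sec <= 0:
        raise ValueError("docs_per_sec must be positive")
    return changed_items / docs_per_sec / 3600.0


def meets_freshness_budget(changed_items, docs_per_sec, budget_hours):
    """Return True if the estimated crawl time fits in the budgeted window."""
    return incremental_crawl_hours(changed_items, docs_per_sec) <= budget_hours


if __name__ == "__main__":
    # Hypothetical: 90,000 changed items, 25 docs/sec, and a 2-hour budget.
    hours = incremental_crawl_hours(90_000, 25.0)
    print(f"Estimated {hours:.2f} hours, within budget: {meets_freshness_budget(90_000, 25.0, 2.0)}")
```

If the estimate exceeds the budget even at the target rate, the crawl schedule or the volume of changes needs attention rather than the crawl speed alone.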

Common bottlenecks and their causes

During performance testing, several common bottlenecks were revealed. A bottleneck is a condition in which the capacity of a particular constituent of a farm is reached. This causes a plateau or decrease in farm throughput.

The following table lists some common bottlenecks and describes their causes and possible resolutions.

Bottleneck Symptom (Performance Counter data) Resolution

Database RAM

Property database, Search administration database exhibit:

  • SQL Server Buffer Manager/ Page Life Expectancy lower than 300 seconds (should be more than 1000 seconds)

  • SQL Server Buffer Manager/ Buffer Cache Hit Ratio lower than 96 percent (should be more than 98 percent)

  • Add more memory to the database server, if it is separate.

  • Defragment the property database, if the weekly defrag rule is disabled.

  • Ensure you are using SQL Server 2008 Enterprise Edition, to enable page compression.

Database server IOPS

A property database or crawl database exhibits:

  • Avg. Disk sec/Read and Avg. Disk sec/Write of approximately 50 ms or more

  • Increase the dedicated number of IOPS for the database:

  • Run SharePoint Health Analyzer (SPHA) property database defragment rule, if it is disabled.

  • Run SPHA crawl database defragment rule.

  • Ensure you are using SQL Server 2008 Enterprise Edition, to enable page compression.

Query component IOPS

The logical disk used for a query component’s index exhibits:

  • Avg. Disk sec/Read and Avg. Disk sec/Write of approximately 30 ms or more for a sustained time (that is, most of the day, not merely during an index merge).

  • Increase the dedicated number of IOPS for the drive used for the query component’s index:

    • Use different storage arrays for different components.

    • Optimize your storage configuration; for example, by adding spindles (disk drives) to the storage array.