Performance and capacity monitoring (FAST Search Server 2010 for SharePoint)

07/22/2014

Applies to: FAST Search Server 2010

When monitoring the performance and capacity of a Microsoft FAST Search Server 2010 for SharePoint deployment, there are two main areas of analysis:

Crawling and indexing performance analysis
Query performance analysis

For more information about how to monitor a FAST Search Server 2010 for SharePoint farm in general, refer to Monitor FAST Search Server 2010 for SharePoint. For more information about how to monitor a Microsoft SharePoint Server 2010 farm overall, refer to Monitoring and maintaining SharePoint Server 2010.

For an overview of the relevant performance counters for FAST Search Server 2010 for SharePoint, refer to Performance counters (FAST Search Server 2010 for SharePoint).

Note

This article assumes that you use the SharePoint Server 2010 crawler, indexing connector framework and the FAST Search Server 2010 for SharePoint Content Search Service Application (Content SSA) to crawl content.

Crawling and indexing performance analysis

Content processing chain

The item processing chain in FAST Search Server 2010 for SharePoint consists of the following components that may run on separate servers:

Crawler(s): Any component pushing content into FAST Search Server 2010 for SharePoint
Content distributor(s): Receives content in batches and distributes the batches to item processing and indexing
Document processors (item processing): Converts items to a unified internal format
Indexing dispatcher(s): Distributes processed item batches to index columns
Primary indexer: Generates the searchable index
Backup indexer: Persists a backup of the information in the primary indexer

Content flows as indicated by arrows 1–5 in the item processing chain figure. The last flow, from primary to backup indexer, is an optional deployment choice. Asynchronous callbacks for completed processing propagate in the other direction, as indicated by arrows 6 through 9. Crawlers throttle the crawl rate based on the callbacks (9) that they receive for document batches (1). The slowest component in this chain determines the overall crawl rate.
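The effect of the slowest component can be illustrated with a small calculation. The following Python sketch assumes that you have measured a sustainable throughput, in items per second, for each stage of the chain. The function name and all figures are hypothetical and serve only as an illustration; they are not measurements from a real farm.

```python
def find_bottleneck(stage_rates):
    """Return (stage, rate) for the slowest stage in the chain."""
    if not stage_rates:
        raise ValueError("at least one stage is required")
    stage = min(stage_rates, key=stage_rates.get)
    return stage, stage_rates[stage]


# Hypothetical items-per-second figures, invented for illustration.
example_rates = {
    "crawler": 120.0,
    "content distributor": 400.0,
    "item processing": 75.0,
    "indexing dispatcher": 350.0,
    "primary indexer": 90.0,
}

slowest_stage, crawl_rate = find_bottleneck(example_rates)
print(f"{slowest_stage} limits the crawl to about {crawl_rate:.0f} items/s")
```

In this example, adding capacity to any stage other than item processing would not raise the overall crawl rate.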

Each FAST Search Server 2010 for SharePoint farm has one or more content distributors. These components receive all content in batches, which they pass on to the item processing components. You can ensure good performance by verifying that the following conditions are true:

Item processing components are effectively utilized
Incoming content batches are quickly distributed for processing

You can only achieve maximum throughput when the Content SSA has a constant queue of batches to submit. Each item processing component will use 100% of a CPU core when busy. You can scale item processing components up to one per CPU core.

Indexers are the most write-intensive components in a FAST Search Server 2010 for SharePoint installation, and you must ensure that they have high disk performance. High indexing activity can also affect query matching operations when both run on the same row. Indexers distribute the items across several partitions. Partition 0, and up to three of the other partitions, can have ongoing activity at the same time. During redistribution of items among partitions, one or more partitions might be waiting for other partitions to reach a specific checkpoint.

Tip

You can analyze crawling and indexing performance with several tools, for example Performance Monitor in Windows Server 2008 or Windows Server 2008 R2, or System Center Operations Manager (SCOM). In addition, you can retrieve the indexer status by using the indexerinfo command, for example indexerinfo -a status.
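If you export counter samples to CSV for offline analysis, a short script can turn them into per-counter series. The following Python sketch assumes CSV text in the layout written by the Windows typeperf command-line tool (a header row of counter paths followed by one row per timestamp). The use of typeperf, the host name HOST, and the sample values are assumptions for illustration and are not part of this article.

```python
import csv
import io


def parse_counter_csv(text):
    """Parse counter CSV text into {counter_path: [float samples]}."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return {}
    header, data = rows[0], rows[1:]
    paths = header[1:]  # the first column holds the timestamp
    series = {path: [] for path in paths}
    for row in data:
        for path, value in zip(paths, row[1:]):
            try:
                series[path].append(float(value))
            except ValueError:
                continue  # skip cells that do not hold a number
    return series


# Illustrative sample only; HOST and the values are made up.
sample_text = "\n".join([
    r'"(PDH-CSV 4.0)","\\HOST\Memory\Available MBytes"',
    r'"07/22/2014 10:00:01.000","4096.000000"',
    r'"07/22/2014 10:00:02.000","4000.000000"',
])

sample_series = parse_counter_csv(sample_text)
for path, values in sample_series.items():
    print(path, min(values), max(values))
```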

Crawling performance counters

The following table shows the most important performance counters for the Content SSA. These performance counters are found on the server(s) hosting the Content SSA crawl components, under "OSS Search FAST Content Plugin", and not in the FAST Search Server 2010 for SharePoint farm.

Performance counter	Description
Batches ready	The number of batches that the Content SSA has retrieved from the content sources, and that are ready for passing on to the content distributor. When zero, the FAST Search Server 2010 for SharePoint farm back-end is processing content faster than the Content SSA can crawl.
Batches submitted	The number of batches that the Content SSA has sent to FAST Search Server 2010 for SharePoint, and for which a callback is still pending. When zero, nothing was sent to the FAST Search Server 2010 for SharePoint farm back-end for processing.
Batches open	The total number of batches in some stage of processing.
Items Total	The total number of items crawled by the Content SSA since last service restart.
Available Mbytes	The total amount of available memory on the computer. By default, the Content SSA stops aggregating ready batches when 80% of system memory is in use.
Processor time	Overall CPU usage on the computer. High CPU load could limit the throughput of the Content SSA.
Bytes Total/sec	Overall network usage on the computer. High network load might become a bottleneck for the rate of data that the Content SSA can crawl and push to the FAST Search Server 2010 for SharePoint farm.
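The interpretation rules in this table can be expressed as a simple check. The following Python sketch assumes that you have already read the counter values by some other means. The function name, parameter names, and message texts are hypothetical; only the zero conditions and the 80% default come from the table.

```python
MEMORY_LIMIT = 0.80  # default fraction of system memory, from the table above


def diagnose_content_ssa(batches_ready, batches_submitted, memory_used_fraction):
    """Return a list of findings for one sample of Content SSA counters."""
    findings = []
    if batches_ready == 0:
        findings.append(
            "crawl-bound: the back-end processes content faster "
            "than the Content SSA can crawl"
        )
    if batches_submitted == 0:
        findings.append("nothing pending: no batches were sent to the back-end")
    if memory_used_fraction >= MEMORY_LIMIT:
        findings.append("memory: the Content SSA stops aggregating ready batches")
    return findings


# Hypothetical sample values.
for finding in diagnose_content_ssa(0, 12, 0.85):
    print(finding)
```

An empty list means that the Content SSA has a queue of batches to submit, which is the condition for maximum throughput.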

Content distributors and item processing performance counters

The following table shows the performance counters for the content distributors and item processing. For a total overview of the system, you must sum up these performance counters across all content distributors.

Performance counter	Description
Document processors	The number of item processing components registered with each content distributor. When there are multiple content distributors, the item processing components are evenly distributed across them.
Document processors busy	The number of item processing components that are currently working on a content batch. This should be close to the total number of item processing components under maximum load.
Average dispatch time	The time that the content distributor uses to send a batch to an item processing component. This should be less than 10 ms. Higher values indicate a congested network.
Average processing time	The time that an item processing component uses to process a batch. This time can vary depending on content types and batch sizes, but would typically be less than 60 seconds.
Available Mbytes	The total amount of available memory on the computer. Each item processing component might need up to 2 GB of memory. Memory starvation will affect processing throughput.
Processor time	Overall CPU usage on the computer. Item processing components are CPU intensive. Typically, the CPU utilization is high during crawls but item processing has reduced priority and will yield CPU resources to other components when it is required.
Bytes Total/sec	Overall network usage on the computer. High network load might become a bottleneck for the rate of data that can be processed by the FAST Search Server 2010 for SharePoint servers.
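Because the counters are reported per content distributor, a farm-wide view requires summing them. The following Python sketch shows one way to do this, assuming that the samples have already been collected and converted to milliseconds and seconds. The class and field names are hypothetical; the 10 ms and 60 second limits come from the table.

```python
from dataclasses import dataclass


@dataclass
class DistributorSample:
    document_processors: int
    document_processors_busy: int
    average_dispatch_time_ms: float   # assumed to be converted to milliseconds
    average_processing_time_s: float  # assumed to be converted to seconds


def summarize(samples):
    """Aggregate the counters of all content distributors in the farm."""
    total = sum(s.document_processors for s in samples)
    busy = sum(s.document_processors_busy for s in samples)
    return {
        "total": total,
        "busy": busy,
        "utilization": busy / total if total else 0.0,
        "slow_dispatch": any(s.average_dispatch_time_ms >= 10 for s in samples),
        "slow_processing": any(s.average_processing_time_s >= 60 for s in samples),
    }


# Hypothetical samples from two content distributors.
farm = [
    DistributorSample(8, 8, 3.0, 20.0),
    DistributorSample(8, 6, 14.0, 25.0),
]
summary = summarize(farm)
print(summary)
```

Under maximum load, a utilization close to 1.0 indicates that the item processing components are effectively utilized.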

Indexing dispatcher and indexers performance counters

The following table shows the most important performance counters for the indexing dispatcher and indexers.

Performance counter	Description
Current queue size	Indexers queue incoming work under high load. This is common, especially for partial updates. If the API queues never reach zero, not even intermittently, the indexer is the bottleneck. The crawler pauses when the queue reaches 256 MB in one of the indexers. This can occur if the storage subsystem is not sufficiently powerful. It also occurs during large redistributions of content between partitions, which temporarily block more content from being indexed.
FiXML fill rate	FiXML files (internal item storage in the indexers) are compacted regularly, by default between 3 A.M. and 5 A.M. every night. A low FiXML fill rate (below 70%) leads to inefficient operation.
Active documents	Partitions 0 and 1 should have fewer than 1 million items each, and preferably even fewer, to keep indexing latency low. In periods with high item throughput, the indexing latency is higher because partitions 0 and 1 grow larger, which is better for overall throughput. The query matching component will automatically rearrange the items into the higher numbered partitions during periods with lighter load.
% Idle Time	Low disk idle time suggests a saturated storage subsystem.
% Free space	Indexers need space for both the index generation currently used by the query matching component and new index generations that are under processing. On a fully loaded system, disk usage will vary between 40% and near 100% for the same number of items, depending on the status of the indexer.
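The thresholds in this table lend themselves to an automated check. The following Python sketch assumes that you have a series of queue size samples, the current FiXML fill rate in percent, and the active document count per partition. The function and parameter names are hypothetical; the limits are the ones stated in the table.

```python
def indexer_warnings(queue_size_samples, fixml_fill_rate_pct, active_documents):
    """Return warnings for one indexer.

    active_documents maps a partition number to its active document count.
    """
    warnings = []
    if queue_size_samples and min(queue_size_samples) > 0:
        warnings.append("queue never reaches zero: the indexer is the bottleneck")
    if fixml_fill_rate_pct < 70:
        warnings.append("FiXML fill rate is below 70%")
    for partition in (0, 1):
        if active_documents.get(partition, 0) >= 1_000_000:
            warnings.append(f"partition {partition} holds 1 million items or more")
    return warnings


# Hypothetical sample values.
for warning in indexer_warnings(
    queue_size_samples=[40, 12, 3],
    fixml_fill_rate_pct=65.0,
    active_documents={0: 1_200_000, 1: 300_000, 2: 5_000_000},
):
    print(warning)
```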

Query performance analysis

SharePoint Server 2010 administrative reports provide useful statistics for query performance from an end-to-end perspective. These reports are effective for tracing trends over time and identifying where you should investigate when performance is not optimal.

Query latency can occur in three different areas: Web server rendering, Query SSA processing and back-end processing. In general, latency due to web server page rendering and Query SSA latency occur on the servers that are running SharePoint Server 2010 (Query SSA). These latencies are also dependent on the performance of the SQL server(s) backing the SharePoint Server 2010 installation. The back-end latency is within the FAST Search Server 2010 for SharePoint servers:

Query processing
Query matching

Queries are sent from the Query SSA to the FAST Search Server 2010 for SharePoint query processing component ("query" in the deployment file). In the reports, query processing appears as "QRproxy", "QRserver" and "Fdispatch". The Query SSA and the query processing component are likely to represent a bottleneck. Any difference between the latencies that the two report is due to communication delays or query processing.

The query dispatcher (known as "Fdispatch" in the reports) distributes queries across index columns. There is also a query dispatcher located on each query matching server, distributing queries across index partitions. Either query dispatcher can become a bottleneck when the query results contain large amounts of data, because this can saturate the network. We recommend that you use a separate network switch for the communication between the query processing and query matching components.

The query matching component (known as "Fsearch" in the reports) is responsible for performing the actual matching of queries against the index, computing query relevancy and performing deep refinement. For each query, it reads the required information from the indices generated by the indexer. Information that is likely to be reused will be kept in a memory cache. Good query matching performance relies on a powerful CPU and low latency from small random disk reads (typically 16–64 KB).

Query processing performance counters

The following table shows the performance counters that can be helpful for correlating the back-end latency reported by the Query SSA and the query processing component.

Performance counter	Description
# Queries/sec	Current number of queries per second
# Requests/sec	Current number of requests per second. In addition to the query load, one internal request is received every second to check that QRserver is alive.
Average queries per minute	Average query load
Average latency last - ms	Average query latency
Peak queries per sec	Peak query load

Query matching performance counters

The following table shows the performance counters that can be helpful for analyzing a server that runs the query matching component.

Performance counter	Description
% Idle Time	Low disk idle time suggests a saturated storage subsystem.
Avg. Disk sec/Read	Each query will need a series of disk reads. An average read latency of less than 10 ms is desirable.
Avg. Disk Read Queue Length	On a saturated disk subsystem, read queues will build up. Queues will affect query latency. An average queue length smaller than 1 is desirable for any server that runs query components. This will typically be exceeded in single row deployments during indexing, adversely affecting search performance.
Processor time	CPU utilization is likely to become the bottleneck for high query throughput. When query matching has high processor time (near 100%), query throughput cannot increase further.
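The guidance in this table can be combined into a single check for a query matching server. The following Python sketch assumes averaged counter values as input, with Avg. Disk sec/Read expressed in seconds. The function name, parameter names, and the 95% cutoff used to represent "near 100%" are assumptions for illustration; the 10 ms and queue length limits come from the table.

```python
def query_matching_findings(
    avg_disk_sec_per_read,
    avg_disk_read_queue_length,
    processor_time_pct,
    cpu_saturation_pct=95.0,  # assumed cutoff for "near 100%"
):
    """Return findings for one query matching server."""
    findings = []
    if avg_disk_sec_per_read >= 0.010:  # 10 ms, expressed in seconds
        findings.append("disk read latency is 10 ms or more")
    if avg_disk_read_queue_length >= 1:
        findings.append("disk read queue is building up")
    if processor_time_pct >= cpu_saturation_pct:
        findings.append("CPU is saturated: query throughput cannot increase")
    return findings


# Hypothetical sample values.
for finding in query_matching_findings(0.025, 2.5, 99.0):
    print(finding)
```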