Recommendations: Content freshness (FAST Search Server 2010 for SharePoint)


Applies to: FAST Search Server 2010

Topic Last Modified: 2011-09-29

This article describes our recommendations for achieving content freshness in a Microsoft FAST Search Server 2010 for SharePoint environment. For a complete recommendations overview, refer to Performance and capacity recommendations (FAST Search Server 2010 for SharePoint).

Content freshness depends on how fast your deployment can crawl and process content. First, benchmark your deployment to determine how long it takes to crawl, process, and index content. Then evaluate whether this activity interferes with your target user response times.

To improve content freshness, you have the following options:

  • You can scale out the indexing connector. For more information about how to increase content freshness by redundancy, refer to Recommendations: Redundancy and availability (FAST Search Server 2010 for SharePoint).

  • If you are using the FAST Search Content SSA, you can increase the number of concurrent crawling requests.

    The Content SSA is based on a content pull approach, where the length of each crawl cycle determines the average time to discover changed content that must be indexed. You can use crawler impact rules to increase the number of concurrent requests that the crawler generates when it crawls. The more concurrent requests, the faster you crawl. The goal is for every change in the content repositories to result in a re-indexed document. As long as you size the connector so that it catches all updates, the load on item processing and indexing will not increase even if you reduce the time that is required to crawl all content sources. For more information about crawler impact rules, refer to Manage crawler impact rules (FAST Search Server 2010 for SharePoint).

  • You can deploy the item processing component to all servers in your deployment to minimize the overhead related to document format parsing and extraction of searchable content and metadata. Crawling performance is mainly determined by the item processing capacity. Therefore, it is important that you deploy the item processing components in a way that utilizes spare CPU capacity across all servers.

    For the item processing component, do not deploy more than 20 item processors (<document-processor processes="20" /> in deployment.xml).
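
The following fragment is a minimal sketch of how item processing might be spread across two servers in deployment.xml. The host names and process counts are hypothetical, and the fragment assumes that each document-processor element sits inside the host element for its server. All other elements that a complete deployment.xml requires are omitted. The counts are chosen so that they stay within the limit of 20 whether you read that limit per server or in total.

```xml
<!-- Illustrative fragment only: host names and process counts are hypothetical. -->
<!-- Other components and settings that a real deployment.xml needs are omitted. -->
<host name="fs4sp1.contoso.com">
  <!-- Server with more spare CPU capacity: more item processors -->
  <document-processor processes="12" />
</host>
<host name="fs4sp2.contoso.com">
  <!-- Server that shares CPU with other components: fewer item processors -->
  <document-processor processes="8" />
</host>
```

Base the number of processes for each server on the spare CPU capacity that you measured during benchmark testing, not on the example values shown here.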