Plan for redundancy and availability (FAST Search Server 2010 for SharePoint)

Published: May 12, 2010

This article describes the options for scaling out redundant server roles included in a Microsoft FAST Search Server 2010 for SharePoint farm. After reading this article, you will be able to identify the redundancy options that are appropriate for the environment.

In this article:

  • About redundancy and availability

  • Server farms and Search Service Applications

  • Redundancy and availability for the Query SSA

  • Redundancy and availability for the Content SSA

  • Redundant components within the FAST Search for SharePoint farm

  • Different levels of high availability

About redundancy and availability

The term redundancy is often misinterpreted to be synonymous with availability. While these concepts are related, they are not the same. Redundancy refers to the use of multiple servers in a load-balanced environment for any of several purposes, such as to improve farm performance, to scale out to accommodate additional users, and to improve availability.

Availability is a more specialized concept that refers to a multiple-server environment that is designed to accept connections and operate normally even when one or more of the servers in the farm are not operational. Availability implies redundancy, and additionally implies a failover mechanism and several other possible characteristics. A redundant system, however, might not be highly available.
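
As a rough illustration of how redundancy relates to availability, the following sketch estimates the availability of a group of identical servers. It rests on two simplifying assumptions that are not from this article: servers fail independently, and failover to a surviving server always works and is instantaneous. The function name and the 99% figure are hypothetical.

```python
def group_availability(single_server_availability: float, servers: int) -> float:
    """Probability that at least one of `servers` identical servers is up.

    Assumes independent failures and perfect failover. Both are
    simplifications; real farms rarely meet them exactly.
    """
    if not 0.0 <= single_server_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if servers < 1:
        raise ValueError("need at least one server")
    # The group is down only when every server is down at the same time.
    return 1.0 - (1.0 - single_server_availability) ** servers


if __name__ == "__main__":
    # Hypothetical per-server availability of 99%.
    for n in (1, 2, 3):
        print(n, round(group_availability(0.99, n), 6))
```

Without a failover mechanism the formula does not apply: the extra servers add capacity, but a failed server can still cause failed connections. That is the sense in which a redundant system might not be highly available.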

FAST Search Server 2010 for SharePoint supports scalable server farms for capacity, performance and availability. Typically, capacity is the first consideration in determining the number of server computers to start with. After factoring in performance, availability also plays a role in determining both the number of servers and the size or capacity of the server computers in a server farm.

After reading this article, you will be able to decide if you need to build expandable capacity into the server deployment topology by deploying redundant servers, or if it makes sense for the organization to plan for a limited server deployment without redundant servers.

Server farms and Search Service Applications

FAST Search Server 2010 for SharePoint provides enterprise search infrastructure for the Microsoft SharePoint Server farm infrastructure. The search solution consists of four main parts, which may be located in different server farms:

  • The FAST Search Server 2010 for SharePoint farm   This is a dedicated server farm infrastructure that provides the back-end indexing and search capabilities for the enterprise search solution. To provide an end-to-end search solution, a FAST Search Server 2010 for SharePoint farm must be associated with a parent SharePoint Server 2010 farm through the FAST Search Query SSA and FAST Search Content SSA.

  • FAST Query Search Service Application (SSA)   This is a search service application (SSA) in the parent SharePoint Server 2010 farm that provides the query-side integration between the FAST Search Server 2010 for SharePoint farm and the parent SharePoint Server farm.

  • FAST Content Search Service Application (SSA)   This is a search service application (SSA) in a SharePoint Server farm that enables retrieving content for indexing from content repositories. This SSA represents the default indexing connector for your FAST Search Server 2010 for SharePoint deployment.

  • SQL Server database   A FAST Search Server 2010 for SharePoint farm must have access to a Microsoft SQL Server host that is used for storing configuration information. Normally you use an existing SQL Server host within the associated SharePoint Server 2010 farm.

    Note

    The FAST Search Server 2010 for SharePoint farm does not use the SQL Server database for indexing metadata properties. This is different from the default SharePoint Server 2010 search. The Query SSA includes a separate index used for people search. This index stores the metadata properties related to user profiles in an SQL Server database.

    For more information about redundancy and availability for the SQL Server within the SharePoint Server farm, see Storage and SQL Server capacity planning and configuration (SharePoint Server 2010). Note that "search" in that topic refers to SharePoint Server search, which also uses the database as a property store for the index.

For more information about the farm topology, see Plan search topology (FAST Search Server 2010 for SharePoint).

Redundancy and availability for the Query SSA

The FAST Search Query SSA provides the query-side integration between the FAST Search Server 2010 for SharePoint farm and the parent SharePoint Server 2010 farm. The Query SSA also provides the People Search functionality for queries using a separate index.

You must deploy the FAST Query SSA on the parent SharePoint Server 2010 farm.

You can scale out the Query SSA for query redundancy and availability by adding additional query components within the SSA.

Important:
Do not deploy more than one Query SSA associated with your FAST Search Server 2010 for SharePoint farm.

For more information about how to add a query component, see Multiple server deployment of the Query SSA (FAST Search Server 2010 for SharePoint).

The Query SSA also includes a crawl component that retrieves and indexes user profiles for people search. In most cases you will not need to scale out this component for people search performance, but you may add an additional crawl component within the SSA for people search redundancy.

For general information about redundancy and availability for the application servers within a SharePoint Server farm, see Plan for availability (SharePoint Server 2010).

Important:
The Query SSA uses SQL Server for the crawl database and for the metadata property store for user profiles. Both are used for people search.

Redundancy and availability for the front-end Web servers

You should plan redundancy and availability for the front-end Web servers according to the guidelines in Plan for availability (SharePoint Server 2010).

You can deploy front-end Web servers in the parent SharePoint Server 2010 farm, or in child farms. In the latter case, you connect the front-end Web servers through an SSA Proxy in the parent farm.

Redundancy and availability for the Content SSA

The FAST Search Content SSA enables retrieving content for indexing from content repositories. This SSA represents the default indexing connector for your FAST Search Server 2010 for SharePoint deployment.

You will normally deploy the FAST Content SSA on the parent SharePoint Server 2010 farm.

You can scale out the Content SSA for feeding redundancy and availability by adding additional crawl components within the SSA.

Important:
Do not deploy more than one Content SSA associated with your FAST Search Server 2010 for SharePoint farm.

For more information about how to add a crawl component, see Multiple server deployment of the Content SSA (FAST Search Server 2010 for SharePoint).

For general information about redundancy and availability for the application servers within a SharePoint Server 2010 farm, see Plan for availability (SharePoint Server 2010).

Redundant components within the FAST Search for SharePoint farm

The following components within a FAST Search Server 2010 for SharePoint farm support redundancy:

  • Content Distributor   This is a stateless component that can be duplicated to achieve performance scaling and high availability. Each content distributor will handle feeding flow control for a subset of item batches.

    If one content distributor server fails, the outstanding item batches associated with this content distributor will fail, and the flow control protocol ensures that the item batches are re-submitted by the associated indexing connector.

  • Item Processing   You can scale out item processing by distributing a variable number of item processing instances (each running one processor thread) to one or more servers in the farm. Each item processing instance handles a given set of item batches.

    If one server running item processing fails, the outstanding item batches associated with the item processing instances running on this server will fail, and the flow control protocol ensures that the item batches are re-submitted by the associated indexing connector.

  • Link analysis (Web analyzer)   You can scale out the Web analyzer to handle link analysis for many items with many cross links. The Web analyzer operates in a batch processing mode, and is able to distribute link analysis jobs to multiple servers running lookup database and link processing components.

    If one server running link analysis fails with unrecoverable disk errors, you will need to re-establish the Web analyzer link database from the latest backup. If no backup exists, the ranking based on link analysis will be incomplete until a complete re-crawl has occurred.

    You can control redundancy for the lookup database during deployment, by configuring redundant-lookup for the Web analyzer in the deployment configuration file. If you lose a lookup database component without redundancy enabled, this will block the feeding of new items.

  • Indexing Dispatcher   This is a stateless component that can be duplicated to achieve feeding performance scaling and high availability. Each indexing dispatcher will handle feeding flow control for a subset of item batches.

    If one indexing dispatcher server fails, the remaining item batches associated with this indexing dispatcher will fail, and the flow control protocol ensures that the item batches are re-submitted by the associated indexing connector.

  • Indexing   You can scale out the indexing component by defining multiple index columns.

    Within each index column you can set up a backup indexer for high availability.

    If one indexer server fails with unrecoverable disk errors, you can either recover the server farm from the latest backup, or manually enable a backup indexer to be the new primary indexer.

    For more information, see Search Cluster.

  • Query Matching   You can scale out the query matching component in two dimensions. Each index column will have at least one associated query matching server. By defining multiple search rows you can scale out the farm for query performance and high availability.

    If one query matching server fails, the remaining queries being dispatched to this search row will fail, and subsequent queries will be handled by another search row in this index column.

    For more information, see Search Cluster.

  • Query Processing   This is a stateless component that can be duplicated to achieve query performance scaling and high availability. Each query processing server will handle a subset of the queries.

    If one query processing server fails, the remaining queries associated with this query processing server will fail, and subsequent queries will be handled by another query processing server.

  • FAST Search specific indexing connectors   The FAST Search Web Crawler is an alternative indexing connector that is recommended for certain large-scale Web crawl use cases. You can scale out this component by deploying multiple node schedulers that will handle crawl scheduling of different parts of the overall crawl.

    The FAST Search Lotus Notes and FAST Search database indexing connectors are implemented as stand-alone components, each associated with one or more content repositories. This means that you can scale out the system by deploying multiple instances of the indexing connectors.
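
The failure behavior described above for the content distributor, item processing, and indexing dispatcher follows one pattern: item batches outstanding on a failed component fail, and the indexing connector resubmits them. The following Python sketch models only that pattern. It is not the FAST Search flow control protocol; the class and function names are hypothetical, and rotating to the next component on each retry is an assumed policy.

```python
from dataclasses import dataclass, field


@dataclass
class Distributor:
    """Hypothetical stand-in for a stateless feeding component."""
    name: str
    alive: bool = True
    accepted: list = field(default_factory=list)

    def submit(self, batch_id: int) -> bool:
        # A failed component cannot acknowledge the batch.
        if not self.alive:
            return False
        self.accepted.append(batch_id)
        return True


def feed(batches, distributors, max_attempts=3):
    """Submit each batch, resubmitting to another component on failure.

    Returns the list of batch ids that could not be delivered.
    """
    undelivered = []
    for i, batch_id in enumerate(batches):
        for attempt in range(max_attempts):
            # Assumed policy: rotate through the components on each retry.
            target = distributors[(i + attempt) % len(distributors)]
            if target.submit(batch_id):
                break
        else:
            # Every attempt failed; the batch stays with the connector.
            undelivered.append(batch_id)
    return undelivered


if __name__ == "__main__":
    cd1, cd2 = Distributor("cd1"), Distributor("cd2")
    cd1.alive = False  # simulate one failed content distributor
    print("undelivered:", feed(range(6), [cd1, cd2]))
    print("cd2 accepted:", cd2.accepted)
```

In this model, a deployment with a single component that has failed leaves every batch undelivered, which is why the stateless components are duplicated for high availability.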

For more information about the components, see Plan FAST Search Server farm topology (FAST Search Server 2010 for SharePoint).

Different levels of high availability

Choosing a high availability strategy depends on several factors related to requirements and budget:

  • High availability for all parts of the system will in most cases require more servers than the performance requirements alone would call for.

  • Is it acceptable that the performance is decreased when an error occurs?

  • Do you want high availability only for important parts of the system, such as the ability to serve queries even if components in the feeding chain fail? This means a full content re-feed is still acceptable in these cases.

  • What is the maximum acceptable time to rebuild the index to the same status after a non-recoverable error in the feeding chain?

Query high availability

This is usually most important for a search solution. You can provide a high level of query availability by deploying two or more search rows and query processing servers. Depending on query performance requirements the search rows may be co-located with the primary or backup indexers.
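
To make the row and column layout concrete, the following sketch models the search cluster as a grid of query matching servers and picks one live server per index column for each query. The selection rule (first live row in each column) and the choice to return nothing when a column has no live server are simplifying assumptions for illustration, not the product's dispatch logic. All names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# cluster[(row, column)] is True when that query matching server is up.
Cluster = Dict[Tuple[int, int], bool]


def build_cluster(rows: int, columns: int) -> Cluster:
    """Create a grid in which every query matching server is up."""
    return {(r, c): True for r in range(rows) for c in range(columns)}


def route_query(cluster: Cluster, rows: int, columns: int) -> Optional[List[Tuple[int, int]]]:
    """Pick one live server per index column.

    Returns None if any column has no live server, because part of the
    index would then be unreachable (assumed behavior).
    """
    plan = []
    for c in range(columns):
        live_rows = [r for r in range(rows) if cluster[(r, c)]]
        if not live_rows:
            return None
        plan.append((live_rows[0], c))
    return plan


if __name__ == "__main__":
    cluster = build_cluster(rows=2, columns=2)
    cluster[(0, 1)] = False  # one query matching server fails
    print(route_query(cluster, rows=2, columns=2))
```

With two search rows, the failed server in column 1 is replaced by the server in the other row of the same column. With a single search row, the same failure would leave that column without a query matching server.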

A crawling or indexing failure can be recovered while search is still available, with some exceptions:

  • If a recovery to a previous backup is required, search will be down for the time it takes to recover the system from the backup. Depending on the size of the backup, this may take a significant amount of time.

    If you exclude the binary index from the backup, the backup/recovery time is substantially reduced. However, the time to recover the index becomes significantly longer, because the index must be rebuilt from the pre-index item store (FiXML files). In this case, FAST Search administrators can keep the latest available index (from before the error situation) on non-affected search rows available for queries until the index is rebuilt from the backup. This provides improved availability for search at the cost of a longer freshness delay before new content is made searchable after an error situation.

  • If you do not have end-to-end fault-tolerance in the feeding and indexing chain, and do not take data backups, you will have to re-crawl and re-index all content from the source repositories. During the re-crawl only content indexed after the re-crawl started will be available for search.

Indexer high availability

You can choose one of the following options to ensure high availability for the FAST Search Server 2010 for SharePoint content index:

  1. Deploy a backup indexer on a separate set of servers. This will achieve the best index availability. If a non-recoverable error situation occurs on the primary indexer, you must re-configure the backup indexer to become the new primary indexer. Search will be available during the period between the error situation and the re-configuration, assuming you have deployed more than one search row. Content will not be fed to the index during this period.

    The cost of this option is that you must deploy two indexer servers per index column, the overall indexing performance is substantially reduced due to the backup protocol overhead, and the backup indexer will have a performance impact on the query matching running on the same server.

  2. Perform a regular full data backup that includes the binary index. This ensures fast recovery, limited by the time that is required to copy the backup data. In large installations this may result in very large backups, with correspondingly long content feeding and indexing downtime during the regular backup.

    The disadvantage of a backup/recovery based solution for the index is that the data indexed after the last backup will no longer be searchable. The recovery procedure ensures that the indexing connectors will only re-feed content that was updated after the last backup.

    The FAST Search Server 2010 for SharePoint farm and the associated Content SSA in the parent SharePoint farm must be suspended during the time of the backup. Search will be available, but no new content can be indexed during this period.

  3. Perform a regular full data backup that excludes the binary index. This reduces the size of the backup and takes less time. However, the recovery time will be significantly longer, and it includes re-building the index. Because search is available during this period, this is a compromise between cost aspects and freshness requirements.
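
Option 1 can be pictured as a small state model: feeding stops when the primary indexer is lost and resumes only after the backup indexer is manually re-configured as the new primary, while queries continue if more than one search row is deployed. The following sketch encodes just those statements. The class, its attributes, and its methods are hypothetical and do not correspond to any FAST Search administration command. Two further assumptions are labeled in the comments.

```python
from dataclasses import dataclass


@dataclass
class IndexColumn:
    """Hypothetical model of one index column with a backup indexer."""
    primary_up: bool = True
    backup_available: bool = True
    search_rows: int = 2

    def fail_primary(self) -> None:
        """Simulate a non-recoverable error on the primary indexer."""
        self.primary_up = False

    def promote_backup(self) -> None:
        """Manual step: make the backup indexer the new primary."""
        if self.primary_up:
            raise RuntimeError("primary indexer is still running")
        if not self.backup_available:
            raise RuntimeError("no backup indexer to promote")
        self.primary_up = True
        # Assumption: after promotion no backup remains until one is redeployed.
        self.backup_available = False

    @property
    def can_feed(self) -> bool:
        # Content is not fed to the index while the primary is down.
        return self.primary_up

    @property
    def can_search(self) -> bool:
        # Assumption: with one search row, losing the primary also loses search.
        return self.primary_up or self.search_rows > 1


if __name__ == "__main__":
    column = IndexColumn()
    column.fail_primary()
    print("feed:", column.can_feed, "search:", column.can_search)
    column.promote_backup()
    print("feed:", column.can_feed, "search:", column.can_search)
```

The model makes the trade-off visible: the period between the failure and the promotion affects content freshness, not query availability, provided that more than one search row exists.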

See also

Concepts

Plan FAST Search Server farm topology (FAST Search Server 2010 for SharePoint)
Plan search service applications (FAST Search Server 2010 for SharePoint)
Plan for performance and capacity (FAST Search Server 2010 for SharePoint)
Plan the system backup and recovery strategy (FAST Search Server 2010 for SharePoint)
Manage search topology (FAST Search Server 2010 for SharePoint)

Change history

Date Description Reason

May 12, 2010

Initial publication