Test results: Extra-large scenario (FAST Search Server 2010 for SharePoint)

07/22/2014

Applies to: FAST Search Server 2010

With the extra-large Microsoft FAST Search Server 2010 for SharePoint test scenario, we targeted an extra-large test corpus. For this scenario, redundancy is crucial, because with this many computers, hardware failures are likely to occur. The content volume for the scenario was up to 500 million items. The content volume means that crawls will have to run during business hours.

We set up the parent Microsoft SharePoint Server 2010 farm with two front-end web servers, four application servers, and two database servers, and arranged them as follows:

We used the SharePoint Server 2010 crawler, indexing connector framework, and the FAST Content Search Service Application (Content SSA) to crawl content. We distributed four crawl components for the Content SSA across the four application servers. This accommodates I/O limitations in the test setup (1 gigabit per second network), where four nodes provide a theoretical maximum crawl rate of 4 gigabits per second.
One of the application servers also hosted Central Administration for the farm.
One database server hosted the crawl databases.
One database server hosted the FAST Search Server 2010 for SharePoint administration databases and the other SharePoint Server 2010 databases.

We did not use any separate data storage, because the application servers and front-end web servers only needed space for the operating system, application binaries, and log files.

In this article:

Test deployment
Test characteristics
Test results

Test deployment

Within the extra-large scenario, we tested one FAST Search Server 2010 for SharePoint deployment, with service pack 1 (SP1):

Name	Description
XL1	Two rows, twelve columns setup, with one additional administration node (25 servers in total). The extended capacity mode can support up to 40 million items per server.
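
As a quick sanity check on the XL1 description, the server count and the nominal item capacity of one row can be worked out from the figures in the table. The sketch below assumes that the columns partition the index and that each row holds a full copy of it; the function names are illustrative only.

```python
# Figures quoted in the XL1 description above.
ROWS = 2
COLUMNS = 12
ADMIN_NODES = 1
MAX_ITEMS_PER_SERVER = 40_000_000  # extended capacity mode, per server


def total_servers(rows: int, columns: int, admin_nodes: int) -> int:
    """Search servers (rows x columns) plus the administration node(s)."""
    return rows * columns + admin_nodes


def nominal_row_capacity(columns: int, max_items_per_server: int) -> int:
    """Items one row can hold, assuming each column stores one index partition."""
    return columns * max_items_per_server


servers = total_servers(ROWS, COLUMNS, ADMIN_NODES)
capacity = nominal_row_capacity(COLUMNS, MAX_ITEMS_PER_SERVER)

print(f"Servers in total: {servers}")
print(f"Nominal items per row: {capacity / 1e6:.0f} million")
```

Under these assumptions, twelve columns give a nominal capacity per row in the same range as the content volume targeted by this scenario.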

Test characteristics

This section provides detailed information about the hardware, software, topology and configuration of the test environment.

Hardware/Software

We tested the specified deployment by using the following hardware and software, although the administration node does not need this much disk space.

FAST Search Server 2010 for SharePoint servers

Windows Server 2008 R2 x64 Enterprise Edition
2x Intel L5640 CPUs with Hyper-threading and Turbo Boost switched on
48 GB memory
1 gigabit per second network card
Storage subsystem:
- OS: 2x 146GB 10k RPM 2.5” SAS disks in RAID1
- Application: 12x 1 terabyte 7200 RPM 6 gigabits per second 3.5” SAS disks in RAID10. Total formatted capacity of 5.5 terabytes.
- Disk controller: Dell PERC H700 Integrated, firmware 12.10.0-0025
- Disks: Seagate Constellation ES ST31000424SS, firmware KS68

SharePoint Server 2010 servers

Windows Server 2008 R2 x64 Enterprise edition
2x Intel L5420 CPUs
16 GB memory
1 gigabit per second network card
Storage subsystem for OS/Programs: 2x 146GB 10k RPM SAS disks in RAID1

SQL servers

General SQL server: Same specification as for SharePoint Server 2010 servers, but with additional disk RAID for SQL data with 6x 146GB 10k RPM SAS disks in RAID5.
Crawl database SQL server:
- Windows Server 2008 R2 x64 Enterprise edition
- 2x Intel X5670 CPUs with Hyper-threading and Turbo Boost switched on
- 48 GB memory
- 1 gigabit per second network card
- Storage subsystem:
  - OS: 2x 146GB 10k RPM 2.5” SAS disks in RAID1
  - Application: 12x 600GB 15k RPM 3.5” SAS disks in RAID50. Total formatted capacity of 5.6 terabytes.

Topology

This section describes the topology of all the test deployments.

XL1

XL1 is a setup with two rows and twelve columns, with an additional administration node. The second row adds query throughput capacity and query redundancy, and gives better separation of query and feeding load. The deployment includes three query processing components: one on the administration node (fsadmin.contoso.com) and two in the second row (fsr1c00.contoso.com and fsr1c01.contoso.com). The Query SSA is configured to use only the query processing components on the second row (the search row) during normal operation. The query processing component on the administration node can be reconfigured for use as a fallback. It can then serve queries in conjunction with the first search row if the second search row is taken down for maintenance.

The following figure shows the XL1 deployment.

XL topology illustration

We used the following deployment.xml file to set up XL1.

<?xml version="1.0" encoding="utf-8" ?>
<deployment version="14"
   modifiedBy="contoso\user" modifiedTime="2011-01-01T12:00:00+00:00" comment="XL1"
   xmlns="https://www.microsoft.com/enterprisesearch"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation="https://www.microsoft.com/enterprisesearch deployment.xsd">

  <instanceid>XL1</instanceid>
  <connector-databaseconnectionstring><![CDATA[jdbc:sqlserver://sqlbox.contoso.com\sql:1433;DatabaseName=XL1.jdbc]]></connector-databaseconnectionstring>

  <!-- Admin -->
  <host name="fsadmin.contoso.com">
    <admin />
    <query />
    <content-distributor />
    <indexing-dispatcher />
    <document-processor processes="8" />
  </host>

  <!-- Row 0 -->
  <host name="fsr0c00.contoso.com">
    <content-distributor />
    <searchengine row="0" column="0" />
    <document-processor processes="12" />
  </host>

  <host name="fsr0c01.contoso.com">
    <content-distributor />
    <searchengine row="0" column="1" />
    <document-processor processes="12" />
  </host>

  <host name="fsr0c02.contoso.com">
    <content-distributor />
    <searchengine row="0" column="2" />
    <document-processor processes="12" />
  </host>

  <host name="fsr0c03.contoso.com">
    <searchengine row="0" column="3" />
    <document-processor processes="12" />
    <webanalyzer server="true" link-processing="true" lookup-db="true" max-targets="2" redundant-lookup="true" />
  </host>

  <host name="fsr0c04.contoso.com">
    <searchengine row="0" column="4" />
    <document-processor processes="12" />
    <webanalyzer server="false" link-processing="true" lookup-db="true" max-targets="2" />
  </host>

  <host name="fsr0c05.contoso.com">
    <searchengine row="0" column="5" />
    <document-processor processes="12" />
    <webanalyzer server="false" link-processing="true" lookup-db="true" max-targets="2" />
  </host>

  <host name="fsr0c06.contoso.com">
    <searchengine row="0" column="6" />
    <document-processor processes="12" />
    <webanalyzer server="false" link-processing="true" lookup-db="true" max-targets="2" />
  </host>

  <host name="fsr0c07.contoso.com">
    <searchengine row="0" column="7" />
    <document-processor processes="12" />
    <webanalyzer server="false" link-processing="true" lookup-db="true" max-targets="2" />
  </host>

  <host name="fsr0c08.contoso.com">
    <searchengine row="0" column="8" />
    <document-processor processes="12" />
    <webanalyzer server="false" link-processing="true" lookup-db="true" max-targets="2" />
  </host>

  <host name="fsr0c09.contoso.com">
    <indexing-dispatcher />
    <searchengine row="0" column="9" />
    <document-processor processes="12" />
    <webanalyzer server="false" link-processing="true" lookup-db="true" max-targets="2" />
  </host>

  <host name="fsr0c10.contoso.com">
    <indexing-dispatcher />
    <searchengine row="0" column="10" />
    <document-processor processes="12" />
    <webanalyzer server="false" link-processing="true" lookup-db="true" max-targets="2" />
  </host>

  <host name="fsr0c11.contoso.com">
    <indexing-dispatcher />
    <searchengine row="0" column="11" />
    <document-processor processes="12" />
    <webanalyzer server="false" link-processing="true" lookup-db="true" max-targets="2" />
  </host>


  <!-- Row 1 -->
  <host name="fsr1c00.contoso.com">
    <query />
    <searchengine row="1" column="0" />
    <document-processor processes="8" />
  </host>

  <host name="fsr1c01.contoso.com">
    <query />
    <searchengine row="1" column="1" />
    <document-processor processes="8" />
  </host>

  <host name="fsr1c02.contoso.com">
    <searchengine row="1" column="2" />
    <document-processor processes="8" />
  </host>

  <host name="fsr1c03.contoso.com">
    <searchengine row="1" column="3" />
    <document-processor processes="8" />
  </host>

  <host name="fsr1c04.contoso.com">
    <searchengine row="1" column="4" />
    <document-processor processes="8" />
  </host>

  <host name="fsr1c05.contoso.com">
    <searchengine row="1" column="5" />
    <document-processor processes="8" />
  </host>

  <host name="fsr1c06.contoso.com">
    <searchengine row="1" column="6" />
    <document-processor processes="8" />
  </host>

  <host name="fsr1c07.contoso.com">
    <searchengine row="1" column="7" />
    <document-processor processes="8" />
  </host>

  <host name="fsr1c08.contoso.com">
    <searchengine row="1" column="8" />
    <document-processor processes="8" />
  </host>

  <host name="fsr1c09.contoso.com">
    <searchengine row="1" column="9" />
    <document-processor processes="8" />
  </host>

  <host name="fsr1c10.contoso.com">
    <searchengine row="1" column="10" />
    <document-processor processes="8" />
  </host>

  <host name="fsr1c11.contoso.com">
    <searchengine row="1" column="11" />
    <document-processor processes="8" />
  </host>
  
  <searchcluster>
      <row id="0" index="primary" search="true" />
      <row id="1" index="secondary" search="true" />
  </searchcluster>

</deployment>
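
With this many hosts, a deployment file is easy to get slightly wrong, for example by skipping a column or misspelling a host name. The following is a minimal sketch of a consistency check, not part of the FAST Search Server tooling. It assumes the element names and namespace URI shown in the file above; `SAMPLE`, `summarize`, and `missing_cells` are hypothetical names, and the sample is a cut-down two-by-two topology rather than the full XL1 file.

```python
import xml.etree.ElementTree as ET

# Namespace URI as it appears in the deployment file above.
NS = {"d": "https://www.microsoft.com/enterprisesearch"}

# Reduced, hypothetical example: two rows, two columns, one admin node.
SAMPLE = """<deployment version="14" xmlns="https://www.microsoft.com/enterprisesearch">
  <instanceid>demo</instanceid>
  <host name="fsadmin.contoso.com"><admin /><query /></host>
  <host name="fsr0c00.contoso.com"><searchengine row="0" column="0" /></host>
  <host name="fsr0c01.contoso.com"><searchengine row="0" column="1" /></host>
  <host name="fsr1c00.contoso.com"><query /><searchengine row="1" column="0" /></host>
  <host name="fsr1c01.contoso.com"><query /><searchengine row="1" column="1" /></host>
  <searchcluster>
    <row id="0" index="primary" search="true" />
    <row id="1" index="secondary" search="true" />
  </searchcluster>
</deployment>
"""


def summarize(xml_text):
    """Return ({(row, column): host name}, [hosts with a query component])."""
    root = ET.fromstring(xml_text)
    grid = {}
    query_hosts = []
    for host in root.findall("d:host", NS):
        name = host.get("name")
        if host.find("d:query", NS) is not None:
            query_hosts.append(name)
        for engine in host.findall("d:searchengine", NS):
            cell = (int(engine.get("row")), int(engine.get("column")))
            if cell in grid:
                raise ValueError(f"search engine cell {cell} is defined twice")
            grid[cell] = name
    return grid, query_hosts


def missing_cells(grid):
    """List (row, column) pairs that no host covers."""
    rows = {row for row, _ in grid}
    columns = {column for _, column in grid}
    return sorted((r, c) for r in rows for c in columns if (r, c) not in grid)


grid, query_hosts = summarize(SAMPLE)
print(f"search engine cells: {len(grid)}")
print(f"query hosts: {query_hosts}")
print(f"missing cells: {missing_cells(grid)}")
```

Run against the full XL1 file, a check like this would be expected to report 24 search engine cells, three hosts with a query component, and no missing cells.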

Dataset

This section describes the test farm dataset: the database content and sizes, search indexes, and external data sources. The Content SSA was configured to use 12 crawl databases.

The following table shows the overall metrics.

Object	Value
Search index size	503 million items
Size of crawl database	2.8 terabytes
Size of crawl database log file	1.6 terabytes
Size of property database	0.1 terabytes
Size of property database log file	2 gigabytes (GB)
Size of SSA administration database	35 gigabytes (GB)

The next table shows the content source types we used to build the index. The numbers reflect the total number of items per source and include replicated copies. The difference between the total number of items and the index size can have two reasons:

Items may have been disabled from indexing in the content source, or
The document format type could not be indexed.

For SharePoint sources, the size of the respective content database in SQL represents the raw data size.

Content source	Items	Raw data size	Average size per item
File share 1 (12 copies)	7.2 million	924 gigabytes (GB)	128 kilobytes (KB)
File share 2 (12 copies)	176 million	40 terabytes	229 kilobytes (KB)
SharePoint 1 (12 copies)	54 million	24 terabytes	443 kilobytes (KB)
SharePoint 2 (12 copies)	54 million	24 terabytes	443 kilobytes (KB)
SharePoint 3 (12 copies)	54 million	24 terabytes	443 kilobytes (KB)
HTML 1 (12 copies)	13 million	105 gigabytes (GB)	8.1 kilobytes (KB)
HTML 2 (12 copies)	38 million	1.7 terabytes	43 kilobytes (KB)
HTML 3 (12 copies)	122 million	5.9 terabytes	49 kilobytes (KB)
Total	518 million	121 terabytes	233 kilobytes (KB)
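
The totals in the table can be reproduced from the per-source rows. The sketch below is only a cross-check of the published figures. It assumes decimal units (1 terabyte = 10^9 kilobytes), and the variable names are illustrative.

```python
# (content source, items in millions, raw data size in terabytes),
# copied from the table above.
SOURCES = [
    ("File share 1", 7.2, 0.924),
    ("File share 2", 176, 40),
    ("SharePoint 1", 54, 24),
    ("SharePoint 2", 54, 24),
    ("SharePoint 3", 54, 24),
    ("HTML 1", 13, 0.105),
    ("HTML 2", 38, 1.7),
    ("HTML 3", 122, 5.9),
]

total_items_millions = sum(items for _, items, _ in SOURCES)
total_raw_tb = sum(size for _, _, size in SOURCES)

# Average item size in kilobytes: terabytes -> kilobytes, millions -> items.
average_kb = (total_raw_tb * 1e9) / (total_items_millions * 1e6)

print(f"Total items: {total_items_millions:.1f} million")
print(f"Total raw data size: {total_raw_tb:.1f} terabytes")
print(f"Average size per item: {average_kb:.0f} KB")
```

The computed values agree with the Total row after rounding.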

To reach sufficient content volume in the testing of the extra-large scenario, we added replicas of the data sources. Each copy of each document appeared as a unique item in the index, but the duplicate trimming feature treated the copies as duplicates. From a query matching perspective, the load was similar to having all unique documents indexed, but any results from these sources triggered duplicate detection and collapsing in the search results.

Note

The extra-large test scenario did not include people search data.

Test results

This section describes how the various deployments performed under load: crawling and indexing performance, query performance, and disk usage.

Crawling and indexing performance

The extra-large scenario deployment was limited by the bandwidth of the content sources. Crawl rates averaged around 250 items per second. This would have corresponded to 25 days of constant crawling. In testing, crawls were split up into blocks of around 43 million items, allowing for intermediate query testing at less than full capacity. Crawl pauses were also needed for maintenance periods, for example for Windows Update and replacement of failed hardware.
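
The crawl duration follows directly from the item count and the crawl rate. The following back-of-the-envelope sketch uses the dataset total of 518 million items and the average item size of 233 KB from the Dataset section. It assumes a constant rate and decimal units, and the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400

TOTAL_ITEMS = 518_000_000   # total items across all content sources
CRAWL_RATE = 250            # items per second (approximate average)
BLOCK_SIZE = 43_000_000     # items per crawl block
AVERAGE_ITEM_KB = 233       # average raw size per item


def crawl_days(items: int, items_per_second: float) -> float:
    """Days of uninterrupted crawling at a constant rate."""
    return items / items_per_second / SECONDS_PER_DAY


def data_rate_mbit(items_per_second: float, average_item_kb: float) -> float:
    """Raw content rate in megabits per second (1 KB = 1000 bytes)."""
    return items_per_second * average_item_kb * 8 / 1000


days = crawl_days(TOTAL_ITEMS, CRAWL_RATE)
blocks = TOTAL_ITEMS / BLOCK_SIZE
rate = data_rate_mbit(CRAWL_RATE, AVERAGE_ITEM_KB)

print(f"Constant crawling: {days:.1f} days")
print(f"Crawl blocks: {blocks:.1f}")
print(f"Raw content rate: {rate:.0f} Mbit/s")
```

At exactly 250 items per second this gives roughly 24 days, in line with the approximate figure quoted above, and about 12 blocks, which matches the 12 copies of each content source. The resulting raw content rate is well below the theoretical 4 gigabits per second of the four crawl components, consistent with the content sources being the limiting factor.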

Query performance

The graph below shows the query performance under various conditions. At full capacity of 500 million items, and with no ongoing indexing, the farm was able to sustain about 15 queries per second (QPS) with less than 1 second average latency before being limited by the CPU resources on the search row. During incremental crawls and full crawls, this number dropped to 12.5 and 10 QPS respectively. Additional testing was done before reaching full capacity. The red lines in the graph were measured when the system was half full (250 million items). Without any feed, 27 QPS was possible in that state. Under feed, there is less difference between the 250 million and 500 million item states. This is expected, because incoming feeds require certain amounts of network, CPU, and disk capacity regardless of how many items are already in the index.

XL Query Performance

The next graph shows the query latency during a computer failure, taking one of the columns in the search row offline. Query load is around 10 QPS, with latency plotted as a function of time. The primary row will automatically provide failover query matching for the failed column, but with a performance impact since it also needs to serve ongoing indexing. During this degraded period, query latency is on average around 1.4 seconds, recovering to around 1 second halfway through the plot when the failed computer is brought back online.

XL Query Latency

Disk usage

The following table shows the combined disk usage on all nodes in the deployment.

Nodes	FiXML data size	Index data size	Other data* size	Total disk usage
Administration node	0	0	33 gigabytes (GB)	33 gigabytes (GB)
Row 0 (across 12 computers)	6.2 terabytes	15.8 terabytes	40 gigabytes (GB)	22.0 terabytes
Row 1 (across 12 computers)	6.4 terabytes	15.8 terabytes	40 gigabytes (GB)	22.2 terabytes
Total	12.6 terabytes	31.6 terabytes	113 gigabytes (GB)	44.2 terabytes

* Logs and intermediate data for handling fault recovery on computer failures
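
To put the disk usage figures in context, they can be related to the per-server application storage described under Hardware/Software (5.5 terabytes formatted capacity). The sketch below assumes decimal units and an even spread of data across the 12 computers in a row; the names are illustrative.

```python
COMPUTERS_PER_ROW = 12
FORMATTED_CAPACITY_TB = 5.5  # application volume per FAST Search server

# (FiXML TB, index TB, other data GB), copied from the table above.
USAGE = {
    "Administration node": (0.0, 0.0, 33),
    "Row 0": (6.2, 15.8, 40),
    "Row 1": (6.4, 15.8, 40),
}


def group_total_tb(fixml_tb: float, index_tb: float, other_gb: float) -> float:
    """Total disk usage of one group of nodes, in terabytes."""
    return fixml_tb + index_tb + other_gb / 1000


totals = {name: group_total_tb(*sizes) for name, sizes in USAGE.items()}
grand_total_tb = sum(totals.values())

# Average usage per computer in row 0, and share of the formatted capacity.
per_computer_tb = totals["Row 0"] / COMPUTERS_PER_ROW
utilization = per_computer_tb / FORMATTED_CAPACITY_TB

print(f"Grand total: {grand_total_tb:.1f} terabytes")
print(f"Row 0 per computer: {per_computer_tb:.2f} terabytes")
print(f"Share of formatted capacity: {utilization:.0%}")
```

The grand total agrees with the table to within rounding of the individual entries. Under the even-spread assumption, each computer in a row uses roughly a third of its formatted application storage.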