Test results: Medium scenario (FAST Search Server 2010 for SharePoint)

 

Applies to: FAST Search Server 2010

The medium Microsoft FAST Search Server 2010 for SharePoint test scenario targeted a moderately sized corpus of up to 40 million items. To meet freshness goals, incremental crawls were likely to occur during business hours.

We set up the parent Microsoft SharePoint Server 2010 farm with two front-end web servers, two application servers and one database server, and arranged them as follows:

  • We used the SharePoint Server 2010 crawler, indexing connector framework and the FAST Content Search Service Application (Content SSA) to crawl content. We distributed two crawl components for the Content SSA across the two application servers, mainly to accommodate I/O limitations in the test setup (1 Gbit/s network), where a single network adapter would have been a bottleneck.

  • One of the application servers also hosted Central Administration for the farm.

  • The database server hosted the crawl databases, the FAST Search Server 2010 for SharePoint administration databases and the other SharePoint Server 2010 databases.

We used no separate data storage because the application servers and front-end web servers only needed space for the operating system, application binaries and log files.

In this article:

  • Test deployments

  • Test characteristics

  • Test results

Test deployments

Within the medium scenario, we tested the following FAST Search Server 2010 for SharePoint deployments:

Name   Description
M1     One combined administration and web analyzer server, and three index column servers that had default configuration (4 servers in total)
M2     Same as M1, but with SAN storage
M3     A stand-alone high capacity server that hosted all FAST Search Server 2010 for SharePoint components
M4     Same as M1, with the addition of a dedicated search row (7 servers in total)
M5     Same as M3, with the addition of a dedicated search row (2 servers in total)
M6     Same as M4, but where the search row included a backup indexer
M7     Same as M5, but where the search row included a backup indexer
M8     Same as M3, but using solid state drives
M9     Same as M3, but on more powerful hardware
M10    Same as M1, but using solid state drives for the indexer and search servers

Note

We configured the index capacity of the M3, M5, M7, M8 and M9 deployments to allow up to 40 million items to be indexed on one server. For more information, refer to Performance and capacity tuning (FAST Search Server 2010 for SharePoint).

Test characteristics

This section provides detailed information about the hardware, software, topology and configuration of the test environment.

Hardware/Software

We tested the specified deployments using the following hardware and software.

FAST Search Server 2010 for SharePoint servers

  • Windows Server 2008 R2 x64 Enterprise Edition

  • 2x Intel L5520 CPUs with Hyper-threading and Turbo Boost switched on

  • 24 GB memory

  • 1 Gbit/s network card

  • Storage subsystem:

    • OS: 2x 146 GB 10k RPM SAS disks in RAID1

    • Application: 18x 146 GB 10k RPM SAS disks in RAID50 (two parity groups of 9 drives each). Total formatted capacity of 2 TB.

    • Disk controller: HP Smart Array P410, firmware 3.00

    • Disks: HP DG0146FARVU, firmware HPD5

Variations:

Deployment Variation description

M2

  • Application hosted on 2 TB partitions on a SAN

  • SAN used for test:

    • 3Par T-400

    • 240 15k RPM spindles (450 GB each)

    • Dual-ported FC connection to each application server, using MPIO without any FC switch. MPIO was enabled in the operating system.

M3/M5

  • 48 GB memory

  • Application: 22x 300 GB 10k RPM SAS drives in RAID50 (two parity groups of 11 spindles each). Total formatted capacity of 6 TB.

M7

  • 2x Intel L5640 CPUs with Hyper-threading and Turbo Boost switched on

  • 48 GB memory

  • Dual 1 Gbit/s network card

  • Storage subsystem:

    • Application: 12x 1 TB 7200 RPM SAS drives in RAID10. Total formatted capacity of 6 TB.

    • Disk controller: Dell PERC H700, firmware 12.0.1-0091

    • Disks: Seagate Constellation ES ST31000424SS, firmware KS65

M8

  • 2x Intel L5640 CPUs with Hyper-threading and Turbo Boost switched on

  • 48 GB memory

  • Dual 1 Gbit/s network card

  • Storage subsystem:

    • Application: 3x 1280 GB SSD cards in RAID0. Total formatted capacity of 3.6 TB.

    • SSD cards: Fusion-IO ioDrive Duo 1.28 TB MLC, firmware revision 43284, driver 2.2 build 21459

M9

  • 2x Intel X5670 CPUs with Hyper-threading and Turbo Boost switched on

  • 48 GB memory

  • Dual 1 Gbit/s network card

  • Storage subsystem:

    • Application: 12x 600 GB 15k RPM SAS drives in RAID50. Total formatted capacity of 6 TB.

    • Disk controller: LSI MegaRAID SAS 9260-8i, firmware 2.90-03-0933

    • Disks: Seagate Cheetah 15K.7 ST3600057SS, firmware ES62

M10

  • 2x Intel L5640 CPUs with Hyper-threading and Turbo Boost switched on

  • 48 GB memory

  • Dual 1 Gbit/s network card

  • Storage subsystem (indexing and query matching servers only):

    • Application: 1x Fusion-IO ioDrive Duo 1.28 TB MLC SSD card, firmware revision 43284, driver 2.2 build 21459

SharePoint Server 2010 servers

  • Windows Server 2008 R2 x64 Enterprise edition

  • 2x Intel L5420 CPUs

  • 16 GB memory

  • 1 Gbit/s network card

  • Storage subsystem for OS/Programs: 2x 146 GB 10k RPM SAS disks in RAID1

SQL servers

Same specification as the SharePoint Server 2010 servers, but with an additional RAID array for SQL data: 6x 146 GB 10k RPM SAS disks in RAID5.

Topology

This section describes the topology of all the test deployments.

Note

We used the same SharePoint Server 2010 and database server configuration as shown for M1/M2/M10 for all the tested deployments, but for convenience we only show the FAST Search Server 2010 for SharePoint farm topology in the other deployment descriptions.

M1/M2/M10

The M1, M2 and M10 deployments were similar except for the storage subsystem: M1 ran on local disks, M2 used SAN storage, and M10 used solid state storage for the search server farm. All three deployments had three index columns and one search row. They all had a separate administration server that also included the web analyzer components. We spread item processing across all servers.

None of these three deployments had a dedicated search row. This implied that there would be a noticeable degradation in query performance during content crawling and indexing. This effect could be reduced by crawling and indexing in off-peak hours, or by limiting the number of item processing components to reduce the maximum crawl rate.
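
As an illustration of the second option, the number of item processing components per server is controlled by the document-processor element in deployment.xml. The fragment below is a sketch based on the M1 host definitions shown later in this article; the value of 8 is only an example (M1 used 12 per host), and a lower value reduces the maximum crawl rate and, with it, the impact on query performance.

<host name="fs4sp2.contoso.com"> 
   <content-distributor /> 
   <searchengine row="0" column="0" /> 
   <!-- Example value only: M1 used processes="12"; a lower value reduces peak crawl rate --> 
   <document-processor processes="8" /> 
</host> 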

The following figure shows the M1 deployment. M2 and M10 had the same configuration.

M1 topology illustration

We used the following deployment.xml file to set up M1, M2 and M10.

<?xml version="1.0" encoding="utf-8" ?> 

<deployment version="14" modifiedBy="contoso\user" 

   modifiedTime="2009-03-14T14:39:17+01:00" comment="M1" 
   xmlns="https://www.microsoft.com/enterprisesearch" 
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
   xsi:schemaLocation="https://www.microsoft.com/enterprisesearch deployment.xsd"> 

   <instanceid>M1</instanceid> 

   <connector-databaseconnectionstring> 
      [<![CDATA[jdbc:sqlserver://sqlbox.contoso.com\sql:1433;DatabaseName=M1.jdbc]]> 
   </connector-databaseconnectionstring> 

   <host name="fs4sp1.contoso.com"> 
      <admin /> 
      <query /> 
      <webanalyzer server="true" link-processing="true" lookup-db="true" max-targets="4"/> 
      <document-processor processes="12" /> 
   </host> 

   <host name="fs4sp2.contoso.com"> 
      <content-distributor /> 
      <searchengine row="0" column="0" /> 
      <document-processor processes="12" /> 
   </host> 

   <host name="fs4sp3.contoso.com"> 
      <content-distributor /> 
      <searchengine row="0" column="1" /> 
      <document-processor processes="12" /> 
   </host> 

   <host name="fs4sp4.contoso.com"> 
      <indexing-dispatcher /> 
      <searchengine row="0" column="2" /> 
      <document-processor processes="12" /> 
   </host> 

   <searchcluster> 
      <row id="0" index="primary" search="true" /> 
   </searchcluster> 

</deployment> 

M3

The M3 deployment combined all components on one server. Running concurrent crawling, indexing and query load had the same impact as for M1/M2/M10. In addition, the reduced number of servers implied fewer item processing components, and therefore a lower crawl rate.

The following figure shows the M3 deployment.

M3 topology illustration

We used the following deployment.xml file to set up the deployment.

<?xml version="1.0" encoding="utf-8" ?> 

<deployment version="14" modifiedBy="contoso\user" 

   modifiedTime="2009-03-14T14:39:17+01:00" comment="M3" 
   xmlns="https://www.microsoft.com/enterprisesearch" 
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
   xsi:schemaLocation="https://www.microsoft.com/enterprisesearch deployment.xsd"> 

   <instanceid>M3</instanceid> 

   <connector-databaseconnectionstring> 
      [<![CDATA[jdbc:sqlserver://sqlbox.contoso.com\sql:1433;DatabaseName=M3.jdbc]]> 
   </connector-databaseconnectionstring> 

   <host name="fs4sp1.contoso.com"> 
      <admin /> 
      <query /> 
      <content-distributor /> 
      <indexing-dispatcher /> 
      <searchengine row="0" column="0" /> 
      <webanalyzer server="true" link-processing="true" lookup-db="true" max-targets="4"/> 
      <document-processor processes="12" /> 
   </host> 

   <searchcluster> 
      <row id="0" index="primary" search="true" /> 
   </searchcluster> 

</deployment> 

M4

The M4 deployment had the same configuration as M1/M2/M10 with the addition of a dedicated search row. A dedicated search row adds query throughput capacity, introduces query redundancy, and provides better separation of query load from crawling and indexing load. Each of the three servers that ran the dedicated search row also included a query processing component (query). In addition, the deployment included a query processing component on the administration server (fs4sp1.contoso.com). The Query SSA does not use the latter query processing component during ordinary operation, but it may use it as a fallback to serve queries if you take down the whole search row for maintenance.

The following figure shows the M4 deployment.

M4 topology illustration

We used the following deployment.xml file to set up the deployment.

<?xml version="1.0" encoding="utf-8" ?> 

<deployment version="14" modifiedBy="contoso\user" 

   modifiedTime="2009-03-14T14:39:17+01:00" comment="M4" 
   xmlns="https://www.microsoft.com/enterprisesearch" 
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
   xsi:schemaLocation="https://www.microsoft.com/enterprisesearch deployment.xsd"> 

   <instanceid>M4</instanceid> 

   <connector-databaseconnectionstring> 
      [<![CDATA[jdbc:sqlserver://sqlbox.contoso.com\sql:1433;DatabaseName=M4.jdbc]]> 
   </connector-databaseconnectionstring> 

   <host name="fs4sp1.contoso.com"> 
      <admin /> 
      <query /> 
      <webanalyzer server="true" link-processing="true" lookup-db="true" max-targets="4"/> 
      <document-processor processes="12" /> 
   </host> 

   <host name="fs4sp2.contoso.com"> 
      <content-distributor /> 
      <searchengine row="0" column="0" /> 
      <document-processor processes="12" /> 
   </host> 

   <host name="fs4sp3.contoso.com"> 
      <content-distributor /> 
      <indexing-dispatcher /> 
      <searchengine row="0" column="1" /> 
      <document-processor processes="12" /> 
   </host> 

   <host name="fs4sp4.contoso.com"> 
      <indexing-dispatcher /> 
      <searchengine row="0" column="2" /> 
      <document-processor processes="12" /> 
   </host> 

   <host name="fs4sp5.contoso.com"> 
      <query /> 
      <searchengine row="1" column="0" /> 
   </host> 

   <host name="fs4sp6.contoso.com"> 
      <query /> 
      <searchengine row="1" column="1" /> 
   </host> 

   <host name="fs4sp7.contoso.com"> 
      <query /> 
      <searchengine row="1" column="2" /> 
   </host> 

   <searchcluster> 
      <row id="0" index="primary" search="true" /> 
      <row id="1" index="none" search="true" /> 
   </searchcluster> 

</deployment>

M5

The M5 deployment had the same configuration as M3, with the addition of a dedicated search row that provided the same benefits over M3 that M4 provided over M1/M2/M10.

The following figure shows the M5 deployment.

M5 topology illustration

We used the following deployment.xml file to set up the deployment.

<?xml version="1.0" encoding="utf-8" ?> 

<deployment version="14" modifiedBy="contoso\user" 

   modifiedTime="2009-03-14T14:39:17+01:00" comment="M5" 
   xmlns="https://www.microsoft.com/enterprisesearch" 
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
   xsi:schemaLocation="https://www.microsoft.com/enterprisesearch deployment.xsd"> 

   <instanceid>M5</instanceid> 

   <connector-databaseconnectionstring> 
      [<![CDATA[jdbc:sqlserver://sqlbox.contoso.com\sql:1433;DatabaseName=M5.jdbc]]> 
   </connector-databaseconnectionstring> 

   <host name="fs4sp1.contoso.com"> 
      <admin /> 
      <query /> 
      <content-distributor /> 
      <indexing-dispatcher /> 
      <searchengine row="0" column="0" /> 
      <webanalyzer server="true" link-processing="true" lookup-db="true" max-targets="4"/> 
      <document-processor processes="16" /> 
   </host> 

   <host name="fs4sp2.contoso.com"> 
      <query /> 
      <searchengine row="1" column="0" /> 
   </host> 

   <searchcluster> 
      <row id="0" index="primary" search="true" /> 
      <row id="1" index="none" search="true" /> 
   </searchcluster> 

</deployment>

M6

The M6 deployment had the same configuration as M4 but with an additional backup indexer enabled on the search row. We deployed the backup indexer by modifying the M4 deployment.xml file as follows.

... 
<searchcluster> 
   <row id="0" index="primary" search="true" /> 
   <row id="1" index="secondary" search="true" /> 
</searchcluster> 
... 

M7

The M7 deployment had the same setup as M5, with an additional backup indexer enabled on the search row. M7 also ran on servers that had more CPU cores, which enabled us to increase the number of item processing components in the farm.

We used the following deployment.xml to set up the deployment.

<?xml version="1.0" encoding="utf-8" ?> 

<deployment version="14" modifiedBy="contoso\user" 

   modifiedTime="2009-03-14T14:39:17+01:00" comment="M7" 
   xmlns="https://www.microsoft.com/enterprisesearch" 
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
   xsi:schemaLocation="https://www.microsoft.com/enterprisesearch deployment.xsd"> 

   <instanceid>M7</instanceid> 

   <connector-databaseconnectionstring> 
      [<![CDATA[jdbc:sqlserver://sqlbox.contoso.com\sql:1433;DatabaseName=M5.jdbc]]> 
   </connector-databaseconnectionstring> 

   <host name="fs4sp1.contoso.com"> 
      <admin /> 
      <query /> 
      <content-distributor /> 
      <indexing-dispatcher /> 
      <searchengine row="0" column="0" /> 
      <webanalyzer server="true" link-processing="true" lookup-db="true" max-targets="4"/> 
      <document-processor processes="20" /> 
   </host> 

   <host name="fs4sp2.contoso.com"> 
      <query /> 
      <searchengine row="1" column="0" /> 
      <document-processor processes="8" /> 
   </host> 

   <searchcluster> 
      <row id="0" index="primary" search="true" /> 
      <row id="1" index="secondary" search="true" /> 
   </searchcluster> 

</deployment> 

M8/M9

Similar to M3, the M8 and M9 deployments combined all components on one server. The differences were that M8 and M9 ran on hardware with better performance, especially for the disk subsystem, and that they had more CPU cores, which enabled more item processing components.

M8 used solid state storage. M9 had more CPU power (X5670, versus the L5520/L5640 that we used to test most of the other medium deployments) and the fastest disk spindles readily available (12x 15k RPM SAS disks).

We used the following deployment.xml file to set up both M8 and M9.

<?xml version="1.0" encoding="utf-8" ?> 

<deployment version="14" modifiedBy="contoso\user" 

   modifiedTime="2009-03-14T14:39:17+01:00" comment="M8" 
   xmlns="https://www.microsoft.com/enterprisesearch" 
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
   xsi:schemaLocation="https://www.microsoft.com/enterprisesearch deployment.xsd"> 

   <instanceid>M8</instanceid> 

   <connector-databaseconnectionstring> 
      [<![CDATA[jdbc:sqlserver://sqlbox.contoso.com\sql:1433;DatabaseName=M8.jdbc]]> 
   </connector-databaseconnectionstring> 

   <host name="fs4sp1.contoso.com"> 
      <admin /> 
      <query /> 
      <content-distributor /> 
      <indexing-dispatcher /> 
      <searchengine row="0" column="0" /> 
      <webanalyzer server="true" link-processing="true" lookup-db="true" max-targets="4"/> 
      <document-processor processes="20" /> 
   </host> 

   <searchcluster> 
      <row id="0" index="primary" search="true" /> 
   </searchcluster> 

</deployment> 

Dataset

This section describes the test farm dataset: the database content and sizes, search indexes and external data sources.

The following table shows the overall metrics.

Object                                 Value
Search index size (# of items)         42.7 million
Size of crawl database                 138 GB
Size of crawl database log file        11 GB
Size of property database              <0.1 GB
Size of property database log file     0.3 GB
Size of SSA administration database    <0.1 GB

The next table shows the content source types that we used to build the index. The numbers in the table reflect the total number of items per source and include replicated copies. The difference between the total number of items and the index size can have two causes:

  • Items may have been disabled from indexing in the content source, or

  • The document format type could not be indexed.

For SharePoint sources, the size of the respective content database in SQL Server represents the raw data size.

Content source             Items     Raw data size    Average size per item
File share 1 (2 copies)    1.2 M     154 GB           128 kB
File share 2 (2 copies)    29.3 M    6.7 TB           229 kB
SharePoint 1               4.5 M     2.0 TB           443 kB
SharePoint 2               4.5 M     2.0 TB           443 kB
HTML 1                     1.1 M     8.8 GB           8.1 kB
HTML 2                     3.2 M     137 GB           43 kB
Total                      43.8 M    11 TB            251 kB

To reach sufficient content volume for the medium scenario tests, we indexed two copies of the file shares. Each copy of each document appeared as a unique item in the index, but the copies were treated as duplicates by the duplicate trimming feature. From a query matching perspective, the load was similar to having all unique documents indexed, but any results from these sources triggered duplicate detection and collapsing in the search results.

Note

The medium test scenario did not include people search data.

Test results

This section provides data that shows how the various deployments performed under load: crawling and indexing performance, query performance and disk usage.

Crawling and indexing performance

The following diagram shows the average number of items each deployment processed per second during a full crawl.

Full crawl performance graph

Note

We have only included test results from M1 through M6 in the diagram because the other deployments did not show significantly different crawling and indexing performance from their corresponding counterparts.

For full crawls, the item processing components represented the bottleneck. The limiting factor was the CPU processing capacity.

M1, M2 and M4 had similar performance characteristics because they had the same number of item processing components available. The same applied to M3 and M5. However, M1, M2 and M4 achieved roughly four times the full crawl rate of M3 and M5 because they had four times the item processing capacity. Comparing M6 to M4 also shows that running with backup indexers incurs a performance overhead because of the additional synchronization work that is required. Typically, an installation without backup indexers, such as M4, will outperform one with a backup indexer, such as M6.

The following diagram shows the average number of items each deployment processed per second during an incremental crawl.

Incremental crawl performance graph

Incremental crawls are faster than full crawls, from slightly faster up to a factor of 2 or 3. This is mainly because incremental crawls mostly consist of partial updates, which only update metadata. This also implies that the crawling and indexing performance is largely the same for all content types.

For incremental crawls, the indexers were the bottleneck because the item processing load was limited. Typically, disk I/O capacity is the limiting factor. During an incremental update, FAST Search Server 2010 for SharePoint fetches the old version of the item from disk, modifies it, persists it to disk, and then indexes it. This is more expensive than a full crawl operation, where the item is only persisted and indexed.

Query performance

The following subsections describe how the deployment configurations affected query performance.

M1

The following diagram shows the query latency as a function of queries per second (QPS) for the M1 deployment.

M1 query performance (graph 1)

An idle indexer provided the best query performance, with an average latency of less than 0.7 seconds up to approximately 21 QPS. During a full crawl the limit was 10 QPS; during an incremental crawl it was 15 QPS.

Note that the latency was not affected by higher QPS until we reached maximum system capacity. This occurred at the point where the curve started bending "backward". The figure shows that QPS decreased and latency increased when we applied more query load after reaching the maximum capacity of the system. In the M1 deployment, the peak QPS was about 28 with idle indexers. The bottleneck in this deployment was the amount of CPU resources. This behavior is also illustrated in the next diagram, where you can observe that performance decreased when we had more than 40 concurrent user agents and an idle indexer.

M1 query performance (graph 2)

Because the amount of CPU resources was the bottleneck, we tested the impact of using hyper-threading in the CPU. By enabling hyper-threading we allowed for more threads to execute in (near) parallel, at the expense of slightly reduced average performance for single-threaded tasks. Note that the query matching components ran in a single thread when QPS was low and no other tasks were running on the server.

M1 query performance (graph 3)

Hyper-threading performed better for all three crawling and indexing cases in the M1 deployment. In the deployments with dedicated search rows, disabling hyper-threading gave only a small reduction (around 150 ms) in query latency when they were running at very light query load. In general, hyper-threading reduces query latency and allows higher QPS, especially when multiple components run on the same server. Disabling hyper-threading only provides a small improvement under conditions where performance is already good.

M2

The following diagram shows the query latency as a function of QPS for the M2 deployment.

M2 query performance (graph 1)

Note that the latency was not affected by higher QPS until we reached maximum system capacity. When the indexers were idle, latency increased slowly until the deployment reached the saturation point at approximately 20 QPS. For full and incremental crawls the latency increased as indicated in the graph. The test did not include data to indicate exactly when query latency saturation occurred during the crawls.

The following diagram shows the same test data as user agents versus latency.

M2 query performance (graph 2)

The next diagram shows the query latency as a function of QPS for M1 versus M2.

M2 query performance (graph 3)

The main conclusion is that M1 performed somewhat better than M2: M1 was able to handle about 3 QPS more than M2 before reaching the saturation point. The SAN disks that we used for M2 should have been able to match M1's locally attached disks with regard to I/O operations per second. However, the bandwidth to the disks was somewhat lower with the SAN configuration. For full and incremental crawls the performance was comparable during the light load tests. During heavy load, ongoing indexing had less impact on M2 because the SAN provided more disk spindles to distribute the load.

M3

The following diagram shows the query latency as a function of QPS for M3 versus M1.

M3 query performance graph

The M3 deployment was able to handle about 10 QPS with idle indexers. One characteristic of a stand-alone server deployment is that the query latency fluctuates more as you move close to the saturation point. Under low query load, M3 almost matched the performance of M1, but as the load increased the limitations became apparent. M1 had three times the number of query matching components, and its peak QPS capacity was close to three times as high: 28 versus 10 QPS.

M4

M4 was an M1 deployment with an additional dedicated search row. The main benefit of a dedicated search row is that the indexing and query matching processes do not compete directly for the same resources, primarily disk and CPU.

The following diagrams show that the added search row in M4 provided a 5 QPS gain versus M1. In addition, the query latency was improved by about 0.2-0.4 seconds.

M4 query performance (graph 1) M4 query performance (graph 2)

Adding search rows will in most cases improve query performance, but it also introduces additional network traffic. Query performance may decrease if you add search rows without sufficient network capacity. The additional traffic is generated by the indexers copying large index files to the additional query matching servers. This index file copying may also affect indexing performance.
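
For reference, a dedicated search row is expressed as an additional row element in the searchcluster section of deployment.xml, exactly as in the M4 and M5 files shown earlier: row 0 holds the primary indexer, and row 1 has no indexer (index="none") and serves queries.

<searchcluster> 
   <row id="0" index="primary" search="true" /> 
   <row id="1" index="none" search="true" /> 
</searchcluster> 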

With the M4 deployment, we also tested the impact of document volume. The following diagram shows the query latency as a function of QPS for M4 with 18, 28 and 43 million documents indexed.

M4 query performance (graph 3)

The graph shows that the document volume affects the maximum QPS that the system can deliver. Adding approximately 10 million documents reduced the maximum QPS by approximately 5. Below 23 QPS, the document volume had little impact on query latency.

M5

The following diagram shows user agents versus query latency for M5 and M3.

M5 query performance

As illustrated when we compared M1 and M4, adding a dedicated search row improves query performance. Comparing M3 and M5 shows that the same is true when you add a query matching server to a stand-alone deployment.

M6

The following diagram shows the query latency as a function of QPS for M6 versus M4.

M6 query performance graph

The difference between M6 and M4 was the addition of a backup indexer row. In general, backup indexers compete with query matching for available resources and may decrease query performance. However, in this specific test that was not the case, because the hardware we used had enough resources to handle the additional load during regular operations. Note that the backup indexers use significantly fewer resources than the primary indexer, because it is the primary indexer that performs the actual indexing and distributes the indexes to the search rows and the backup indexer row.

Note

All indexers perform regular optimization of internal data structures between 03:00 and 05:59 every night. Depending on the crawling and indexing pattern, these tasks may be I/O intensive. Testing of M6 showed that you may see a significant reduction in query performance while the indexer optimization processes run. The more update and delete operations the indexer handles, the more optimization is required.

M7

The following diagram shows the query latency as a function of QPS for M7 compared to M5.

M7 query performance graph

M7 was similar to M5, but it ran on servers that had more powerful CPUs and more memory. On the other hand, its disk subsystem could not handle the same number of I/O operations per second (IOPS); the M7 storage subsystem had more bulk capacity but lower performance than the M5 storage subsystem.

The main difference in results compared to M5 was a slightly increased QPS rate before the system became saturated. M5 was saturated around 10 QPS, whereas M7 provided approximately 12 QPS. This was due to the increased CPU performance and added memory, although partly counterbalanced by the weaker disk configuration.

M8

The M8 deployment used solid state storage with much higher IOPS and throughput capabilities. This deployment was only limited by the available CPU processing power. A more powerful CPU configuration, for example with quad CPU sockets, should be able to achieve linear performance improvements with the added CPU resources.

Query performance results for M8 are discussed together with M9.

M9

The M9 deployment had improved CPU performance (X5670) and high-end disk spindles (15k RPM SAS). M9 is an example of the performance gains that are achievable by using high-end components while keeping regular disk spindles for storage.

The improved CPU performance gave 20-30% higher crawl speeds for M9 than for M8 (both with twenty item processing components), and even more compared to M3 (which had twelve item processing components). Note that M3 ran with a less powerful server for the content sources and was therefore more often limited by the sources than M8 and M9 were. M9 achieved more than 50 documents per second for all content sources.

The following diagram shows the query latency as a function of QPS for M8 and M9 under varying load patterns, compared to M3 and M5:

M9 query performance graph

Both M8 and M9 performed better with idle crawling and indexing than M5. M5 performance is only shown during crawling and indexing; because M5 had a dedicated search row, its query performance was fairly constant regardless of ongoing crawling and indexing. The main contribution to the peak query rate improvements was the additional CPU resources that M8 and M9 had compared to M5.

With idle crawling and indexing, M9 had slightly better QPS than M8 because of its more powerful CPU. Under overload conditions (more than 1 second latency), M9 decreased in performance because of an overloaded storage subsystem (exactly like M5), whereas M8 sustained the peak rate with its solid state storage.

M3, which is an M5 deployment without the dedicated search row, saturated at only 5 QPS, whereas M9 provided higher QPS rates. M9 had higher latency than M3 at low QPS during crawling and indexing. The reason was that M9 had more than 50% higher crawl rates than M3, because the deployment had more item processing components and a faster CPU. If we had reduced the crawl rates to M3 levels, M9 would have given better query performance than M3 at low QPS as well.

During crawling and indexing, M8 query performance decreased by less than 20%, mainly because of CPU congestion. This means that the storage subsystem on M8 made it possible to maintain good query performance during crawling and indexing without doubling the hardware footprint with a search row. Adding more CPU resources would have allowed even better query performance, because the storage subsystem still had spare capacity in this deployment.

On M9, query latency approximately doubled during crawling and indexing. The deployment could still deliver acceptable performance under low QPS loads, but it was much more affected than M8. This was because the M9 storage subsystem had slower read access when it was combined with the write traffic from crawling and indexing.

M10

The M10 deployment had the same configuration as M1 and M2 but used solid state storage. The deployment had the same amount of storage as M8, but it was spread across three servers to obtain more CPU power.

It is most interesting to compare M10 to M4, because in both of these deployments we tried to achieve a combination of a high crawl rate and good query performance at the same time. In M4, we did this by splitting the application storage across two search rows, each with three columns and 18 SAS disks per server. M10 had only a single row and replaced the application disk spindles with solid state storage. The search server farm totals were:

  • M4: 6 servers, 108 disk spindles

  • M10: 3 servers, 3 solid state storage cards

Both deployments had an additional administration server.

The following diagram shows the query latency as a function of QPS for M10 versus M4:

M10 query performance graph

With idle crawling (solid lines), M4 achieved around 23 QPS before degrading to around 20 QPS under overload conditions; I/O became the bottleneck. M10 was able to deliver 30 QPS, at which point it became limited by CPU throughput. With faster or more CPUs we could have increased this benefit even further than the measured 30% gain.

During crawling, M4 showed no significant change in query performance compared to idle because of its dedicated search row. M10 showed some degradation in query performance because content processing and queries competed for the same CPU resources. Notice, however, that M10 still achieved the same 20 QPS as M4 under the highest load conditions, and that the crawl rate for M10 was 20% higher than for M4. The reason for the latter was that the increased I/O performance allowed better handling of concurrent operations.

Disk usage

The following table shows the combined increase in disk usage on all servers after the various content sources were indexed.

Content source             Raw source data size    FiXML data size    Index data size    Other data size
File share 1 (2 copies)    154 GB                  18 GB              36 GB              5 GB
File share 2 (2 copies)    6.7 TB                  360 GB             944 GB             10 GB
SharePoint 1               2.0 TB                  70 GB              220 GB             13 GB
SharePoint 2               2.0 TB                  66 GB              220 GB             17 GB
HTML 1                     8.8 GB                  8 GB               20 GB              8 GB
HTML 2                     137 GB                  31 GB              112 GB             6 GB
Total                      11 TB                   553 GB             1.6 TB             56 GB

The following table shows the disk usage for the web analyzer component in a mixed content scenario, where the data is both file share content and SharePoint items.

Number of items in index                           40,667,601
Number of analyzed hyperlinks                      119,672,298
Average number of hyperlinks per item              2.52
Peak disk usage during analysis (GB)               77.51
Disk usage between analyses (GB)                   23.13
Disk usage per 1 million items during peak (GB)    1.63

Note

In this scenario, the disk usage values for the web analyzer component were somewhat lower than the recommended dimensioning values. The reasons were that the URLs in the test dataset were fairly short and that the average number of hyperlinks per item was low compared to pure web or SharePoint content sources. For example, in a pure web content installation the average number of links can be as high as 50. Because the web analyzer component only stores document IDs, hyperlinks and anchor texts, the number of links is the dominant factor that determines disk usage.

See Also

Concepts

Performance and capacity test results (FAST Search Server 2010 for SharePoint)
Test results: Extra-small scenario (FAST Search Server 2010 for SharePoint)
Test results: Large scenario (FAST Search Server 2010 for SharePoint)