Background on the use of enterprise search at Microsoft

 

Applies to: Microsoft FAST Search Server 2010 for SharePoint, Microsoft Office SharePoint Server 2007

This is the third of 12 articles that compose How Microsoft IT deployed FAST Search Server 2010 for SharePoint (white paper). In this article:

  • The need for enterprise search at Microsoft

  • The MSIT implementation of enterprise search that used SharePoint Server 2007

  • Requirements for an improved search solution

  • The solution: FAST Search Server 2010 for SharePoint

The need for enterprise search at Microsoft

Every day, thousands of Microsoft employees, contractors, and vendors create and store large amounts of digital information on servers in many locations across the worldwide Microsoft corporate enterprise. Content is in the form of Microsoft Office documents, graphics, videos, and other formats. Because Microsoft workers use the collaborative features of SharePoint Server extensively, MSIT must manage more than 34 terabytes of content on SharePoint intranet sites. This includes content such as SharePoint Server libraries, lists, blogs, wikis, and My Sites.

MSIT hosts and manages all intranet sites that consume the enterprise search service. The following table provides examples of these sites.

Table 1. Examples of sites that use the enterprise search service

Site or sites | Description
MSW | MSW is one of the primary enterprise portals at Microsoft. It provides daily news, executive updates, and other information. In addition, employees frequently go to this site to get to other intranet sites, such as sites for human resources, legal affairs, and technical research. The MSW farm also hosts the enterprise Search Center.
Division-level portals | Examples of division-level portals that use the enterprise search service include the following: ITWeb, the portal for technical support; FinWeb, the finance portal; LCAWeb, the portal for the Legal and Corporate Affairs group; Infopedia, the portal for sales and marketing information, which provides content related to engaging with customers and partners; and Office, the portal for employees in the Microsoft Office Division.
Department-level portals | Employees can use these portals to create collaboration or portal sites that host business-critical information.
Team sites | Employees can use these portals to create team sites for collaboration. On team sites, employees can also use the latest SharePoint features and test proof-of-concept solutions.
My Sites | Each employee has a My Site to store business and personal information and documents. Employees can restrict access to items on their My Site.

However, many other groups and individuals in the company create and host their own sites. MSIT does not permit these sites to consume the enterprise search service, but the site owners can ask MSIT to configure the enterprise search service to crawl their sites. Other important corporate data repositories include file shares and structured data sources in line-of-business applications. MSIT must provide an enterprise search solution that enables the members of the Microsoft work force to quickly and easily find up-to-date, relevant information in all of these large and diverse repositories.

The MSIT implementation of enterprise search that used SharePoint Server 2007

MSIT has used SharePoint technologies to provide enterprise search capability to Microsoft employees for a number of years. MSIT used Microsoft Office SharePoint Portal Server 2003 to provide enterprise search from 2003 until late 2006, at which point that solution no longer scaled to company requirements. MSIT then upgraded to an enterprise search solution that used Microsoft Office SharePoint Server 2007.

As with the previous solution, in the Office SharePoint Server 2007 solution, MSIT maintained three separate Shared Services Providers (SSPs), each in a separate Microsoft data center. One SSP was in Singapore, the second SSP was in Dublin, Ireland, and the third SSP was in Redmond, Washington. The search service in the Dublin SSP crawled only content in the Europe, Middle East, and Africa (EMEA) region (and therefore provided search results for that region only), while the search service in the Singapore SSP crawled content in the Asia-Pacific region (and therefore provided search results for that region only). For a time, however, the Redmond SSP search service crawled content in all three regions and provided worldwide search results that all users could access by going to the enterprise Search Center.

Figure 1 indicates the region or regions that the search service of each SSP crawled in the initial state of the SharePoint Server 2007 search solution.

Figure 1. Content crawled by each search service in the initial state of the SharePoint Server 2007 search solution

[Figure: initial state of the SharePoint Server 2007 solution]

MSIT took this regional approach to enterprise search because search services were not fully supported over the corporate wide area network (WAN). Therefore, each region required a separate server-farm deployment for search. The deployment that hosted the Redmond search service used one dedicated index server, three dedicated query servers, and two database servers. The EMEA and Asia-Pacific search deployments each used one dedicated index server, two servers configured with both the query server role and the web server role, and one database server. Thus, a total of 14 computers in three different data centers handled the worldwide enterprise search solution.
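The server total in the preceding paragraph can be checked with a short tally. The following Python sketch is illustrative only: the dictionary layout, the role keys, and the function names are assumptions made for this example and do not come from an MSIT tool. Only the counts are taken from the text.

```python
# Server counts per regional search deployment, taken from the paragraph above.
# The data structure, key names, and function names are hypothetical.
DEPLOYMENTS = {
    "Redmond": {"index": 1, "query": 3, "database": 2},
    "EMEA": {"index": 1, "query_and_web": 2, "database": 1},
    "Asia-Pacific": {"index": 1, "query_and_web": 2, "database": 1},
}


def servers_per_deployment(deployments):
    """Return the number of computers in each deployment."""
    return {name: sum(roles.values()) for name, roles in deployments.items()}


def total_servers(deployments):
    """Return the number of computers across all deployments."""
    return sum(servers_per_deployment(deployments).values())


if __name__ == "__main__":
    print(servers_per_deployment(DEPLOYMENTS))
    print(total_servers(DEPLOYMENTS))
```

Under these assumptions, the Redmond deployment accounts for 6 computers and each of the other two deployments for 4, which gives the total of 14.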

An advantage of this approach for users in the EMEA and Asia-Pacific regions was that search results were relatively fresh, because crawling and indexing were relatively fast on the regional scale. Query responses were also relatively fast in those regions for a similar reason.

The main disadvantage of this approach was that MSIT had to manage three separate search services and therefore three separate content indexes, one for each search service. By searching from a site in the Asia-Pacific region, users could get search results for sites in that region only. Similarly, by searching from a site in the EMEA region, users could get search results for sites in that region only. However, because the Redmond search service was crawling all three regions, users in any region could still get search results for content worldwide by searching from the enterprise Search Center.

By late 2009, the Redmond SSP search service was crawling and indexing more than 13 terabytes of data that corresponded to more than 40 million items. Incremental crawl times for some content sources had increased to seven days or more, so search results were frequently stale. Moreover, query latency in the Redmond search service had increased to eight seconds or more in some cases. Therefore, MSIT needed to reduce the amount of content that the Redmond search service was crawling.

For this reason, MSIT configured the Redmond search service so that it crawled only the Americas region and no longer crawled the EMEA and Asia-Pacific regions. The search service in each region was then crawling only regional data. However, users became dissatisfied because they could no longer get search results for content worldwide by searching from the enterprise Search Center.
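The difference between the initial and final states can be expressed as a small model of crawl scope. The following Python sketch is a simplification for illustration. The service locations and regions come from the text, but the data structure and the function name are assumptions, and the model ignores individual content sources.

```python
# Hypothetical model of which regions each search service crawled.
# Service locations and regions come from the text; the structure is illustrative.
REGIONS = {"Americas", "EMEA", "Asia-Pacific"}

INITIAL_STATE = {
    "Redmond": {"Americas", "EMEA", "Asia-Pacific"},
    "Dublin": {"EMEA"},
    "Singapore": {"Asia-Pacific"},
}

FINAL_STATE = {
    "Redmond": {"Americas"},
    "Dublin": {"EMEA"},
    "Singapore": {"Asia-Pacific"},
}


def services_with_worldwide_results(state):
    """Return the services whose crawl scope covers every region."""
    return sorted(name for name, crawled in state.items() if crawled >= REGIONS)


if __name__ == "__main__":
    print(services_with_worldwide_results(INITIAL_STATE))
    print(services_with_worldwide_results(FINAL_STATE))
```

In this model, only the Redmond service can return worldwide results in the initial state, and no service can do so in the final state, which matches the loss of worldwide results that users reported.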

Figure 2 indicates the region that the search service of each SSP crawled in the final state of the SharePoint Server 2007 search solution.

Figure 2. Content crawled by each search service in the final state of the SharePoint Server 2007 search solution

[Figure: final state of the SharePoint Server 2007 solution]

To improve performance of the Redmond search service even more, MSIT removed two other content sources from that service: Team (the content source for crawling SharePoint team sites) and Office (the content source for crawling the site collection of the Microsoft Office Division). The drawback of this change was that, afterward, users could get search results from either of those site collections only by searching directly from that site collection.

Another issue was that users in various locations sometimes reported that search results were not helpful. Search results were sometimes out of date or not ranked appropriately by relevance. In some cases, the search service took a long time to report that there were no search results at all.

In summary, due to the increased number of users, the distribution of users in different geographical locations, and the growth in content, the MSIT deployment of SharePoint Server 2007 no longer provided a suitable enterprise search solution to Microsoft workers. Thus, the need arose for a more powerful enterprise search solution.

Requirements for an improved search solution

MSIT identified the following business, technical, and administrative requirements for an enterprise-level search service:

  • Centralized management. The search solution must be hosted and manageable in a single deployment instead of in deployments in multiple locations.

  • Completeness of content index. There must be a single content index for all crawled content. This includes all site content in the three geographic regions (Americas, Asia-Pacific, and EMEA), so that all users can get search results from the same set of content.

  • High capacity. The content index must be able to accommodate projected growth to 200 million items in 2013 or later.

  • Significantly reduced crawl times. Incremental crawls of all high-priority content sources must take less than 24 hours so that all search results are as fresh as possible for the most important content.

  • Significantly reduced query latency. Users must be able to get search results in less than two seconds.

  • Increased ability to find relevant content. Search results must be ranked according to relevance (for example, based on authority of sites and pertinence of metadata) so that users can quickly find information that they are looking for.

  • Uniform search experience. All users worldwide must have access to an enterprise search center that provides the same search experience.

  • Significantly increased user satisfaction. Users must be pleased with the increase in performance and effectiveness of the search system and with the overall search experience.
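Two of the requirements above are numeric thresholds, so they can be expressed as a simple check. The following Python sketch is an illustration, not an MSIT monitoring tool: the constant names, the function name, and the requirement labels are assumptions. The thresholds come from the list above. The sample inputs of seven days and eight seconds are the lower bounds reported for some content sources and some queries in the Redmond search service in late 2009.

```python
# Thresholds taken from the requirements list above.
# Constant names, function name, and labels are hypothetical.
MAX_INCREMENTAL_CRAWL_HOURS = 24.0  # incremental crawls must take less than 24 hours
MAX_QUERY_LATENCY_SECONDS = 2.0     # queries must return in less than two seconds


def unmet_requirements(incremental_crawl_hours, query_latency_seconds):
    """Return labels for the numeric requirements that the measurements fail."""
    unmet = []
    if incremental_crawl_hours >= MAX_INCREMENTAL_CRAWL_HOURS:
        unmet.append("crawl time")
    if query_latency_seconds >= MAX_QUERY_LATENCY_SECONDS:
        unmet.append("query latency")
    return unmet


if __name__ == "__main__":
    # Late-2009 figures for the Redmond search service: seven days, eight seconds.
    print(unmet_requirements(7 * 24, 8))
```

With the late-2009 figures as input, both requirements are reported as unmet. Both comparisons are strict, because each requirement is stated as "less than" its threshold.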

The solution: FAST Search Server 2010 for SharePoint

In 2008, Microsoft acquired FAST Search and Transfer, a leading enterprise search vendor. Microsoft merged search technologies from that acquisition with SharePoint search technologies to create a new product, FAST Search Server 2010 for SharePoint.

MSIT evaluated the search features and functionality of SharePoint Server 2010 and FAST Search Server 2010 for SharePoint to determine which product would best meet the enterprise search requirements. FAST Search Server 2010 for SharePoint satisfied all of the requirements. MSIT adopted FAST Search Server 2010 for SharePoint primarily because it is highly scalable and can support high capacity and performance for a large enterprise deployment. FAST Search Server 2010 for SharePoint also provides many advanced options, such as the ability to tune the ranking of search results.

To view the white paper as a single article on TechNet, or to download it, see Improving enterprise search at Microsoft: How FAST Search Server 2010 for SharePoint Powers Worldwide Intranet Search at Microsoft (https://technet.microsoft.com/en-us/library/bb735129.aspx).