Gather information about the current search environment (SharePoint Server 2010)

Applies to: SharePoint Server 2010

Topic Last Modified: 2010-11-02

An important step in planning the enterprise search solution is to gather information about the current environment, including the following types of information and reports:

  • Information about the organization

  • Information about the topology

  • Current settings for search

  • Performance and usage reports

You will need this information for planning the search topology, crawling and federation, people search, and the end-user search experience.

Gather the following information about the organization:

  • User, business, and functional requirements for the enterprise search solution, along with any service level agreements (SLAs). This information will help you design and build the search solution and, during testing, verify that the solution meets the requirements.

  • Contact information for existing farm administrators, search administrators, site collection administrators, site owners, and any other stakeholders for the enterprise search solution. This information will help you to plan the enterprise search team, and it also provides a contact list for any communications that occur during planning, deployment, and operations.

Gather the following information about the topology:

  • Current topology diagrams. You will refer to these while planning the topology and planning for people search.

  • Locations of content repositories that should be included in search results, including SharePoint sites, Web sites, file shares, Exchange public folders, business data sources, user profile stores, Lotus Notes, and external sites.

  • Locations of users.

If you are starting from a previous version of SharePoint Products and Technologies, gather the following information about current settings for search:

  • Default content access account

  • Content source settings, including the following settings for each content source:

    • Content source name

    • Content source type

    • Start addresses

    • Crawl settings

    • Full crawl schedule

    • Incremental crawl schedule

  • Crawler impact rules, including the following settings for each crawler impact rule:

    • Site (URL)

    • Request frequency

  • Crawl rules, including the following settings for each crawl rule:

    • Path

    • Crawl configuration (excluded or included items)

    • Content access account

  • Third-party or custom connectors (called protocol handlers in prior versions)

  • File types included in the file-type inclusions list, and whether each one requires an additional IFilter

  • File types removed from the file-type inclusions list

  • Languages for which word breakers and stemmers are installed

  • Farm-level search settings, including the following information:

    • Contact e-mail address

    • Proxy server settings (address, port, whether to bypass for local addresses, and addresses for which you do not want to use a proxy server)

    • Crawler time-out settings (connection time and request acknowledgement time)

    • SSL certificate warning configuration

  • Scope settings

  • Crawl settings

  • The following additional settings:

    • Federated locations

    • Server name mappings

    • Indexer performance settings

    • Crawled properties

    • Managed properties

    • Search result removal

    • Alerts

    • Keywords

    • Best Bets

    • Authoritative pages
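
One way to keep the settings listed above in a consistent, comparable form is to record them in a small structured inventory. The following Python sketch is only an illustration of that idea and is not part of SharePoint Server: every class, field, and function name (SearchInventory, ContentSource, missing_fields, and so on) is hypothetical, the sample values are invented, and only a subset of the settings in the list is modeled. It stores the gathered values, exports them as JSON, and reports which fields are still empty.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


# All names below are hypothetical; they only mirror the checklist items.
@dataclass
class ContentSource:
    name: str
    source_type: str
    start_addresses: List[str]
    crawl_settings: str
    full_crawl_schedule: str
    incremental_crawl_schedule: str


@dataclass
class CrawlerImpactRule:
    site: str
    request_frequency: str


@dataclass
class CrawlRule:
    path: str
    include: bool  # True = included items, False = excluded items
    content_access_account: str


@dataclass
class SearchInventory:
    default_content_access_account: str
    content_sources: List[ContentSource] = field(default_factory=list)
    crawler_impact_rules: List[CrawlerImpactRule] = field(default_factory=list)
    crawl_rules: List[CrawlRule] = field(default_factory=list)


def missing_fields(inventory: SearchInventory) -> List[str]:
    """Return the paths of fields that are still empty strings or empty lists."""
    gaps: List[str] = []

    def walk(value, path: str) -> None:
        if isinstance(value, dict):
            for key, item in value.items():
                walk(item, f"{path}.{key}" if path else key)
        elif isinstance(value, list):
            if not value:
                gaps.append(path)
            for index, item in enumerate(value):
                walk(item, f"{path}[{index}]")
        elif value == "":
            gaps.append(path)

    walk(asdict(inventory), "")
    return gaps


# Invented sample data: one content source with its full crawl schedule
# not yet recorded, and no crawl rules or crawler impact rules gathered.
inventory = SearchInventory(
    default_content_access_account="EXAMPLE\\svc-crawl",
    content_sources=[
        ContentSource(
            name="Intranet",
            source_type="SharePoint sites",
            start_addresses=["http://intranet.example.com"],
            crawl_settings="Crawl everything under the host name",
            full_crawl_schedule="",
            incremental_crawl_schedule="Every 30 minutes",
        )
    ],
)

print(json.dumps(asdict(inventory), indent=2))
print(missing_fields(inventory))
```

Running the sketch prints the inventory as JSON, followed by the list of gaps. A flat record like this can be extended with the remaining items (farm-level settings, scopes, managed properties, and so on) and shared with the contacts gathered earlier so that they can fill in the missing values.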

Gather the following performance and usage data:

  • Performance metrics from search administration reports, if available. You will use this information when you plan the topology. For more information, see Use search administration reports (SharePoint Server 2010).

  • Usage metrics from Web analytics reports. You will use this information when you design the end-user experience for search.