Add a crawler impact rule (Office SharePoint Server 2007)

Applies To: Office SharePoint Server 2007

Updated: 2008-09-11

This article explains how to add a crawler impact rule for crawling a site.

When you add a crawler impact rule, you specify one of the following restrictions for crawling a specified site:

  • The maximum number of documents that the crawler can request at a time from the site.

  • The frequency with which the crawler can request a document from the site.

Adding a crawler impact rule

Use this procedure to add a crawler impact rule.

To add a crawler impact rule

  1. Complete one of the following steps depending on the status of your installation.

    • If the Infrastructure Update for Microsoft Office Servers is installed, in Central Administration, on the Quick Launch, in the Shared Services Administration group, click a shared service.

      On the Shared Services Administration page, in the Search section, click Search administration.

      On the Search Administration page, in the Crawling section, click Crawler impact rules.

      Note

      For more information, see Description of the Microsoft Office Servers Infrastructure Update (https://go.microsoft.com/fwlink/?LinkID=121886).

    • If the Infrastructure Update for Microsoft Office Servers is not installed, in Central Administration, on the Application Management tab, in the Search section, click Manage search service.

      On the Manage Search Service page, in the Farm-Level Search Settings section, click Crawler impact rules.

  2. On the Crawler Impact Rules page, click Add Rule.

  3. On the Add Crawler Impact Rule page, in the Site section, in the Site box, type the site name that will be associated with this crawler impact rule. For information about using wildcard characters in the site name, see Using wildcard characters in site names.

    Note

    When typing the URL, you must exclude the protocol. For example, do not include http:// or file://.

  4. In the Request Frequency section, select one of the following options:

    • Request up to the specified number of documents at a time and do not wait between requests. If you choose this option, use the Simultaneous requests list to select the maximum number of documents that the Office SharePoint Server Search service can request at one time when crawling this URL.

    • Request one document at a time and wait the specified time between requests. When this option is selected, the Office SharePoint Server Search service makes one request per site at a time, and then waits for the specified amount of time before making the next request. In the Time to wait (in seconds) box, type the time to wait between requests when crawling this URL. The minimum time to wait between requests is one second, and the maximum time is 1,000 seconds.

  5. Click OK.
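The constraints in steps 3 and 4 can be summarized in a short Python sketch. This is only an illustration of the rules stated above, not code that Office SharePoint Server exposes: the names `normalize_site_name` and `CrawlerImpactRule` are hypothetical, and the requirement that the number of simultaneous requests be at least 1 is an assumption, because the values offered in the Simultaneous requests list are not given in this article.

```python
from dataclasses import dataclass
from typing import Optional

# Limits stated in step 4 for the "Time to wait (in seconds)" box.
MIN_WAIT_SECONDS = 1
MAX_WAIT_SECONDS = 1000


def normalize_site_name(site: str) -> str:
    """Drop a leading protocol such as http:// or file:// (hypothetical helper)."""
    site = site.strip()
    if "://" in site:
        site = site.split("://", 1)[1]
    return site


@dataclass(frozen=True)
class CrawlerImpactRule:
    """Hypothetical model of one rule: a site name plus exactly one option."""
    site: str
    simultaneous_requests: Optional[int] = None  # option 1: no wait between requests
    wait_seconds: Optional[int] = None           # option 2: one document at a time

    def __post_init__(self) -> None:
        if "://" in self.site:
            raise ValueError("site name must not include a protocol")
        # Exactly one of the two request frequency options must be chosen.
        if (self.simultaneous_requests is None) == (self.wait_seconds is None):
            raise ValueError("choose exactly one request frequency option")
        # Assumption: any positive count is accepted; the real list may be narrower.
        if self.simultaneous_requests is not None and self.simultaneous_requests < 1:
            raise ValueError("simultaneous requests must be at least 1")
        if self.wait_seconds is not None and not (
            MIN_WAIT_SECONDS <= self.wait_seconds <= MAX_WAIT_SECONDS
        ):
            raise ValueError("time to wait must be between 1 and 1,000 seconds")


# Usage: strip the protocol first, then build a rule that waits 5 seconds.
rule = CrawlerImpactRule(
    site=normalize_site_name("http://www.adventure-works.com"),
    wait_seconds=5,
)
```

Modeling the two options as mutually exclusive fields mirrors the page itself, where you select one option or the other in the Request Frequency section.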

Using wildcard characters in site names

A crawler impact rule can specify a single site, or you can use wildcard characters so that the rule applies to multiple sites. The following table shows the wildcard characters that you can use in the site name when you add a crawler impact rule.

Use                                                                To
-----------------------------------------------------------------  ----------------------------------------------------------------
* as the site name                                                 Apply the rule to all sites.
*.* as the site name                                               Apply the rule to sites with dots in the name.
*.sitename.com as the site name                                    Apply the rule to all sites in the sitename.com domain (for example, *.adventure-works.com).
*.top-level_domain_name (such as *.com or *.net) as the site name  Apply the rule to all sites that end with a specific top-level domain name (for example, .com or .net).
?                                                                  Signify a single character in a site name. For example, *.adventure-works?.com applies to all sites in the domains adventure-works1.com, adventure-works2.com, and so on.

For example, an administrator might create a crawler impact rule for *.com that applies to all Internet sites with addresses that end in .com. The administrator of a portal might add a content source for samples.microsoft.com. The rule for *.com applies to this site unless there is a crawler impact rule specifically for samples.microsoft.com.
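To make the wildcard behavior concrete, the following Python sketch translates a site-name pattern into a regular expression and then picks a rule for a site. It is an approximation for illustration, not the matching code used by the Office SharePoint Server Search service. The function names are hypothetical, case-insensitive matching is an assumption, and the selection logic (an exact site name wins, otherwise the first matching pattern) is only one way to reproduce the samples.microsoft.com example above.

```python
import re


def wildcard_to_regex(pattern):
    """Translate * (any run of characters) and ? (one character) to a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # dots and hyphens are literal
    # Assumption: site names are compared without regard to case.
    return re.compile("".join(parts), re.IGNORECASE)


def rule_matches(pattern, site):
    """Return True if the whole site name matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(site) is not None


def select_rule(patterns, site):
    """Pick a pattern for a site: exact name first, else first wildcard match.

    This ordering is an assumption chosen to mirror the example in the text.
    """
    for pattern in patterns:
        is_literal = "*" not in pattern and "?" not in pattern
        if is_literal and pattern.lower() == site.lower():
            return pattern
    for pattern in patterns:
        if rule_matches(pattern, site):
            return pattern
    return None


# The *.com rule covers samples.microsoft.com until a specific rule exists.
only_wildcard = select_rule(["*.com"], "samples.microsoft.com")
with_specific = select_rule(["*.com", "samples.microsoft.com"], "samples.microsoft.com")
```

Here `only_wildcard` is the `*.com` pattern, while `with_specific` is the rule written for samples.microsoft.com itself.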

See Also

Concepts

Edit a crawler impact rule (Office SharePoint Server 2007)