Add a crawler impact rule (Search Server 2008)

Applies To: Microsoft Search Server 2008

Topic Last Modified: 2008-11-07

This article explains how to add a crawler impact rule for crawling a site. For information about the importance of using crawler impact rules, see Manage crawler impact (Search Server 2008).

Note

Unless otherwise noted, the information in this article applies to both Microsoft Search Server 2008 and Microsoft Search Server 2008 Express.

When you add a crawler impact rule, you specify one of the following restrictions for crawling a specified site:

  • The maximum number of documents that the crawler can request at a time from the site.

  • How long the crawler waits between requests when it requests one document at a time from the site.

In this article:

  • Adding a crawler impact rule

  • Using wildcard characters in site names

Adding a crawler impact rule

Use the following procedure to add a crawler impact rule.

Important

You must be a search services administrator to perform this procedure. For more information, see Add or remove a search services administrator (Search Server 2008).

To add a crawler impact rule

  1. On the Search Administration page, in the Crawling section, click Crawler impact rules.

  2. On the Crawler Impact Rules page, click Add Rule.

  3. On the Add Crawler Impact Rule page, in the Site section, in the Site box, type the site to associate with this crawler impact rule. For information about using wildcard characters in the site name, see Using wildcard characters in site names.

    Note

    When typing the site name, you must exclude the protocol. For example, do not include http:// or file://.

  4. In the Request Frequency section, select one of the following options:

    • Request up to the specified number of documents at a time and do not wait between requests. From the Simultaneous requests list, select the maximum number of documents you want the crawler to request at a time when it crawls the specified site.

    • Request one document at a time and wait the specified time between requests. In the Time to wait (in seconds) box, type the time (in seconds) to wait between requests. The minimum time is one second; the maximum time is 1,000 seconds.

  5. Click OK.
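
The constraints in this procedure can be summarized in a short Python sketch. It is an illustration only and not part of Search Server: the names `strip_protocol` and `CrawlerImpactRule` are hypothetical, and the check that the simultaneous request count is at least 1 is an assumption, because the values offered in the Simultaneous requests list are not given here. The sketch removes the protocol from the site name, requires exactly one of the two Request Frequency options, and enforces the 1 to 1,000 second range for the wait time.

```python
from dataclasses import dataclass
from typing import Optional

MIN_WAIT_SECONDS = 1     # minimum wait stated in the procedure
MAX_WAIT_SECONDS = 1000  # maximum wait stated in the procedure


def strip_protocol(site: str) -> str:
    """Return the site name without a leading protocol such as http:// or file://."""
    site = site.strip()
    if "://" in site:
        site = site.split("://", 1)[1]
    return site


@dataclass(frozen=True)
class CrawlerImpactRule:
    """Hypothetical model of one rule; exactly one option must be set."""
    site: str
    simultaneous_requests: Optional[int] = None  # option 1: no wait between requests
    wait_seconds: Optional[int] = None           # option 2: one document at a time

    def __post_init__(self) -> None:
        if "://" in self.site:
            raise ValueError("site must not include a protocol")
        has_simultaneous = self.simultaneous_requests is not None
        has_wait = self.wait_seconds is not None
        if has_simultaneous == has_wait:
            raise ValueError("choose exactly one request frequency option")
        # Assumption: any count of 1 or more is accepted in this sketch.
        if has_simultaneous and self.simultaneous_requests < 1:
            raise ValueError("simultaneous_requests must be at least 1")
        if has_wait and not MIN_WAIT_SECONDS <= self.wait_seconds <= MAX_WAIT_SECONDS:
            raise ValueError("wait_seconds must be between 1 and 1000")


rule = CrawlerImpactRule(site=strip_protocol("http://samples.microsoft.com"),
                         wait_seconds=5)
print(rule.site)  # samples.microsoft.com
```

A wait time of 1,001 seconds, or a rule that sets both options, raises `ValueError` in this sketch.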

Using wildcard characters in site names

A crawler impact rule can specify a single site, or you can use wildcard characters so that the rule applies to multiple sites. The following table shows the wildcard characters that you can use in the site name when you add a crawler impact rule.

Use | To
--- | ---
* as the site name | Apply the rule to all sites.
*.* as the site name | Apply the rule to sites with dots in the name.
*.sitename.com as the site name | Apply the rule to all sites in the sitename.com domain (for example, *.adventure-works.com).
*.top-level_domain_name (such as *.com or *.net) as the site name | Apply the rule to all sites that end with a specific top-level domain name (for example, .com or .net).
? | Signify a single character in a site name. For example, *.adventure-works?.com applies to all sites in the domains adventure-works1.com, adventure-works2.com, and so on.
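
To get a feel for how these patterns behave, the following Python sketch tests site names against them using the standard `fnmatch` module. This only approximates the wildcard behavior described above and is not how Search Server itself evaluates rules. The helper name `site_matches` and the `www` host names are hypothetical, and `fnmatch` also gives special meaning to square brackets, which the table does not mention.

```python
from fnmatch import fnmatchcase


def site_matches(pattern: str, site: str) -> bool:
    """Return True if the site name matches the wildcard pattern.

    '*' matches any run of characters and '?' matches exactly one character.
    Both values are lowercased so the comparison ignores case.
    """
    return fnmatchcase(site.lower(), pattern.lower())


print(site_matches("*", "intranet"))                                      # True
print(site_matches("*.*", "intranet"))                                    # False
print(site_matches("*.adventure-works.com", "www.adventure-works.com"))   # True
print(site_matches("*.com", "samples.microsoft.com"))                     # True
print(site_matches("*.adventure-works?.com", "www.adventure-works1.com")) # True
print(site_matches("*.adventure-works?.com", "www.adventure-works.com"))  # False
```

In this sketch, *.adventure-works.com does not match the bare name adventure-works.com, because the pattern requires a dot before the domain. Whether the product treats that case the same way is not stated in this article.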

For example, an administrator might create a crawler impact rule for *.com, which applies to all Internet sites whose addresses end in .com. If the administrator of a portal then adds a content source for samples.microsoft.com, the *.com rule applies to that site unless a crawler impact rule exists specifically for samples.microsoft.com.
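
The precedence in this example can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the product's actual logic: the function name `select_rule` and the rule settings are hypothetical. A rule whose site name equals the site exactly wins over any wildcard rule, as the example describes. If several wildcard rules match, the sketch takes the first one in list order, which is an assumption because this article does not say how that case is resolved.

```python
from fnmatch import fnmatchcase
from typing import List, Optional, Tuple


def select_rule(rules: List[Tuple[str, str]], site: str) -> Optional[str]:
    """Return the setting of the rule that applies to the site, or None."""
    site = site.lower()
    # A rule written specifically for the site takes precedence.
    for pattern, setting in rules:
        if pattern.lower() == site:
            return setting
    # Otherwise fall back to the first matching wildcard rule (assumed order).
    for pattern, setting in rules:
        if fnmatchcase(site, pattern.lower()):
            return setting
    return None


# Hypothetical rule settings, for illustration only.
rules = [
    ("*.com", "wait 5 seconds between requests"),
    ("samples.microsoft.com", "8 simultaneous requests"),
]

print(select_rule(rules, "samples.microsoft.com"))    # 8 simultaneous requests
print(select_rule(rules, "www.adventure-works.com"))  # wait 5 seconds between requests
print(select_rule(rules, "intranet"))                 # None
```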

See Also

Concepts

Edit a crawler impact rule (Search Server 2008)