Manage crawler impact rules (Office SharePoint Server Central Administration Help)
Applies To: Office SharePoint Server 2007
A crawler impact rule defines the rate at which the Windows SharePoint Services Help Search service requests documents from a Web site during crawling. The rate can be defined either as the number of documents requested simultaneously or as the delay between requests. In the absence of a crawler impact rule, the number of documents requested simultaneously ranges from 5 through 16, depending on the hardware resources.
You can use crawler impact rules to adjust the load that crawling places on the sites that you crawl.
Site name expressions are evaluated in order, and the first matching rule is applied. Typically, you should therefore list the crawler impact rules in most-specific-to-most-general order. For example, a rule for * must always be the last rule in the list, because it matches every site; any rules listed after it will never apply. If you create a new rule while a crawl is in progress, the new rule takes effect as soon as you save it. You do not need to wait for the crawl to finish, although content that has already been crawled is not subject to the new rule.
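The first-match behavior described above can be illustrated with a minimal Python sketch. It is not the search service's implementation: the rule list, the rate descriptions, and the function name first_matching_rule are hypothetical, and Python's fnmatchcase is used only as an approximation of the * and ? wildcards (it also gives special meaning to square brackets, which this topic does not mention).

```python
from fnmatch import fnmatchcase

# Hypothetical rule list, ordered from most specific to most general.
# The rate descriptions are illustrative values only.
RULES = [
    ("samples.microsoft.com", "8 simultaneous requests"),
    ("*.com", "2 simultaneous requests"),
    ("*", "1 request at a time, wait 5 seconds"),
]


def first_matching_rule(host, rules):
    """Return the first (pattern, setting) pair whose pattern matches host.

    Returns None when no rule matches.
    """
    for pattern, setting in rules:
        # * matches any run of characters; ? matches exactly one character.
        if fnmatchcase(host.lower(), pattern.lower()):
            return pattern, setting
    return None


if __name__ == "__main__":
    for host in ("samples.microsoft.com", "www.contoso.com", "intranet"):
        print(host, "->", first_matching_rule(host, RULES))
```

If the * rule were moved to the top of this list, it would match every host and the two rules below it would never be reached, which is why the general rule belongs last.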
To add, edit, delete, or reorder crawler impact rules, you must first open the Crawler Impact Rules page:
On the top navigation bar, click Application Management.
On the Application Management page, in the Search section, click Manage search service.
On the Manage Search Service page, in the Farm-Level Search Settings section, click Crawler impact rules.
Add a crawler impact rule
On the Crawler Impact Rules page, click Add Rule.
On the Add Crawler Impact Rule page, in the Site box in the Site section, type the URL of the site but exclude the protocol (for example, do not include http://). The following table shows the wildcard characters that you can use in the site name when adding a rule.
* as the site name: Apply the rule to all sites.
*.* as the site name: Apply the rule to sites with dots in the name.
*.site_name.com as the site name: Apply the rule to all sites in the site_name.com domain (for example, *.adventure-works.com).
*.top-level_domain_name as the site name: Apply the rule to all sites that end with a specific top-level domain name (for example, *.com or *.net).
? in the site name: Replace a single character in a rule. For example, *.adventure-works?.com applies to all sites in the domains adventure-works1.com, adventure-works2.com, and so on.
You can create a crawler impact rule for *.com that applies to all Internet sites whose addresses end in .com. For example, an administrator of a portal might add a content source for samples.microsoft.com. The rule for *.com applies to this site unless you add a crawler impact rule specifically for samples.microsoft.com.
In the Request Frequency section, select one of the following options:
Request up to the specified number of documents at a time and do not wait between requests. You can specify the maximum number of requests that the Windows SharePoint Services Help Search service can make at one time to the site. On the Simultaneous requests menu, click the number of simultaneous requests to perform.
Request one document at a time and wait the specified time between requests. You can specify a delay between requests. The search service makes one request to the site at a time and then waits for the specified amount of time before making the next request. In the Time to wait (in seconds) box, type the time to wait between requests. The minimum time to wait between requests is 1 second, and the maximum time is 999 seconds.
If the request rate is too high, the search service can overload some Web sites with requests.
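The constraints in this procedure can be summarized in a short Python sketch. This is an illustration only, not a SharePoint API: the class CrawlerImpactRule, its field names, and the helper minimum_total_wait are hypothetical. The sketch checks that the site excludes the protocol, that exactly one Request Frequency option is chosen, and that the wait time is from 1 through 999 seconds. It also estimates how much time the waits alone add to a crawl.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CrawlerImpactRule:
    """Hypothetical model of the settings on the Add Crawler Impact Rule page."""

    site: str
    simultaneous_requests: Optional[int] = None  # first Request Frequency option
    wait_seconds: Optional[int] = None           # second Request Frequency option

    def __post_init__(self):
        if "://" in self.site:
            raise ValueError("site must not include the protocol (for example, http://)")
        # Exactly one of the two options must be set.
        if (self.simultaneous_requests is None) == (self.wait_seconds is None):
            raise ValueError("choose exactly one request frequency option")
        if self.simultaneous_requests is not None and self.simultaneous_requests < 1:
            raise ValueError("simultaneous requests must be at least 1")
        if self.wait_seconds is not None and not 1 <= self.wait_seconds <= 999:
            raise ValueError("time to wait must be from 1 through 999 seconds")


def minimum_total_wait(rule, documents):
    """Seconds spent waiting between requests, ignoring download time.

    With one request at a time, n documents need n - 1 waits.
    """
    if rule.wait_seconds is None:
        return 0
    return max(documents - 1, 0) * rule.wait_seconds


if __name__ == "__main__":
    rule = CrawlerImpactRule("*.adventure-works.com", wait_seconds=5)
    print(minimum_total_wait(rule, 1000))  # 999 waits of 5 seconds = 4995
```

The estimate shows the trade-off behind the two options: a longer wait reduces the load on the crawled site but lengthens the crawl in proportion to the number of documents.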
Edit a crawler impact rule
On the Crawler Impact Rules page, in the list of rules, click Edit on the menu of the rule that you want to edit.
The settings that you can edit are described in the Add a crawler impact rule section.
Delete a crawler impact rule
On the Crawler Impact Rules page, in the list of rules, click Delete on the menu of the rule that you want to delete.
Reorder crawler impact rules
On the Crawler Impact Rules page, in the list of rules, in the Order column, select the value in the drop-down list that corresponds to the position you want the rule to occupy.
The rule that currently occupies that position is shifted down by one position, along with all the rules below it.
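The effect of reordering can be sketched as a list operation in Python. The function name move_rule and the sample rules are hypothetical. The sketch assumes that each site name appears only once in the list, and that moving a rule to a later position closes the gap it leaves behind, which this topic does not describe explicitly.

```python
def move_rule(rules, rule, new_position):
    """Return a new list with rule placed at the 1-based new_position.

    The rule previously at that position, and the rules below it,
    end up after the moved rule.
    """
    if rule not in rules:
        raise ValueError("rule is not in the list")
    if not 1 <= new_position <= len(rules):
        raise ValueError("position is out of range")
    remaining = [r for r in rules if r != rule]
    remaining.insert(new_position - 1, rule)
    return remaining


if __name__ == "__main__":
    rules = ["samples.microsoft.com", "*.com", "*.adventure-works.com", "*"]
    # Move the more specific adventure-works rule above the general *.com rule.
    print(move_rule(rules, "*.adventure-works.com", 2))
```

In this example, *.com shifts from position 2 to position 3, so the more specific *.adventure-works.com rule is evaluated first.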