Monitoring the FAST Search Web crawler log files

 

Applies to: FAST Search Server 2010

In a single-server deployment, the main FAST Search Web crawler log is located at <FAST Search Server 2010 for SharePoint installation folder>\var\log\crawler\node\crawler.log. This log contains overall status information and exceptional conditions, for example when the FAST Search Web crawler starts or stops, when it starts or stops storage compaction, and general feeding information. When debugging an issue, look in this log first.
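
Because this log is the first place to look, it can help to print only its most recent entries. The following is a minimal Python sketch, not part of the product: the helper names `crawler_log_path` and `tail` are hypothetical, the installation folder is passed in by the caller, and the log is assumed to be readable as text (undecodable bytes are replaced rather than raising an error).

```python
from collections import deque
from pathlib import Path


def crawler_log_path(install_dir):
    """Build the path to the main single-server crawler log.

    install_dir is the FAST Search Server 2010 for SharePoint
    installation folder, supplied by the caller.
    """
    return Path(install_dir) / "var" / "log" / "crawler" / "node" / "crawler.log"


def tail(path, count=50):
    """Return the last `count` lines of a text file, oldest first."""
    # Assumption: the log is plain text. The encoding is not stated in the
    # documentation, so undecodable bytes are replaced instead of failing.
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # A bounded deque keeps only the newest `count` lines in memory.
        return [line.rstrip("\r\n") for line in deque(handle, maxlen=count)]
```

For example, `tail(crawler_log_path(r"C:\FASTSearch"), 20)` would return the 20 newest lines, where `C:\FASTSearch` stands in for the actual installation folder.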

The Browser Engine log is located at <FAST Search Server 2010 for SharePoint installation folder>\var\log\browserengine\BrowserEngine.log. This log contains overall status messages and URL processing information.

Additional FAST Search Web crawler logs, which provide per-collection information, are located in the <FAST Search Server 2010 for SharePoint installation folder>\var\log\crawler\node\ folder. Each folder there represents a log type and contains one subfolder per crawl collection. The number of log folders depends on what the crawl collection is configured to report. A collection can contain these log folders:

| Log type | Description |
| --- | --- |
| dns | Contains log files from DNS resolutions. |
| dsfeed | Contains log files of the status of each URL submitted to Web item processing. Deletes are also logged. This log is useful for finding out why a Web item was not indexed even though it was crawled. |
| fetch | Contains log files for every URL fetched by the Web crawler. |
| header | Contains logs of HTTP request/response exchanges, separated into directories by Web site name. The header log is disabled by default. This log can be useful for analyzing a forms login session. |
| PP | Contains log files from Post Process. Post Process performs duplicate detection of downloaded Web items and processes content. The Post Process log contains the URL and referrer URL of every unique Web item, together with its size, MIME type, and the URLs of any duplicates found. |
| screened | Contains log files of all URLs processed by the Web crawler, with details on whether or not each URL will be crawled. Use this log to find out why a Web item was excluded from a crawl. |
| site | Contains log files of Web crawler events per Web site. The site log records when the Web crawler starts or stops crawling a Web site, when it refreshes the Web site, and when it becomes idle on the Web site. |
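
The folder layout described above (one folder per log type, one subfolder per crawl collection) can be explored with a short script. The following Python sketch is illustrative only: the function names `collections_by_log_type` and `find_url` are hypothetical, and because the line format of the log files is not described here, the search is a plain substring match rather than a parser.

```python
from pathlib import Path


def collections_by_log_type(node_log_dir):
    """Map each log-type folder (dns, fetch, ...) to its collection subfolders."""
    result = {}
    for log_type_dir in sorted(Path(node_log_dir).iterdir()):
        if not log_type_dir.is_dir():
            continue  # skip plain files such as crawler.log
        result[log_type_dir.name] = sorted(
            child.name for child in log_type_dir.iterdir() if child.is_dir()
        )
    return result


def find_url(node_log_dir, log_type, collection, url):
    """Yield (file, line_number, line) for every log line that mentions url.

    Assumption: log files are text; the line format is not parsed.
    """
    folder = Path(node_log_dir) / log_type / collection
    # rglob also covers nested folders, such as the per-site folders
    # of the header log.
    for log_file in sorted(p for p in folder.rglob("*") if p.is_file()):
        with open(log_file, "r", encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if url in line:
                    yield log_file, number, line.rstrip("\r\n")
```

To investigate why a Web item was excluded or not indexed, one could call `find_url` with the `screened` or `dsfeed` log type, the collection name, and the URL in question.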

In a multi-node FAST Search Web crawler deployment, the duplicate server log is located at <FAST Search Server 2010 for SharePoint installation folder>\var\log\crawler\node\dupserver.log. This log contains overall status information and exceptional conditions for the duplicate server.

On the multi-node scheduler, the FAST Search Web crawler log is located at <FAST Search Server 2010 for SharePoint installation folder>\var\log\crawler\multinode.log. On the node schedulers, the log is located at var\log\crawler\node.log.
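
The top-level log locations named in this article can be collected in a small lookup table, which is convenient when scripting checks across several servers. This Python sketch is an illustration under stated assumptions: the role keys and the `log_path` helper are hypothetical names, and the node scheduler path is resolved against the installation folder, which the article implies but does not state outright.

```python
from pathlib import PureWindowsPath

# Relative log locations as listed in this article. The dictionary keys are
# hypothetical labels chosen for this sketch, not product terminology.
CRAWLER_LOGS = {
    "single_server": ("var", "log", "crawler", "node", "crawler.log"),
    "browser_engine": ("var", "log", "browserengine", "BrowserEngine.log"),
    "duplicate_server": ("var", "log", "crawler", "node", "dupserver.log"),
    "multinode_scheduler": ("var", "log", "crawler", "multinode.log"),
    # Assumption: this path is relative to the installation folder.
    "node_scheduler": ("var", "log", "crawler", "node.log"),
}


def log_path(install_dir, role):
    """Return the Windows path of the log for the given role.

    Raises KeyError if role is not one of the keys in CRAWLER_LOGS.
    """
    return PureWindowsPath(install_dir, *CRAWLER_LOGS[role])
```

`PureWindowsPath` is used so that the paths are formatted with backslashes even when the script runs on a different operating system, for example on an administration workstation.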