Chapter 9 - Administering the Indexing Service

The Indexing Service is used to build catalogs of documents that can be searched. When you add this capability to a World Wide Web site, it allows users to search for topics of interest using a standard Hypertext Markup Language (HTML) form. Like Internet Information Services (IIS), the Indexing Service is integrated into the Microsoft Windows operating system and can be used on intranets, extranets, and the Internet. As the Web administrator, you set up the catalogs the Indexing Service needs, configure content indexing, and manage indexing on a day-to-day basis.

Managing the Indexing Service is much different from managing IIS. Before you can use the Indexing Service, you must do the following tasks:

  1. Install the Indexing Service on the site or virtual server you want to index. The Indexing Service is configured to start manually by default. You'll need to change this so that the Indexing Service starts automatically.

  2. Create a catalog of documents to be searched. Each catalog should be associated with a specific Web site and can optionally be associated with a Network News Transfer Protocol (NNTP) virtual server as well.

  3. Specify the directories and files to be indexed. You specify content indexing options using the Internet Information Services snap-in.

  4. Create a search page on the Web site. This page is used to access the catalog and retrieve information that matches the user's search parameters. The search page must specify the physical location of the catalog using the CiCatalog variable. Other variables are available to configure index searching as well.

Once you install and configure the Indexing Service, the service automatically creates and updates indexes. The service also attempts to manage its catalogs so that the data they contain are consistent and current. Data within catalogs occasionally gets out of sync, and when this happens, you may need to rebuild the catalog or force the Indexing Service to rescan directories for documents that should be indexed. These and other administration tasks are covered in this chapter.

Getting Started with the Indexing Service

The Indexing Service extracts information from designated documents and organizes the results into a catalog that can be searched quickly and easily. The extracted information includes the content (text) within documents as well as document properties, such as the document title and author. To understand how the Indexing Service works, let's look at the following subjects:

  • How you can use and install the Indexing Service

  • How the Indexing Service builds indexes and catalogs

  • How you can search and manipulate indexes

Using the Indexing Service

The Indexing Service indexes the following types of documents:

  • HTML (.htm or .html)

  • ASCII text files (.txt)

  • Microsoft Word documents (.doc)

  • Microsoft Excel spreadsheets (.xls)

  • Microsoft PowerPoint presentations (.ppt)

  • Internet mail and news (when you index NNTP virtual servers)

Other documents for which a document filter is installed can be indexed as well. If the Indexing Service isn't installed on your Web server, you can install it using the Windows Components Wizard. To access and use this wizard, follow these steps:

  1. Log on to the computer using an account with administrator privileges.

  2. Click Start, point to Settings, and then click Control Panel.

  3. Double-click Add/Remove Programs. This displays the Add/Remove Programs dialog box.

  4. Start the Windows Components Wizard by clicking Add/Remove Windows Components.

  5. Select Indexing Service and then click Next to continue. The wizard will then install the Indexing Service.

Once you've installed the Indexing Service, you manage the service using the Indexing Service snap-in for the Microsoft Management Console (MMC) or the Indexing Service node in Computer Management. Regardless of the option you choose, you can work with both local and remote servers using the same techniques. The only task that is different is connecting to remote servers.

With the snap-in, you set the server you want to work with when you add the snap-in to a management console. Here are the steps for adding the Indexing Service snap-in to a management console and selecting a server to work with:

  1. Open the Run dialog box by clicking Start and then clicking Run.

  2. Type mmc in the Open field and then click OK. This opens the Microsoft Management Console (MMC).

  3. In MMC, click Console, and then click Add/Remove Snap-In. This opens the Add/Remove Snap-In dialog box.

  4. On the Standalone tab, click Add.

  5. In the Add Standalone Snap-In dialog box, click Indexing Service, and then click Add.

  6. Select Local Computer to connect to the computer on which the console is running. Or select Another Computer and then type the name of a remote computer.

  7. Click Finish. Afterward, click Close, and then click OK.

With Computer Management, you connect to the local server automatically when you start the utility. You can connect to a different computer by right-clicking the Computer Management node, selecting Connect To Another Computer, and then following the prompts. Figure 9-1 shows the Indexing Service node in Computer Management. As you can see, selecting the Indexing Service node displays an overview of the currently installed catalogs, which include the default System and Web catalogs. The catalog summary provides the following information:

  • Catalog The descriptive name set when the catalog was created

    Figure 9-1: Use the Indexing Service node in Computer Management to manage the Indexing Service.

  • Location The physical location of the catalog, such as D:\Catalogs\WWW\

  • Size (Mb) The size of the catalog in megabytes

    Note: The typical catalog is 25 percent to 40 percent of the total size of the documents indexed. This means that if you index 1 gigabyte (GB) of documents you'll need an additional 250 MB–400 MB of storage space for the associated catalog.

  • Total Docs The total number of documents designated for indexing in this catalog

  • Docs to Index The total number of documents that remain to be indexed

  • Deferred for Indexing The total number of documents that need to be indexed but cannot be indexed because they are in use

    Note: The Indexing Service defers indexing of documents being used and will attempt to index the documents when they are no longer in use.

  • Word Lists The number of word lists associated with the catalog and stored in system memory

  • Saved Indexes The number of saved indexes within the catalog

  • Status The status of the indexing process
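The sizing note in the list above (a typical catalog is 25 to 40 percent of the size of the indexed documents) can be turned into a quick planning calculation. The following is a minimal sketch; the function name is hypothetical, and 1 GB is treated as 1000 MB to match the figures in the note.

```python
def estimate_catalog_size_mb(documents_mb: float) -> tuple[float, float]:
    """Return a (low, high) estimate of catalog size in MB.

    Uses the 25-40 percent rule of thumb from the note above; real
    catalogs may fall outside this range.
    """
    low_percent, high_percent = 25, 40
    return documents_mb * low_percent / 100, documents_mb * high_percent / 100


if __name__ == "__main__":
    low, high = estimate_catalog_size_mb(1000)  # about 1 GB of documents
    print(f"Plan for roughly {low:.0f} MB to {high:.0f} MB of catalog storage")
```

Use the high end of the range when you size the drive that holds the catalog, because merges need working space as well.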

If you access the Indexing Service using Computer Management, you'll find that two default catalogs were created when you installed the service. These catalogs are the following:

  • System The System catalog contains an index of all documents on all hard drives attached to the server.

  • Web The Web catalog contains an index of the default Web site.

    Tip I recommend deleting the System catalog. This catalog typically isn't used on an IIS server, and maintaining the catalog uses system resources that could be better used elsewhere.

You can create additional catalogs at any time. When you create a catalog, you can associate the catalog with a Web site and an NNTP virtual server. The service then uses the indexing settings on the directories associated with the site or virtual server to determine which documents should be indexed. You configure indexing settings on directories using the Internet Information Services snap-in.

Indexing Service Essentials

The Indexing Service stores catalog information in Unicode format. This allows the service to index and query content in multiple languages. The Indexing Service performs three main functions to process document contents:

  • Indexing Indexing is the process of extracting information from documents. The index contains contents from the main body of documents but doesn't include words on any exception word lists associated with the catalog. Indexes are compressed to save space.

  • Catalog building Catalog building is the process of storing the index information in a named location. Catalogs contain extracted content in the form of indexes and stored properties for a set of documents.

  • Merging Merging is the process of combining temporary indexes to create combined or master indexes. Merging indexes improves performance of the Indexing Service and reduces the amount of random access memory (RAM) used to store temporary indexes in memory.

Indexing and catalog building take place automatically in the background when the Indexing Service is running. When first started, the Indexing Service takes an inventory of the directories associated with each catalog to determine which documents should be indexed. This process is referred to as scanning. The Indexing Service can perform two types of scans:

  • Full

  • Incremental

Full scans take a complete look at all documents associated with a catalog. The Indexing Service performs a full scan under the following circumstances:

  • When the service is run for the first time after installation

  • When a folder is added to a catalog

  • As part of recovery if a serious error occurs

  • When you manually choose to do so

Incremental scans only look at documents modified since the last full or incremental scan. The Indexing Service performs incremental scans under the following circumstances:

  • When you start or restart the Indexing Service

  • When you change a local document

  • When the Indexing Service loses change notifications

  • Any time you manually start an incremental scan

    Note: File system change notifications are important parts of the incremental scanning process. Change notifications are generated by the operating system and read by the Indexing Service whenever local documents are modified. In most cases, change notifications for documents on remote systems will not reach the local Indexing Service. To account for this, the Indexing Service periodically performs incremental scans on any remote directories associated with a catalog.
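To make the difference between the two scan types concrete, here is a minimal sketch that models an incremental scan as "find files modified since the last scan" and a full scan as "look at everything." This is only an illustration of the idea; the function names are hypothetical, and the Indexing Service itself relies on change notifications rather than on comparing timestamps this way.

```python
import os


def incremental_scan(root: str, last_scan_time: float) -> list[str]:
    """Return files under root modified after last_scan_time (epoch seconds)."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Only documents changed since the previous scan are selected.
            if os.path.getmtime(path) > last_scan_time:
                changed.append(path)
    return sorted(changed)


def full_scan(root: str) -> list[str]:
    """Return every file under root, regardless of modification time."""
    return incremental_scan(root, float("-inf"))
```

A polling loop built on incremental_scan resembles what the note describes for remote directories, where change notifications usually aren't available.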

After completing a scan of documents to be indexed, the Indexing Service begins to build the necessary catalogs. It does this by reading each document using a document filter. Filters are software components that interpret the structure of a particular kind of document, such as an ASCII text file, a Word document, or HTML document. Using the appropriate filter, the Indexing Service extracts the document contents and property values, storing the property values and the path to the document in the index. Next, the Indexing Service uses the filter to determine the language the document is written in and breaks the document body (content) into individual words. Each supported language has an exception list that provides a list of words that the Indexing Service should ignore.

You'll find exception lists in the %SystemRoot%\System32 directory. These files are stored as ASCII text files and are named Noise.lang, where lang is a three-letter extension that indicates the language of the exception list. You can add entries to or remove entries from the exception list using a standard text editor or word processor.
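The effect of an exception list can be sketched in a few lines. The example below assumes the list is a plain text file of whitespace-separated words, and it uses a simple regular expression in place of the language-specific word breaking the service performs; both function names are hypothetical.

```python
import re


def load_noise_words(text: str) -> set[str]:
    """Parse exception-list text into a set of lowercase words.

    Assumes whitespace-separated entries; the real file layout may differ.
    """
    return {word.lower() for word in text.split()}


def index_words(body: str, noise: set[str]) -> list[str]:
    """Break document text into words and drop those on the exception list."""
    words = re.findall(r"[a-z0-9']+", body.lower())
    return [word for word in words if word not in noise]


if __name__ == "__main__":
    noise = load_noise_words("the\nis\non\n")
    print(index_words("The catalog is on the server", noise))
```

Words removed this way never reach the index, so users can't search for them. Keep that in mind before adding entries to an exception list.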

The Indexing Service also stores values of selected document properties in the property cache. The property cache is a storage place for values of properties that you may want to search on or display in the list of search results. Within the property cache, there are two storage levels: primary and secondary. The primary storage level is for values that are frequently accessed, and, as such, these values are stored in a way that makes them quick and easy to retrieve. The secondary storage level is for additional values that are used infrequently.

After discarding words on the exception list and updating the property cache, the Indexing Service stores the remaining document content in a word list. Each document can have one or more word lists associated with it. Word lists are combined to form temporary indexes called shadow indexes. Shadow indexes are stored on disk in a compressed file format. Multiple shadow indexes can be, and usually are, in the catalog at any given time. Over time, the number of shadow indexes can grow substantially. This occurs as documents are added to and modified within indexed directories.

The Indexing Service uses a process called shadow merge to combine word lists and temporary indexes, thereby reducing the number of temporary resources used and improving the overall responsiveness of the service. Shadow merges occur during scans and as part of the normal housekeeping process implemented by the Indexing Service. The key events that trigger a shadow merge are when there are too many word lists stored in memory (20 by default) or when the total size of all word lists exceeds a preset value (256 KB by default).

The end result of the indexing process is a master index. Each catalog has one, and only one, master index. The master index is created the first time you create a catalog and is kept up to date by periodically merging it with shadow indexes to create a new master index. This process of merging shadow indexes with the master index is called master merge. Once a master merge has occurred, there will be only one saved index associated with a catalog. This index is the master index.
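Conceptually, word lists, shadow indexes, and the master index all map words to the documents that contain them, and a merge combines several of these maps into one. The sketch below models that idea with in-memory dictionaries. It is an assumption-laden illustration only; it says nothing about the compressed on-disk format the Indexing Service actually uses, and the function names are hypothetical.

```python
from collections import defaultdict

Index = dict[str, set[str]]  # word -> paths of documents containing it


def build_word_list(doc_path: str, words: list[str]) -> Index:
    """Create a word list for a single document."""
    return {word: {doc_path} for word in words}


def merge_indexes(*indexes: Index) -> Index:
    """Combine any number of indexes into one, as a merge does."""
    merged: defaultdict[str, set[str]] = defaultdict(set)
    for index in indexes:
        for word, docs in index.items():
            merged[word] |= docs
    return dict(merged)


if __name__ == "__main__":
    a = build_word_list("/docs/a.htm", ["catalog", "server"])
    b = build_word_list("/docs/b.htm", ["catalog", "query"])
    shadow = merge_indexes(a, b)          # shadow merge: word lists combined
    master = merge_indexes({}, shadow)    # master merge: shadow into master
    print(sorted(master["catalog"]))
```

After a merge, a query consults one structure instead of many, which is why merging improves responsiveness.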

Master merges are triggered automatically based on the size of the shadow indexes, the amount of free disk space on the catalog drive, and the number of document changes in indexed directories. A master merge is also scheduled to run nightly at midnight, regardless of these conditions. If necessary, you can force a master merge. The key reason for doing so is to make the Indexing Service update a catalog so that all changes are reflected in search results immediately. As you might imagine, the master merge process is resource-intensive, so you normally wouldn't force a master merge during peak usage hours.

Settings that control scanning, merging, and other Indexing Service processes are found in the Registry and are stored here:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ContentIndex

Registry settings that control scanning and merging include the following:

  • MasterMergeCheckpointInterval Sets the interval for determining whether a master merge should be performed. The default value is 2048 seconds.

  • MasterMergeTime Sets the default time for when a daily master merge should be performed. The default value is 0, meaning zero seconds after the start of a new day.

  • MaxFilesizeFiltered Sets the maximum size of filtered content for a particular document. By default, this is set to 256 KB.

  • MaxFreshCount Sets the maximum number of document updates and changes that triggers a master merge. By default, if more than 20,000 documents are changed, a master merge is triggered.

  • MaxIndexes Sets the maximum number of indexes that should be associated with a catalog before shadow merging is forced. By default, if more than 25 indexes are associated with a catalog, the Indexing Service will perform a shadow merge.

  • MaxShadowIndexSize Sets a maximum size value for shadow indexes in 128 KB increments. Used with MinDiskFreeForceMerge to force master merges when disk space is low and the size of the shadow index exceeds this value. The default is 15 (15 * 128 KB = 1920 KB).

  • MaxWordLists Sets the maximum number of word lists that can exist in a catalog. When this number is exceeded, a shadow merge is triggered. By default, this value is set to 20.

  • MaxWordlistSize Sets the maximum size of all word lists associated with a catalog. This value is set in increments of 128 KB and when exceeded, a shadow merge is triggered. By default, this value is set to 20 (20 * 128 KB = 2560 KB).

  • MinDiskFreeForceMerge Sets a minimum free disk space value. If a drive containing catalogs has less disk space than this value, and the total size used by shadow indexes exceeds MaxShadowIndexSize, the Indexing Service performs a master merge. The default is 15 MB.

  • MinSizeMergeWordlists Sets the minimum size threshold for merging word lists with a shadow index. If the word lists' size exceeds this value, a shadow merge is triggered. The default is 256 KB.
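Several of these values are expressed in 128 KB units, which makes the thresholds easy to misread. The following sketch converts the defaults listed above and applies them as simple checks. It is one reading of the descriptions in this list, not the service's actual decision logic, and the function names are hypothetical.

```python
UNIT_KB = 128  # MaxShadowIndexSize and MaxWordlistSize use 128 KB increments

# Default values as listed above.
DEFAULTS = {
    "MaxFreshCount": 20000,       # changed documents
    "MaxIndexes": 25,             # indexes per catalog
    "MaxShadowIndexSize": 15,     # 128 KB units (1920 KB)
    "MaxWordLists": 20,           # word lists per catalog
    "MaxWordlistSize": 20,        # 128 KB units (2560 KB)
    "MinDiskFreeForceMerge": 15,  # MB
}


def shadow_merge_needed(word_lists: int, word_lists_kb: int, indexes: int,
                        settings: dict[str, int] = DEFAULTS) -> bool:
    """True if any shadow merge threshold described above is exceeded."""
    return (word_lists > settings["MaxWordLists"]
            or word_lists_kb > settings["MaxWordlistSize"] * UNIT_KB
            or indexes > settings["MaxIndexes"])


def master_merge_needed(changed_docs: int, shadow_index_kb: int,
                        free_disk_mb: int,
                        settings: dict[str, int] = DEFAULTS) -> bool:
    """True if the change count or the low-disk condition calls for a merge."""
    low_disk = (free_disk_mb < settings["MinDiskFreeForceMerge"]
                and shadow_index_kb > settings["MaxShadowIndexSize"] * UNIT_KB)
    return changed_docs > settings["MaxFreshCount"] or low_disk
```

For example, with the defaults, master_merge_needed(0, 2000, 10) is true because free space is under 15 MB and the shadow indexes exceed 1920 KB.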

Searching Catalogs

Searching is the process of looking through the catalog to find information. Users can search the catalog in several different ways. The technique most used with Web servers is to build a query form that can be used to search the catalog. The Indexing Service includes a query form for each catalog that can be used to test the installation. You can also create query forms using Active Server Pages (ASPs) and Internet data query (IDQ) files.

With ASP, you create the query form and handle the results using a combination of server-side scripts that use ASP objects, HTML, and client-side scripts. The scripts you use can be written in any installed scripting language, and both Microsoft VBScript and Microsoft JScript are installed by default. Typically, you'll use the same page to implement the query form and display the results once the user has entered search parameters. For example, you could create a page called QUERY.ASP that implements the query form and has an embedded script that submits the search parameters and then formats the search results.

IDQ, on the other hand, is a special language designed for submitting queries to the Indexing Service. With IDQ, you create separate pages for handling each step in the query process. You use the following elements:

  • An HTML page that ends with the .htm or .html extension to implement the query form

  • An IDQ page that ends with the .idq extension to define the fixed query parameters for searches

  • An HTML extension file that ends with the .htx extension to format the results of the query

An advantage of IDQ over ASP is that IDQ queries are much faster and more efficient in their use of Indexing Service resources. Regardless of whether you use ASP or IDQ to handle searches, you must set basic parameters that provide default values for the Indexing Service. The parameters you should set are summarized in Table 9-1.

Note: Most organizations have Web developers whose job it is to create the Web pages needed for searching, handling, and displaying results. As the Web administrator, you assist the development team in setting parameters and publishing the Web pages when they are completed.

Table 9-1 Basic Parameters for the Indexing Service

| Parameter | Description | Sample Value for IDQ |
|---|---|---|
| CiCatalog | Sets the file location of the catalog to be searched. If you don't set this parameter, the Indexing Service searches the Inetpub directory for a default catalog. | CiCatalog = D:\Catalogs\WWW |
| CiFlags | Sets the search flags for the query. The DEEP flag tells the Indexing Service to search all subdirectories within the current scope. | CiFlags = DEEP |
| CiMaxRecordsInResultSet | Sets the maximum number of records to return in the result set. | CiMaxRecordsInResultSet = 100 |
| CiMaxRecordsPerPage | Sets the maximum number of records to return in a single page. | CiMaxRecordsPerPage = 20 |
| CiRestriction | Stores the search values entered by the user as passed from the query form. | CiRestriction = %CiRestriction% |
| CiScope | Sets the scope of the query within the catalog. If scope is set to /, the search begins at the top (or root) of the document tree. | CiScope = /Docs |
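The sample values in Table 9-1 can be assembled into the text of an .idq file. The sketch below does this with a small helper. The function name is hypothetical, the [Query] section header is included on the assumption that your .idq files use it, and a working query file may need other variables that Table 9-1 doesn't list.

```python
def render_idq(params: dict[str, str]) -> str:
    """Render name/value pairs as the text of an .idq file."""
    lines = ["[Query]"]  # assumed section header
    lines += [f"{name} = {value}" for name, value in params.items()]
    return "\n".join(lines) + "\n"


# Sample values taken from Table 9-1.
SAMPLE_PARAMS = {
    "CiCatalog": r"D:\Catalogs\WWW",
    "CiFlags": "DEEP",
    "CiMaxRecordsInResultSet": "100",
    "CiMaxRecordsPerPage": "20",
    "CiRestriction": "%CiRestriction%",
    "CiScope": "/Docs",
}

if __name__ == "__main__":
    print(render_idq(SAMPLE_PARAMS))
```

Generating the file from a single dictionary keeps values such as the catalog path in one place when you maintain several search pages.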

Core Indexing Service Administration

Now that you know how the Indexing Service works, let's look at the core techniques for managing the Indexing Service. In this section, you learn how to specify the resources to index, create catalogs, tune performance, and more.

Setting Web Resources to Index

You configure Web resources for indexing using the Internet Information Services snap-in. Indexing settings can be applied globally or locally. Global settings affect all IIS Web sites that inherit the settings, which means that all indexable files on all sites and in all subdirectories use this setting. To configure global indexing settings, follow these steps:

  1. In the Internet Information Services snap-in, right-click the computer node for the IIS server you want to work with, and then select Properties.

  2. Click Edit on the Master Properties panel and then select the Home Directory tab.

  3. Enable indexing for all Web sites on the server by selecting Index This Resource and then clicking OK. The indexing settings are inherited by all Web sites automatically. The changes also are automatically propagated to all directories within sites.

  4. Disable indexing for all Web sites on the server by clearing Index This Resource and then clicking OK. Before applying these settings, IIS checks the existing settings in use for all Web sites. If a Web site uses a different value, an Inheritance Overrides dialog box is displayed. Use this dialog box to select the sites that should use the new setting, and then click OK.

Local settings can be applied to individual Web sites and directories. With sites, the root folder and all associated directories automatically inherit the site's indexing settings, meaning the indexable files within the root folder and associated directories use this setting. With directories, the selected directory and its subdirectories inherit the directory's indexing settings, meaning the indexable files within the selected directory and subdirectories use this setting.
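One way to reason about inheritance is to treat the effective setting for any directory as the setting of its nearest ancestor that has an explicit value, falling back to the global setting. The sketch below models that rule. It is a simplification for illustration, not how IIS stores or propagates the Index This Resource setting, and the function name and path keys are hypothetical.

```python
def effective_index_setting(path: str, explicit: dict[str, bool],
                            global_setting: bool) -> bool:
    """Return whether files in path are indexed.

    explicit maps site-relative paths such as "/docs" to an Index This
    Resource value set directly on that site root or directory.
    """
    parts = [part for part in path.strip("/").split("/") if part]
    while True:
        key = "/" + "/".join(parts)
        if key in explicit:
            return explicit[key]      # nearest explicit setting wins
        if not parts:
            return global_setting     # nothing set locally; use global value
        parts.pop()                   # move up to the parent directory


if __name__ == "__main__":
    settings = {"/": True, "/docs/private": False}
    print(effective_index_setting("/docs/private/notes", settings, True))
    print(effective_index_setting("/docs/public", settings, True))
```

In this model, clearing the setting on /docs/private excludes that directory and everything beneath it while the rest of the site stays indexed.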

To configure indexing settings for individual sites or directories, follow these steps:

  1. In the Internet Information Services snap-in, right-click the Web site or directory you want to manage, and then select Properties.

  2. Select the Home Directory, Directory, or Virtual Directory tab as appropriate.

  3. Enable indexing for the currently selected resource and all its subdirectories by selecting Index This Resource and then clicking OK. The indexing settings are inherited automatically.

  4. Disable indexing for the currently selected resource and all its subdirectories by clearing Index This Resource and then clicking OK. The indexing settings are inherited automatically.

Viewing and Creating Catalogs

Catalogs are created and managed at the site level. Each site that you want to index should have a catalog. A site can have multiple catalogs. For example, you could create a catalog for indexes of your product directories and another catalog for indexes of your services directories.

Each catalog you create should be created on a local file system and stored in a separate folder from other catalogs. To help manage multiple catalogs, you could create a top-level directory called Catalogs and then create subdirectories within this directory for each catalog you want to create. The catalog directory must be created before you create the catalog.
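If you manage many catalogs, you can script the folder layout described above. The following is a minimal sketch that creates a top-level directory with one subdirectory per catalog. The function name and the catalog names in the usage note are hypothetical, and the script only creates folders; you still create each catalog through the Indexing Service as described in the steps that follow.

```python
from pathlib import Path


def create_catalog_folders(top_level: Path, catalog_names: list[str]) -> list[Path]:
    """Create one folder per catalog under top_level and return the paths."""
    created = []
    for name in catalog_names:
        folder = top_level / name
        # parents=True also creates the top-level Catalogs directory if needed.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

For example, create_catalog_folders(Path(r"D:\Catalogs"), ["Products", "Services"]) would prepare D:\Catalogs\Products and D:\Catalogs\Services on the local file system.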

You create a catalog for a site by completing the following steps:

  1. Start Computer Management, and then expand the Services And Applications node by clicking the plus sign (+) next to it.

    Note: When first accessed, Computer Management automatically connects to the local system. You can connect to a different computer by right-clicking the Computer Management node, selecting Connect To Another Computer, and then following the prompts. Keep in mind that you cannot add a catalog to a remote computer if the default administration shares on the remote computer have been removed.

  2. Right-click the Indexing Service node, point to New, and then click Catalog. This displays the Add Catalog dialog box shown in Figure 9-2.

    Figure 9-2: Use the Add Catalog dialog box to create a new catalog on the server.

  3. In the Name field, type the name of the catalog.

  4. In the Location field, type the complete file path to the catalog folder, or click Browse and then select the folder in which you want the catalog to be located.

  5. Click OK. After you create a catalog, you must stop and then restart the Indexing Service to populate the catalog with indexes.

Viewing Indexing Status

Indexing should be periodically monitored to make sure catalogs are being maintained. One of the values you can use to keep track of the Indexing Service is the indexing status. As Table 9-2 shows, the indexing status tells you the current state of the indexing engine. If users are experiencing problems retrieving search results, the Indexing Service may be paused or stopped, a merge may be in progress, or the service may be rescanning catalogs. Typically, you'll see an indexing state followed by the keyword Started. The Started keyword is a reference to the state of the Indexing Service itself. In this case, the service is active.

Table 9-2 Quick Reference for Indexing Service Status Conditions

| Status | Description |
|---|---|
| (Blank) | Indexing Service is stopped and must be started to resume indexing. |
| Indexing Paused (High I/O) | Indexing is paused due to a high level of input/output (I/O) activity. You may want to close some applications to reduce the I/O activity. |
| Indexing Paused (Low Memory) | Indexing is paused because of low virtual memory. You may want to close some applications to make more memory available. |
| Indexing Paused (Power Management) | Indexing is paused to save battery power. Typically only seen on laptop systems. |
| Indexing Paused (User Active) | Indexing is paused to minimize interference with user activity. Users may be working with a large number of files that the indexer needs, or an administrator may be making changes to the Indexing Service configuration in Computer Management. |
| Master Merge (Paused) | Master merge is paused because of low resource availability. You may have a problem with the amount of memory, file space, or throughput of the system. |
| Merge | A merge is in progress. Merging is resource-intensive and may cause a temporary performance problem on the system. |
| Query Only | Indexing Service is started and is available only for querying. |
| Recovering | Indexing Service is recovering the catalog from an abrupt shutdown. |
| Scan Required | One or more documents have been added or modified within directories of this catalog. The indexer should perform a scan automatically. If it doesn't, check the Windows Event log. |
| Scanning | One or more directories are being scanned for newly added or modified documents. |
| Scanning (NTFS) | One or more NTFS volumes are being scanned for new or modified documents. |
| Started | Indexing Service for this catalog has started. |
| Starting | Indexing Service is in the process of starting. |
| Stopped | Indexing of the catalog has been stopped. |

You can view Indexing Service status conditions by completing the following steps:

  1. Start Computer Management, and then expand the Services And Applications node by clicking the plus sign (+) next to it.

  2. Select the Indexing Service node in the left pane. The right pane displays the status conditions for each individual catalog. Keep in mind each catalog can have a different status condition.
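If you record the status values for later review, a small helper can reduce each one to a few yes/no answers. The sketch below assumes a combined status is written as a comma-separated string such as "Merge, Started". That separator is an assumption about how the value is displayed, and the function name and the grouping of "busy" states are hypothetical.

```python
# States from Table 9-2 that suggest search results may be slow or incomplete.
BUSY_STATES = {"Merge", "Recovering", "Scan Required", "Scanning", "Scanning (NTFS)"}


def summarize_status(status: str) -> dict[str, bool]:
    """Summarize a catalog status string as started/paused/busy flags."""
    parts = [part.strip() for part in status.split(",") if part.strip()]
    return {
        "started": "Started" in parts,
        "paused": any("Paused" in part for part in parts),
        "busy": any(part in BUSY_STATES for part in parts),
    }


if __name__ == "__main__":
    print(summarize_status("Merge, Started"))
    print(summarize_status(""))  # blank status: the service is stopped
```

A blank status produces all false values, which matches the first row of Table 9-2: the service is stopped and must be started.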

Starting, Stopping, and Pausing the Indexing Service

The Indexing Service can be started, stopped, and paused like any other service. Users can perform queries and obtain results only when the Indexing Service is running. Users will not be able to obtain query results when the Indexing Service is stopped or paused.

You can manage the Indexing Service by completing the following steps:

  1. Start Computer Management, and then expand the Services And Applications node by clicking the plus sign (+) next to it.

  2. Select the Indexing Service node in the left pane. The right pane displays the status conditions for each individual catalog.

  3. Right-click the Indexing Service node in the left pane. You can now do the following:

    • Select Start to start the Indexing Service.

    • Select Stop to stop the Indexing Service.

    • Select Pause to pause the Indexing Service. After you pause indexing, click Start to resume normal operations.

    Note: Whenever you stop and then restart indexing, the Indexing Service performs an incremental scan of all the catalogs associated with all the sites on the server.

Setting Indexing Service Properties

The Indexing Service has several properties that can be configured to customize the way indexing works. These properties are summarized in Table 9-3. As with most other Indexing Service properties, you can set these values globally or locally. Global property settings are inherited by all catalogs unless you override the global settings.

Table 9-3 Configurable Properties for the Indexing Service

| Property Tab | Property | Description |
|---|---|---|
| Generation | Index Files With Unknown Extensions | Specifies whether the Indexing Service indexes files with unregistered extensions. These files are indexed by default, which could slow down the indexing process if you have a large number of files with unregistered extensions. |
| Generation | Generate Abstracts | Specifies whether the Indexing Service generates abstracts for files found in a search and returns them with the results. Abstracts contain key information gathered from documents that match the search parameters. Abstracts are generated by default. |
| Generation | Maximum Size | Sets the maximum number of characters in the abstracts returned with a search. The default value is 320. The range of permitted values is from 10 to 10,000. Keep in mind this property is only available when you enable Generate Abstracts. |
| Tracking | Add Network Shares Automatically | Specifies whether the Indexing Service automatically uses network share names as aliases for shared network drives. If you don't select this option, you must manually configure aliases for each network share you want to index, as described in the "Adding Physical Directories to a Catalog" section of this chapter. |

You configure global property settings by completing the following steps:

  1. Start Computer Management, and then expand the Services And Applications node by clicking the plus sign (+) next to it.

  2. Right-click the Indexing Service node and then select Properties.

  3. On the Generation tab, you set properties that control the way indexing and search results are handled. Set or clear these properties as appropriate.

  4. On the Tracking tab, you set properties for tracking network shares. Set or clear the related property as appropriate.

  5. Click OK. If you want catalogs to inherit these values, check the properties of each catalog to ensure that Inherit Above Settings From Service is selected as appropriate on the Generation and Tracking tabs.

Individual catalogs can inherit or override the global settings. To perform these tasks, complete the following steps:

  1. Start Computer Management. Expand the Services And Applications node by clicking the plus sign (+) next to it, and then expand the Indexing Service node.

  2. You should see a list of catalogs configured on the server. Right-click the catalog you want to work with and then select Properties.

  3. On the Generation tab, you set properties that control the way indexing and search results are handled. If you want the catalog to inherit the global settings, select Inherit Above Settings From Service. Otherwise, clear this check box and then change the properties as necessary.

  4. On the Tracking tab, you set properties for tracking network shares. If you want the catalog to inherit the global settings, select Inherit Above Settings From Service. Otherwise, clear this check box and then change the properties as necessary.

  5. Click OK.
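The inherit-or-override behavior, together with the permitted range for Maximum Size, can be modeled as follows. This is a sketch for planning or documenting settings, not an interface to the Indexing Service; the class, field, and function names are hypothetical, and the defaults come from Table 9-3.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GenerationSettings:
    """Generation tab values, with the defaults given in Table 9-3."""
    index_unknown_extensions: bool = True
    generate_abstracts: bool = True
    max_abstract_size: int = 320

    def __post_init__(self) -> None:
        # Maximum Size applies only when Generate Abstracts is enabled.
        if self.generate_abstracts and not 10 <= self.max_abstract_size <= 10_000:
            raise ValueError("Maximum Size must be between 10 and 10,000")


def effective_settings(service: GenerationSettings,
                       catalog_override: Optional[GenerationSettings] = None
                       ) -> GenerationSettings:
    """Return the service settings unless the catalog overrides them.

    Passing None models a catalog with Inherit Above Settings From
    Service selected.
    """
    return service if catalog_override is None else catalog_override
```

For example, a catalog that needs longer abstracts could pass GenerationSettings(max_abstract_size=500) as its override while other catalogs continue to inherit the service values.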

Optimizing Indexing Service Performance

You can optimize the Indexing Service performance based on expected usage. You do this by controlling the way the Indexing Service manages the indexing and querying processes. Each process has different performance settings that can be set by using fixed or custom optimization values. For indexing, the performance options are the following:

  • Lazy: The Indexing Service minimizes the amount of system resources reserved for indexing. Additionally, the Indexing Service doesn't immediately respond to change notification requests from the operating system, and consequently reduces the frequency of scanning. Best for environments in which documents are updated or modified infrequently.

  • Moderate: The Indexing Service reserves the normal amount of system resources for indexing and attempts to handle change notification requests in a timely manner. This is the default setting. Best for the typical environment in which changes are made daily to documents configured for indexing.

  • Instant: The Indexing Service reserves additional system resources for indexing and aggressively responds to change notification requests, which means more frequent scanning for new and changed documents. As a result, document changes and additions appear quickly in catalogs. Best for environments in which documents are changing rapidly and in which you need to reflect the changes quickly.

For querying, the performance options are the following:

  • Low Load: The Indexing Service reduces the amount of system resources reserved for querying. Therefore, the Indexing Service can handle only a limited number of simultaneous queries. Best for environments in which queries are infrequent. If the number of queries increases too much, the responsiveness of the service will be poor.

  • Moderate Load: The Indexing Service reserves the normal amount of system resources for querying, and attempts to handle multiple simultaneous requests. This is the default setting. Best for the typical environment in which users are regularly performing queries, and you want to handle them appropriately.

  • Heavy Load: The Indexing Service reserves additional system resources for querying and is able to handle a larger than usual number of simultaneous requests. Best when you need to handle a large number of queries and don't care if the Indexing Service uses more memory and central processing unit (CPU) time than usual.

You can optimize the Indexing Service performance by completing the following steps:

  1. Start Computer Management, and then expand the Services And Applications node by clicking the plus sign (+) next to it.

  2. Select the Indexing Service node in the left pane. The right pane displays the status conditions for each individual catalog.

  3. Right-click the Indexing Service node in the left pane and then select Stop.

  4. Right-click the Indexing Service node again, point to All Tasks, and then select Tune Performance. The Indexing Service Usage dialog box shown in Figure 9-3 is displayed.

    Figure 9-3: Use the Indexing Service Usage dialog box to optimize indexing and querying.

    You can set a fixed or custom optimization value. To set fixed values, select one of the following options on the Indexing Service Usage dialog box:

    • Dedicated Server: Sets instant indexing and heavy load querying options.

    • Used Often, But Not Dedicated To This Service: Sets lazy indexing and moderate load querying options.

    • Used Occasionally: Sets lazy indexing and low load querying options.

    • Never Used: Disables the Indexing Service (as if you had disabled it from the Services node). Once this option is selected, the Indexing Service remains stopped until you re-enable it manually.

    To set a custom optimization value, select the Customize option and then click the Customize button. As shown in Figure 9-4, you can do the following:

    • Use the Indexing slider to configure indexing as Lazy, Moderate, or Instant. The moderate value is the middle option, and it isn't labeled.

    • Use the Querying slider to configure query handling as Low Load, Moderate Load, or Heavy Load. The moderate value is the middle option, and it isn't labeled.

  5. Click OK. Click OK again to save your settings and return to the Computer Management snap-in.

    Figure 9-4: You can customize the way indexing and querying are performed using the Desired Performance dialog box.
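
As a quick reference, the fixed options in step 4 of the procedure above can be written as a lookup table that pairs each option with its indexing and querying settings. The Python below is just a restatement of that list for illustration; USAGE_PRESETS and describe_preset are hypothetical names, not part of any Indexing Service interface.

```python
# Fixed usage options mapped to (indexing, querying) settings.
# Identifiers are hypothetical; the pairings come from the step 4 list.
USAGE_PRESETS = {
    "Dedicated Server": ("Instant", "Heavy Load"),
    "Used Often, But Not Dedicated To This Service": ("Lazy", "Moderate Load"),
    "Used Occasionally": ("Lazy", "Low Load"),
    "Never Used": None,  # the Indexing Service is disabled
}

def describe_preset(name):
    """Return a one-line summary of what a fixed option configures."""
    options = USAGE_PRESETS[name]
    if options is None:
        return f"{name}: Indexing Service disabled"
    indexing, querying = options
    return f"{name}: indexing={indexing}, querying={querying}"

for preset in USAGE_PRESETS:
    print(describe_preset(preset))
```

Any combination that isn't in this table, such as Moderate indexing with Heavy Load querying, requires the Customize option.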

Managing Catalogs

The Indexing Service stores all of the information you are indexing in catalogs. Catalogs contain the extracted contents from the main body of documents as well as metadata that describes the document and its properties. During the catalog creation process, you specify which Web site you want to associate the catalog with. Once you create a catalog for a Web site, users can search it using a Web-based query form.

Catalogs are maintained automatically by the Indexing Service and are updated through the scan and merge processes. You can manually control catalogs as well by starting, stopping, or pausing the update monitor for the catalog. You can also force the Indexing Service to merge separate indexes into the master to improve the overall performance and responsiveness of the Indexing Service.

Viewing Catalog Properties and Directories Being Indexed

Each catalog configured on the server has a separate set of properties that you can manage. These properties control the tracking of network shares, the generation of document abstracts, and the indexing configuration. You can configure catalogs to have unique property settings or to inherit global properties from the Indexing Service.

Catalogs can be associated with a Web site, an NNTP site, and one or more external directories. External directories can include local and remote resources. When you associate a catalog with a Web or NNTP site, you use the Internet Information Services snap-in to specify which resources are indexed. When you associate a catalog with a network share, you can elect to index the directory when you add it to the catalog.

To view the current property settings for a catalog as well as the directories that are currently being indexed, follow these steps:

  1. Start Computer Management. Expand the Services And Applications node by clicking the plus sign (+) next to it, and then expand the Indexing Service node.

  2. You should see a list of catalogs configured on the server. Expand a catalog node by clicking the plus sign (+) next to it. Select the Directories node in the left pane to display a list of external directories associated with a catalog in the right pane.

  3. If you want to view the properties of a catalog, right-click the catalog you want to work with and then select Properties. This displays a Properties dialog box that you can use to view or set properties.

Adding Physical Directories to a Catalog

You can add external directories to a catalog so that they are indexed along with the content of a Web or NNTP site. These external directories can be on the local file system or on a remote file system. If Add Network Share Alias Automatically isn't selected on the Tracking tab, you must manually configure aliases for each network share you want to index.

To add an external directory to a catalog, follow these steps:

  1. Start Computer Management. Expand the Services And Applications node by clicking the plus sign (+) next to it, and then expand the Indexing Service node.

  2. You should see a list of catalogs configured on the server. Right-click the catalog you want to work with, point to New, and then select Directory. This displays the Add Directory dialog box shown in Figure 9-5.

    Figure 9-5: You can add physical directories to a catalog and map them to aliases using this dialog box.

  3. In the Path field, type the complete file path to the directory you want to index. If you do not know the directory path, click Browse to search for the directory.

  4. If you are configuring indexing for a network share, type the network share alias that you want to use for this directory in the Alias (UNC) field. This alias should be in Uniform Naming Convention (UNC) format and is returned in the search results sent to clients. For example, you could set the Alias \\myserver\data to map to the actual network share path \\Galileo\reports\fy2001.

    Tip: When you work with remote systems, you must allow the Indexing Service to map administrative shares. If it cannot map administrative shares, the Indexing Service will not be able to index the content.

  5. If you are configuring indexing for a network share, you can also set the User Name and Password that the Indexing Service can use to authenticate on the remote system.

  6. Next, specify whether the directory should be included in the catalog index. Select Yes to include the directory, or select No to exclude it from the index.

  7. Click OK.
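
To make the role of the alias concrete, the following Python sketch shows the kind of prefix substitution implied by the example in step 4, where search results report the alias rather than the actual share path. The apply_alias helper is hypothetical and is not how the Indexing Service itself is implemented; it assumes that share paths are compared without regard to case.

```python
# Hypothetical helper: report an indexed file under its alias
# instead of the actual network share path.
def apply_alias(path, share, alias):
    """Replace the leading share prefix with the alias (case-insensitive)."""
    share = share.rstrip("\\")
    alias = alias.rstrip("\\")
    lowered = path.lower()
    prefix = share.lower()
    if lowered == prefix:
        return alias
    if lowered.startswith(prefix + "\\"):
        return alias + path[len(share):]
    return path  # not under the share, so leave it unchanged

share = r"\\Galileo\reports\fy2001"
alias = r"\\myserver\data"
print(apply_alias(r"\\Galileo\reports\fy2001\q1\summary.doc", share, alias))
```

Matching on the share path followed by a backslash keeps a sibling directory whose name merely begins with the same characters from being rewritten by mistake.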

Forcing Full and Incremental Directory Rescans

The Indexing Service watches for change notification requests from the operating system to determine if files have been added to or changed within directories set for indexing. When a request is received, the Indexing Service schedules the related directory for an incremental scan. At times, the Indexing Service may lose change notifications. This can happen during periods of high I/O or CPU activity, when the Indexing Service may not be able to keep up with the change notifications. It can also happen when the Indexing Service is unable to receive change notifications for directories on remote systems.

Typically, you can identify a problem with scanning by searching for documents that have been updated recently or added to an indexed directory. If the search results do not contain references to these documents, you may need to force a full or incremental rescan. You can do this only at the external directory level.

To force a directory rescan of an external directory, follow these steps:

  1. Start Computer Management. Expand the Services And Applications node by clicking the plus sign (+) next to it, and then expand the Indexing Service node.

  2. You should see a list of catalogs configured on the server. Double-click the catalog you want to work with, and then select the related Directories node.

  3. In the right pane, you should see a list of external directories configured for the catalog. Right-click the directory you want to work with, point to All Tasks, and then select Rescan (Full) or Rescan (Incremental) as appropriate.

  4. When prompted, confirm the action by clicking Yes. Keep in mind that rescans of directories with a large number of documents can be resource-intensive. This means you'll use additional CPU, memory, and file I/O resources during the rescan.
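
One simplified way to picture the difference between the two rescan types is shown below. The sketch rests on an assumption that is not spelled out in this chapter: that an incremental rescan picks up only files that are new or whose last-modified time differs from what was recorded, whereas a full rescan processes every file in the directory. The files_to_index function and the sample data are hypothetical.

```python
# Toy model of rescan selection; not the Indexing Service's actual logic.
def files_to_index(current, indexed, full=False):
    """current and indexed map a file path to its last-modified timestamp."""
    if full:
        # Full rescan: every file in the directory is processed again.
        return sorted(current)
    # Incremental rescan: only new files or files whose timestamp changed.
    return sorted(p for p, mtime in current.items() if indexed.get(p) != mtime)

indexed = {"a.doc": 100, "b.doc": 100}                 # state recorded earlier
current = {"a.doc": 100, "b.doc": 250, "c.doc": 300}   # directory right now

print(files_to_index(current, indexed))             # incremental
print(files_to_index(current, indexed, full=True))  # full
```

Under this model, a full rescan does strictly more work, which is consistent with the warning in step 4 about resource usage on large directories.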

Starting, Stopping, and Pausing Individual Catalogs

When you need to perform a large number of updates to directories monitored by a catalog, it is a good idea to temporarily pause or stop the catalog. Pausing or stopping the catalog tells the Indexing Service that it shouldn't handle change notification requests for this catalog. The difference between pausing and stopping a catalog is important. When you stop a catalog, the Indexing Service stops both indexing and querying activities, meaning that the related directories are no longer indexed and that users cannot search the catalog. When you pause a catalog, the Indexing Service stops indexing but still allows current queries to be completed.

To start, stop, or pause a catalog, complete the following steps:

  1. Start Computer Management. Expand the Services And Applications node by clicking the plus sign (+) next to it, and then expand the Indexing Service node.

  2. Right-click the catalog you want to work with, point to All Tasks, and then select Start, Pause, or Stop as appropriate.

    Note: The Indexing Service automatically performs an incremental scan when you stop and then restart a catalog. This ensures that updated or new documents are indexed as appropriate.
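
The difference between the three catalog states can be summarized as a small state model. The Python below is a sketch for illustration only; CatalogState, allows_indexing, and queries_can_complete are hypothetical names that simply encode the behavior described in this section.

```python
from enum import Enum

# Hypothetical model of the catalog states described in this section.
class CatalogState(Enum):
    STARTED = "started"
    PAUSED = "paused"
    STOPPED = "stopped"

def allows_indexing(state):
    """Only a started catalog handles change notifications and indexes files."""
    return state is CatalogState.STARTED

def queries_can_complete(state):
    """A paused catalog still lets queries finish; a stopped catalog does not."""
    return state in (CatalogState.STARTED, CatalogState.PAUSED)

for state in CatalogState:
    print(state.value, allows_indexing(state), queries_can_complete(state))
```

In practice, this means pausing is usually the less disruptive choice during bulk updates, because users aren't cut off from the catalog.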

Merging Catalogs

As the Indexing Service updates the catalog, it creates temporary indexes, called shadow indexes, which extend the master index. These shadow indexes reflect the changes within catalog directories. Over time, the number of shadow indexes can grow substantially, and this is reflected in the number of Saved Indexes associated with a catalog. Because shadow indexes contain additional pointers and information, they use more space than a fully merged master index. As the number of shadow indexes grows, the responsiveness of queries against the catalog can slow.

You can improve the responsiveness of the Indexing Service and reduce storage space usage by merging the temporary indexes with the master index. To perform this task, complete the following steps:

  1. Start Computer Management. Expand the Services And Applications node by clicking the plus sign (+) next to it, and then expand the Indexing Service node.

  2. Right-click the catalog you want to work with, point to All Tasks, and then select Merge.

  3. When prompted to confirm the action, click Yes. As with rescanning, the merge process can be resource-intensive, and you may temporarily reduce the responsiveness of the Indexing Service. The net gain, however, is that once merging is completed, the Indexing Service should be more responsive to user queries.
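
To see why merging helps, consider a toy model in which each index maps a word to the set of documents that contain it. This is a deliberately simplified sketch with hypothetical names (merge_indexes, search), not the Indexing Service's storage format; in particular, it ignores deleted documents. The point is only that a query must consult every saved index before the merge, and just one afterward.

```python
# Toy inverted-index model; not the Indexing Service's on-disk format.
def merge_indexes(master, shadows):
    """Fold shadow indexes (word -> set of documents) into a copy of the master."""
    merged = {word: set(docs) for word, docs in master.items()}
    for shadow in shadows:
        for word, docs in shadow.items():
            merged.setdefault(word, set()).update(docs)
    return merged

def search(word, indexes):
    """Consult every saved index and combine the matching documents."""
    hits = set()
    for index in indexes:
        hits |= index.get(word, set())
    return hits

master = {"budget": {"fy2000.doc"}}
shadows = [{"budget": {"fy2001.doc"}}, {"forecast": {"fy2001.doc"}}]

before = [master] + shadows               # three saved indexes to consult
after = [merge_indexes(master, shadows)]  # a single merged master index

print(search("budget", before) == search("budget", after))
print(len(before), "indexes before the merge,", len(after), "after")
```

The search results are the same either way; what changes is how many indexes each query has to visit.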

Specifying Web or NNTP Sites to Include in Catalogs

Each catalog can be associated with one Web site and one NNTP site. After you associate a site with a catalog, you can use the Internet Information Services snap-in to specify the resources that should be indexed. You specify the site to include in a catalog by completing these steps:

  1. Start Computer Management. Expand the Services And Applications node by clicking the plus sign (+) next to it, and then expand the Indexing Service node.

  2. Right-click the catalog you want to work with, and then select Properties. Click the Tracking tab.

    As shown in Figure 9-6, you can now take one of the following actions:

    • Use the WWW Server selection menu on the Tracking tab to specify the Web site that you want to associate with a catalog.

    • Use the NNTP Server selection menu on the Tracking tab to specify an NNTP site that you want to associate with a catalog.

  3. Click OK.

    Figure 9-6: Specify the site to index on the Tracking tab.

Testing Catalogs with Queries

After you configure a catalog for indexing, you should query the catalog to ensure you get the expected results. The Indexing Service has a built-in query form to perform this task. To access this form and enter a query, follow these steps:

  1. Start Computer Management. Expand the Services And Applications node by clicking the plus sign (+) next to it, and then expand the Indexing Service node.

  2. You should see a list of catalogs configured on the server. Double-click the catalog you want to work with, and then select Query The Catalog in the left pane.

  3. As shown in Figure 9-7, type the query you want to use in the field labeled Enter Your Free Text Query Below, and then click Search. If indexing is configured correctly, the Indexing Service should display search results. Then click on a document title or path entry to ensure that documents can be accessed from the results page. If you experience problems with either of these procedures, you should check the indexing configuration.

    Figure 9-7: After you configure indexing, check the configuration using the predefined query form.
