Edit a noise word file (Office SharePoint Server)
Updated: December 11, 2008
Applies To: Office SharePoint Server 2007
A noise word is a word that is not useful in a search, such as “the” or “an”. Noise word files, sometimes called “stop word” files, list the words to be excluded or ignored when a user runs a query. These lists might include words that are irrelevant to the search, such as conjunctions, articles, adjectives, and adverbs, as well as common names and offensive or inappropriate words.
Understanding noise word files
A list of noise words for a language is stored in the noise word file for that language. If a noise word list does not exist for a language, Microsoft Office SharePoint Server 2007 uses the neutral noise word file, noiseneu.txt. The word breaker for a given language identifies individual words by determining where word boundaries exist based on the lexical rules of the language. When the word breaker for a particular language encounters words at indexing time or query time, it removes the words that are listed in the noise word file. For more information about which languages have noise word files, see the "List of noise word files by language" section.
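The filtering described above can be illustrated with a minimal Python sketch. It is not how the SharePoint word breaker is implemented: the regular expression is only a crude stand-in for a language-specific word breaker, the function names are hypothetical, and the file is assumed to hold whitespace-separated entries.

```python
import re

def load_noise_words(text):
    """Parse noise word file contents into a set.

    Assumption: entries are separated by whitespace (for example, one per line).
    """
    return {word.lower() for word in text.split()}

def strip_noise_words(query, noise_words):
    """Split the query into words and drop every word found in noise_words.

    The regex split is a simplified stand-in for a real word breaker.
    """
    tokens = re.findall(r"\w+", query.lower())
    return [token for token in tokens if token not in noise_words]

noise = load_noise_words("the\nan\na\nof\n")
print(strip_noise_words("The history of an idea", noise))  # ['history', 'idea']
```

Only "history" and "idea" remain to be matched against the index, which is why a query made up entirely of noise words has nothing left to search for.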
By default, noise word files are created and stored in the following location on the query server: Drive:\Program Files\Microsoft Office Servers\12.0\Data\Config. Noise word files from that default location are copied to the following folder for each instance of the Microsoft Search service that exists on the query server: Drive:\Program Files\Microsoft Office Servers\12.0\Data\Applications\<Application UID>\Config, where <Application UID> is the GUID associated with each instance of the Search service.
If you modify the noise word files in the default location, the modified files are automatically copied every time a new Shared Services Provider (SSP) is created. If you modify the noise word files in the default location after an SSP has been created, you must copy the files yourself from the default location to the Config folder of each SSP that already exists.
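When several SSPs already exist, the manual copy can be scripted. The following Python sketch is one possible approach, not a Microsoft-provided tool. The function name and the noise*.txt file name pattern are assumptions (the pattern is inferred from the noiseneu.txt file name), and the folder paths must be supplied by the caller.

```python
import shutil
from pathlib import Path

def copy_noise_files(default_config, applications_root, pattern="noise*.txt"):
    """Copy noise word files into each <Application UID>\\Config folder.

    default_config:    the default Config folder holding the edited files.
    applications_root: the Applications folder with one subfolder per instance.
    pattern:           assumed file name pattern for noise word files.
    Returns the list of files written.
    """
    default_config = Path(default_config)
    copied = []
    for app_dir in sorted(Path(applications_root).iterdir()):
        target = app_dir / "Config"
        if not target.is_dir():
            continue  # skip anything that is not an application folder
        for source in sorted(default_config.glob(pattern)):
            shutil.copy2(source, target / source.name)
            copied.append(target / source.name)
    return copied

# Example call; replace Drive with the actual drive letter:
# copy_noise_files(
#     r"Drive:\Program Files\Microsoft Office Servers\12.0\Data\Config",
#     r"Drive:\Program Files\Microsoft Office Servers\12.0\Data\Applications",
# )
```

The sketch overwrites the existing files in each target folder, so it is worth backing them up first.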
If you add noise words, the accuracy of your searches may decrease. However, the size of the content index also decreases. A smaller content index helps increase performance. You can delete noise words if you want searches to return those words.
If you remove words from the noise word file, the changes do not take effect until you reset the content index and perform a full crawl of all content that contains the keywords you removed. If you add words to the noise word file, it is not necessary to perform a full crawl of all content, because the newly added words will not be searched for. However, the size of the index will not decrease until a full crawl has been performed.
Do not delete noise word files. If you do not want noise words removed during an update or a query, remove those specific entries from the file. If you delete the noise word file, all single characters are removed as noise words.
Editing a noise word file
Use the following procedure to edit a noise word file.
Edit a noise word file
Start Notepad, and then open the noise word file. For information on locating and identifying the appropriate noise word file, see the "Understanding noise word files" section.
Edit the list to include only the words that you want to be ignored in a search query.
Save the noise word file, and then close Notepad.
When saving a modified noise word file, always use the default Encoding value.
Restart the Office SharePoint Server Search service using the following steps:
Click Start, point to Administrative Tools, and then click Services.
Right-click Office SharePoint Server Search, and then click Restart.
For Search to use the changes to the noise word file, you must start a full crawl of your content source. For information about how to do this, see Start a full crawl (Office SharePoint Server 2007).
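The open, edit, and save steps of the procedure above can also be scripted when the same words must be removed from many files. The following Python sketch is a hedged illustration only. The function name is hypothetical, the file is assumed to hold one entry per line, and the caller must supply the encoding the file already uses, because the sketch writes the file back with that same encoding and the original line endings.

```python
def remove_noise_words(path, words_to_remove, encoding):
    """Delete the given entries from a noise word file in place.

    Assumptions: one entry per line; `encoding` is the file's existing
    encoding, which is reused on save so the encoding does not change.
    Returns the number of lines removed.
    """
    drop = {word.lower() for word in words_to_remove}
    # newline="" keeps the original line endings untouched on read and write.
    with open(path, "r", encoding=encoding, newline="") as handle:
        lines = handle.readlines()
    kept = [line for line in lines if line.strip().lower() not in drop]
    with open(path, "w", encoding=encoding, newline="") as handle:
        handle.writelines(kept)
    return len(lines) - len(kept)
```

Restarting the Office SharePoint Server Search service and starting the full crawl still have to be done as described in the procedure.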
List of noise word files by language
Office SharePoint Server 2007 includes noise word files for the following languages:
English (United Kingdom)
English (United States)