Edit a noise word file (Office SharePoint Server)

Applies To: Office SharePoint Server 2007

This Office product will reach end of support on October 10, 2017. To stay supported, you will need to upgrade. For more information, see Resources to help you upgrade your Office 2007 servers and clients.

 

Topic Last Modified: 2015-03-09

A noise word is a word that is not useful in a search, such as “the” or “an”. Noise word files, sometimes also referred to as “stop word” files, contain lists of words to be excluded or ignored when a user runs a query. These lists might include words that are irrelevant to the search, such as conjunctions, articles, adjectives, and adverbs, as well as common names and offensive or inappropriate words.

In this article:

  • Understanding noise word files

  • Editing a noise word file

  • List of noise word files by language

Understanding noise word files

A list of noise words for a language is stored in the noise word file for that language. If a noise word list does not exist for a language, Microsoft Office SharePoint Server 2007 uses the neutral noise word file, noiseneu.txt. The word breaker for a given language identifies individual words by determining where word boundaries exist based on the lexical rules of the language. When the word breaker encounters words at indexing or query time, it removes any words that are listed in the noise word file. For more information about which languages have noise word files, see the "List of noise word files by language" section.
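The effect of noise word removal can be illustrated with a minimal Python sketch. It is not the product's word breaker: plain whitespace splitting stands in for language-aware word breaking, and the one-word-per-line layout and the function names load_noise_words and remove_noise_words are assumptions made for illustration.

```python
def load_noise_words(lines):
    """Build a set of noise words from an iterable of lines.

    Assumes one word per line; blank lines are skipped.
    """
    return {line.strip().lower() for line in lines if line.strip()}


def remove_noise_words(terms, noise_words):
    """Return the terms that are not noise words, preserving their order."""
    return [term for term in terms if term.lower() not in noise_words]


# Hypothetical noise word list and query, for illustration only.
noise = load_noise_words(["the", "an", "a", ""])
# Whitespace splitting is a stand-in for a real word breaker.
query_terms = "the history of an empire".split()
kept = remove_noise_words(query_terms, noise)
print(kept)  # ['history', 'of', 'empire']
```

Because "the" and "an" are dropped before matching, a query made up only of noise words would have no terms left to search for.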

By default, noise word files are created and stored in the following location on the query server: Drive:\Program Files\Microsoft Office Servers\12.0\Data\Config. Noise word files from that default location are copied to the following folder location for each instance of the Microsoft Search service that exists on the query server: Drive:\Program Files\Microsoft Office Servers\12.0\Data\Applications\<Application UID>\Config, where <Application UID> is the GUID associated with each instance of the Search service.

Note

If you modify the noise word files in the default location, the modified versions are copied automatically every time a new Shared Services Provider (SSP) is created. If you modify the noise word files in the default location after an SSP has been created, you must copy the files from the default location to the Config directory of each SSP that already exists.
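The manual copy described in this note could be scripted. The following Python sketch is one possible approach, not a supported tool: the function name copy_noise_file is hypothetical, and it assumes that every subfolder of the Applications folder that contains a Config folder belongs to an existing SSP.

```python
import shutil
from pathlib import Path


def copy_noise_file(default_config, applications_root, file_name):
    """Copy one noise word file from the default Config folder into the
    Config folder of every application folder under applications_root.

    Folders without a Config subfolder are skipped.
    Returns the list of destination paths that were written.
    """
    source = Path(default_config) / file_name
    written = []
    for app_dir in sorted(Path(applications_root).iterdir()):
        target_dir = app_dir / "Config"
        if target_dir.is_dir():
            shutil.copy2(source, target_dir / file_name)
            written.append(target_dir / file_name)
    return written


# Example call, assuming the product is installed on drive C:
# copy_noise_file(
#     r"C:\Program Files\Microsoft Office Servers\12.0\Data\Config",
#     r"C:\Program Files\Microsoft Office Servers\12.0\Data\Applications",
#     "noiseenu.txt",
# )
```

Back up the existing files in each target folder before overwriting them.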

If you add noise words, the accuracy of your searches may decrease. However, the size of the content index also decreases. A smaller content index helps increase performance. You can delete noise words if you want searches to return those words.

If you remove words from the noise word file, the changes do not take effect until you reset the content index and perform a full crawl of all content that contains the keywords you removed. If you add words to the noise word file, it is not necessary to perform a full crawl of all content, because the newly added words will not be searched for. However, the size of the index will not decrease until a full crawl has been performed.

Do not delete noise word files. If you do not want noise words removed during an update or a query, remove those specific entries from the file. If you delete the noise word file, all single characters are removed as noise words.

Editing a noise word file

Use the following procedure to edit a noise word file.

Edit a noise word file

  1. Start Notepad, and then open the noise word file. For information on locating and identifying the appropriate noise word file, see the "Understanding noise word files" section.

  2. Edit the list to include only the words that you want to be ignored in a search query.

  3. Save the noise word file, and then close Notepad.

    Note

    When saving a modified noise word file, always use the default Encoding value.

  4. Restart the Office SharePoint Server Search service using the following steps:

    1. Click Start, point to Administrative Tools, and then click Services.

    2. Right-click Office SharePoint Server Search, and then click Restart.

  5. For Search to use the changes to the noise word file, you must start a full crawl of your content source. For information about how to do this, see Start a full crawl (Office SharePoint Server 2007).
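Steps 1 through 3 could also be scripted instead of done in Notepad. The Python sketch below is a hedged illustration only: edit_noise_file is a hypothetical helper, the one-word-per-line layout is an assumption, and the cp1252 fallback for files without a byte-order mark is a guess that you should check against your own files. It keeps the encoding the file already uses by detecting the byte-order mark and writing it back unchanged.

```python
import codecs
from pathlib import Path

# Byte-order marks mapped to the codec for the text that follows the mark.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def edit_noise_file(path, add=(), remove=(), fallback_encoding="cp1252"):
    """Add and remove words in a noise word file, keeping its encoding.

    Assumes one word per line. fallback_encoding is used only when the
    file has no byte-order mark; its default here is an assumption.
    Returns the resulting list of words.
    """
    path = Path(path)
    raw = path.read_bytes()
    bom, encoding = b"", fallback_encoding
    for marker, name in _BOMS:
        if raw.startswith(marker):
            bom, encoding = marker, name
            break
    text = raw[len(bom):].decode(encoding)
    # Keep the line-ending style that the file already uses.
    newline = "\r\n" if "\r\n" in text else "\n"
    drop = {word.lower() for word in remove}
    words = [w.strip() for w in text.splitlines() if w.strip()]
    words = [w for w in words if w.lower() not in drop]
    present = {w.lower() for w in words}
    for word in add:
        if word.lower() not in present:
            words.append(word)
            present.add(word.lower())
    path.write_bytes(bom + (newline.join(words) + newline).encode(encoding))
    return words
```

The service restart and full crawl in steps 4 and 5 are still required after the file is changed this way.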

List of noise word files by language

Office SharePoint Server 2007 includes noise word files for the following languages:

Language                  File name
Arabic                    noiseara.txt
Bengali                   noiseben.txt
Bulgarian                 noisebul.txt
Catalan                   noisecat.txt
Chinese (Simplified)      noisechs.txt
Chinese (Traditional)     noisecht.txt
Croatian                  noisecro.txt
Danish                    noisedan.txt
Dutch (Netherlands)       noisenld.txt
English (United Kingdom)  noiseeng.txt
English (United States)   noiseenu.txt
Finnish                   noisefin.txt
French                    noisefra.txt
German                    noisedeu.txt
Greek                     noisegrc.txt
Gujarati                  noiseguj.txt
Hebrew                    noiseheb.txt
Hindi                     noisehin.txt
Icelandic                 noiseice.txt
Indonesian                noiseind.txt
Italian                   noiseita.txt
Japanese                  noisejpn.txt
Kannada                   noisekan.txt
Korean                    noisekor.txt
Latvian                   noiselat.txt
Lithuanian                noiselit.txt
Malay                     noisemal.txt
Malayalam                 noisemly.txt
Marathi                   noisemar.txt
Neutral                   noiseneu.txt
Norwegian (Bokmal)        noisenor.txt
Polish                    noiseplk.txt
Polish                    noisepol.txt
Portuguese                noisepor.txt
Portuguese (Brazil)       noiseptb.txt
Punjabi                   noisepun.txt
Romanian                  noiserom.txt
Russian                   noiserus.txt
Serbian (Cyrillic)        noisesbc.txt
Serbian (Latin)           noisesbl.txt
Slovak                    noisesvk.txt
Slovenian                 noiseslo.txt
Spanish                   noiseesn.txt
Swedish                   noisesve.txt
Tamil                     noisetam.txt
Telugu                    noisetel.txt
Thai                      noisetha.txt
Turkish                   noisetur.txt
Ukrainian                 noiseurk.txt
Urdu (Pakistan)           noiseurd.txt
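The fallback rule described in "Understanding noise word files" (use noiseneu.txt when a language has no noise word file of its own) can be sketched as a simple lookup. This Python example is illustrative only: it includes just four entries from the list, and the names NOISE_FILES and noise_file_for are hypothetical.

```python
# A small subset of the language list, for illustration only.
NOISE_FILES = {
    "English (United States)": "noiseenu.txt",
    "French": "noisefra.txt",
    "German": "noisedeu.txt",
    "Japanese": "noisejpn.txt",
}

# The neutral file is used when a language has no file of its own.
NEUTRAL_FILE = "noiseneu.txt"


def noise_file_for(language):
    """Return the noise word file name for a language, or the neutral file."""
    return NOISE_FILES.get(language, NEUTRAL_FILE)


print(noise_file_for("German"))  # noisedeu.txt
print(noise_file_for("Welsh"))   # noiseneu.txt
```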