Noise Words

To prevent a full-text index from becoming bloated, Microsoft SQL Server discards commonly occurring words that do not help a search. These words are called noise words, or stop words. Noise words are listed in locale-specific noise-word files. For example, in the English locale, words such as "a," "and," "is," and "the" are in the English noise-word file and are left out of the full-text index because they are empirically known to be useless to a search.

However, the full-text index does take the position of noise words into account. For example, consider the phrase "Instructions are applicable to these Adventure Works Cycles models". The following table shows the position of each word in the phrase:

Word or token    Position
Instructions     1
are              2
applicable       3
to               4
these            5
Adventure        6
Works            7
Cycles           8
models           9

The noise words "are," "to," and "these," in positions 2, 4, and 5, are left out of the full-text index. However, their positional information is maintained, so the positions of the other words in the phrase are unaffected.
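The position-preserving behavior described above can be illustrated with a minimal Python sketch. It is not SQL Server's implementation: the function name `index_positions`, the whitespace tokenization, and the three-word `NOISE_WORDS` set are assumptions made only for this example.

```python
# Hypothetical noise-word set for this example only; the real list
# comes from the locale-specific noise-word file.
NOISE_WORDS = {"are", "to", "these"}


def index_positions(phrase, noise_words=NOISE_WORDS):
    """Return (word, position) pairs for the words that would be indexed.

    Every token consumes a position, starting at 1. Noise words are
    dropped from the result, but the positions they occupied are not reused.
    """
    indexed = []
    for position, token in enumerate(phrase.split(), start=1):
        if token.lower() in noise_words:
            continue  # skipped, yet its position number is still consumed
        indexed.append((token, position))
    return indexed


phrase = "Instructions are applicable to these Adventure Works Cycles models"
for word, position in index_positions(phrase):
    print(position, word)
```

In this sketch "applicable" keeps position 3 and "Adventure" keeps position 6, even though the words at positions 2, 4, and 5 are not stored.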

The noise-word files are located in the $SQL_Server_Install_Path\Microsoft SQL Server\MSSQL.1\MSSQL\FTDATA\ directory. This directory is created, and the noise-word files are installed, when you set up SQL Server with Full-Text Search support. Noise-word files can be edited. For example, system administrators at high-tech companies might add the word "computer" to their noise-word list.

Important

If you edit a noise-word file, you must repopulate the full-text catalogs before the changes will take effect.

The following table lists the noise-word files and their respective languages.

Noise-word file    Language
Noisechs           Simplified Chinese
Noisecht           Traditional Chinese
Noisedan           Danish
Noisedeu           German
Noiseeng           English UK
Noiseenu           English US
Noiseesn           Spanish
Noisefra           French
Noiseita           Italian
Noisejpn           Japanese
Noisekor           Korean
Noiseneu           Neutral language
Noisenld           Dutch
Noiseplk           Polish
Noiseptb           Portuguese-Brazilian
Noisepts           Portuguese-Iberian
Noiserus           Russian
Noisesve           Swedish
Noisetha           Thai
Noisetrk           Turkish

See Also

Concepts

Full-Text Search Fundamentals
