Word Breakers and Stemmers

Word breakers and stemmers perform linguistic analysis on all full-text indexed data. Linguistic analysis involves finding word boundaries (word-breaking) and conjugating verbs (stemming). The rules for this analysis differ from language to language, and you can specify a different language for each full-text indexed column. Using the word breaker for a column's language makes the resulting terms more accurate for that language. When a word breaker exists for the language family but not for the specific sub-language, the word breaker for the major language is used. For example, the French word breaker handles French Canadian text. If no word breaker is available for a particular language, the neutral word breaker is used. The neutral word breaker breaks words at neutral characters such as spaces and punctuation marks.
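The fallback order described above (specific language, then major language, then neutral) can be illustrated with a short Python sketch. This is only a model of the documented rule, not SQL Server's implementation: the `AVAILABLE_BREAKERS` set, the tag format, and both function names are hypothetical, and the neutral breaker here simply splits on whitespace and punctuation.

```python
import re
from typing import List

# Hypothetical registry of language tags that have a dedicated word breaker.
# The real list for an instance comes from sys.fulltext_languages.
AVAILABLE_BREAKERS = {"en", "fr", "de"}


def select_word_breaker(language_tag: str) -> str:
    """Pick a breaker: exact match, then the major language, then neutral."""
    tag = language_tag.lower()
    if tag in AVAILABLE_BREAKERS:
        return tag
    major = tag.split("-")[0]  # "fr-ca" -> "fr"
    if major in AVAILABLE_BREAKERS:
        return major
    return "neutral"


def neutral_word_break(text: str) -> List[str]:
    """Break text at spaces and punctuation, dropping empty tokens."""
    return [token for token in re.split(r"[\W_]+", text) if token]


if __name__ == "__main__":
    print(select_word_breaker("fr-CA"))
    print(select_word_breaker("xx-YY"))
    print(neutral_word_break("Hello, world! Full-text search."))
```

In this sketch a French Canadian tag resolves to the French breaker, and an unknown tag falls through to the neutral breaker, which splits "Full-text" into two terms because the hyphen counts as punctuation.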

Microsoft SQL Server 2005 includes word breakers for 23 locales. For a list of languages supported by Full-Text Search, see sys.fulltext_languages (Transact-SQL).

The language of the full-text indexed column being queried determines the linguistic analysis performed on the arguments of the full-text query functions CONTAINS, FREETEXT, CONTAINSTABLE, and FREETEXTTABLE. If no language is specified for a column, the column uses the value of the default full-text language configuration option.

For a localized version of SQL Server, SQL Server Setup sets the default full-text language option to the language of the server if an appropriate match exists. For a non-localized version of SQL Server, the default full-text language option is English.

Note

All columns listed in a single full-text query function clause must use the same language, unless the LANGUAGE option is specified in the query.
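The language rules for a query can be modeled in a few lines of Python. This is a hedged sketch of the rules stated in this topic, not SQL Server code: the function name `query_language` and its parameters are hypothetical, languages are represented as LCIDs (1033 for English (United States), 1036 for French (France)), `None` stands for a column with no language specified, and raising an error for mixed languages is an assumption about how a violation would surface.

```python
from typing import Iterable, Optional


def query_language(column_lcids: Iterable[Optional[int]],
                   query_lcid: Optional[int] = None,
                   server_default: int = 1033) -> int:
    """Return the LCID used for linguistic analysis of a full-text query."""
    # A LANGUAGE option on the query takes precedence over column languages.
    if query_lcid is not None:
        return query_lcid

    # Columns without a language fall back to the default full-text language.
    effective = {server_default if lcid is None else lcid
                 for lcid in column_lcids}

    # Without a LANGUAGE option, every listed column must share one language.
    # An empty column list is also rejected here.
    if len(effective) != 1:
        raise ValueError("columns must use the same language "
                         "unless LANGUAGE is specified")
    return effective.pop()


if __name__ == "__main__":
    print(query_language([1036, 1036]))
    print(query_language([None]))
    print(query_language([1033, 1036], query_lcid=1033))
```

The sketch resolves each column's effective language first, so a column with no language and a column explicitly set to the server default count as the same language.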

See Also

Concepts

Full-Text Search Fundamentals

Other Resources

default full-text language Option