Full-Text Search Queries

To write full-text queries in Microsoft SQL Server 2005, you must learn how to use the CONTAINS and FREETEXT Transact-SQL predicates, and the CONTAINSTABLE and FREETEXTTABLE rowset-valued functions.

The CONTAINS and FREETEXT predicates can only be used inside a WHERE clause of a SELECT statement. CONTAINSTABLE and FREETEXTTABLE rowset-valued functions can be used in the FROM clause of a SELECT statement.

The CONTAINS predicate supports rich syntax to search columns with textual data for:

  • One or more specific words or phrases (simple term).
    In Full-Text Search, a word is considered to be a token. A token is identified by appropriate wordbreakers, following the linguistic rules of the specified language. A valid phrase can consist of multiple words, with or without punctuation between them. For example, "croissant" is a word, and "café au lait" is a phrase. Words and phrases such as these are called simple terms.
  • Inflectional forms of a specific word (generation term).
    For example, search for the inflectional form of the word "drive". If various rows in the table include the words "drive", "drives", "drove", "driving", and "driven", all would be in the result set because each of these can be inflectionally generated from the word drive.
  • Synonymous forms of a specific word (thesaurus)
    For example, if an entry, "{car, automobile, truck, van}", is added to a thesaurus, you can search for the thesaurus form of the word "car". All rows in the table queried that include the words "automobile", "truck", "van", or "car", appear in the result set because each of these words belong to the synonym expansion set containing the word "car". For more information about the structure of thesaurus files, see Configuring Thesaurus Files.
  • A word or a phrase where the words begin with specified text (prefix term).
    In case of a phrase, each word within the phrase is considered to be a prefix. For example, the term "auto tran*" matches "automatic transmission" and "automobile transducer", but it does not match "automatic motor transmission".
  • Words or phrases using weighted values (weighted term).
    For example, in a query searching for multiple terms, you can assign each search word a weight value indicating its importance relative to the other words in the search condition. The results for this type of query return the most relevant rows first, according to the relative weight you have assigned to search words.
  • A word or phrase close to another word or phrase (proximity term).
    For example, you want to find the rows in which the word "ice" is near the word "hockey" or in which the phrase "ice skating" is near the phrase "ice hockey".

A CONTAINS predicate can combine several of these terms by using Boolean operators AND, OR, AND NOT, for example, to find all rows with "latte" and "New York-style bagel" in the same full-text indexed column. Terms can be negated by the use of AND NOT, for example "bagel" and not "cream cheese".

When you use CONTAINS, remember SQL Server discards noise words from the search criteria. Noise words are those words such as "a," "and," "is," or "the," which can occur frequently but do not really help when searching for specific text. For more information about noise words, see Noise Words.

See Also

Concepts

Querying SQL Server Using Full-Text Search

Other Resources

CONTAINS (Transact-SQL)

Help and Information

Getting SQL Server 2005 Assistance