Full-Text Search Fundamentals
This topic briefly describes the components, processes, and terminology associated with Full-Text Search. Full-Text Search shares many terms with Microsoft SQL Server, but there are also a number of terms, such as "crawl" and "token," that are peculiar to Full-Text Search.
Here is a list of terms and components that you need to be familiar with when using Full-Text Search.
- Full-text index
Stores information about significant words and their location within a given column. This information is used to quickly compute full-text queries that search for rows with particular words or combinations of words. For more information, see Full-Text Indexes.
- Full-text catalog
A full-text catalog contains zero or more full-text indexes. Full-text catalogs must reside on a local hard drive associated with the instance of SQL Server. Each catalog can serve the indexing needs of one or more tables within a database. Full-text catalogs cannot be stored on removable drives, floppy disks, or network drives, except when you attach a read-only database that contains a full-text catalog.
- Word breaker
For a given language, a word breaker tokenizes text based on the lexical rules of the language. For more information, see Word Breakers and Stemmers.
- Token
Is a word or a character string identified by the word breaker.
- Stemmer
For a given language, a stemmer generates inflectional forms of a particular word based on the rules of that language. Stemmers are language specific. For more information, see Word Breakers and Stemmers.
- Filter
Given a specified file type, for example .doc, filters extract text from a file stored in a varbinary(max) or image column. For more information, see Full-Text Search Filters.
- Population or Crawl
Is the process of creating and maintaining a full-text index. For more information, see Full-Text Index Structure.
- Noise words
Are frequently occurring words that do not help the search. For example, for the English locale, words such as "a", "and", "is", and "the" are considered noise words. These words are ignored to prevent the full-text index from becoming bloated. For more information, see Noise Words.
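To see how several of the terms above fit together, the following is a minimal conceptual sketch in Python. It is not how SQL Server implements Full-Text Search. The function names (`break_words`, `build_index`, `search`), the four-word noise list, and the sample rows are all hypothetical and chosen only for illustration. The sketch splits text into tokens (a toy word breaker), skips noise words, and records each remaining token with its row key and position (a toy population of a full-text index).

```python
import re
from collections import defaultdict

# Hypothetical noise-word list for illustration; real lists are per-language and longer.
NOISE_WORDS = {"a", "and", "is", "the"}

def break_words(text):
    """Toy word breaker: lowercase the text and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(rows):
    """Toy population: map each significant token to (row_key, position) pairs."""
    index = defaultdict(list)
    for key, text in rows.items():
        for position, token in enumerate(break_words(text)):
            if token not in NOISE_WORDS:  # noise words are not stored
                index[token].append((key, position))
    return index

def search(index, word):
    """Return the sorted keys of the rows that contain the given word."""
    return sorted({key for key, _ in index.get(word.lower(), [])})

# Sample rows keyed by a hypothetical primary key.
rows = {
    1: "The crawl is the process of populating a full-text index",
    2: "A word breaker tokenizes the text",
    3: "The index stores words and their location",
}
index = build_index(rows)
print(search(index, "index"))  # keys of rows containing "index"
print(search(index, "the"))    # noise word, so nothing was indexed for it
```

Because positions are recorded alongside row keys, a structure like this could also answer questions about word proximity. The sketch leaves out stemming and filters entirely.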
Full-text indexing is fully supported in a Microsoft Windows failover cluster environment.