Full-Text Index Structure

12/03/2008

A good understanding of the structure of a full-text index will help you understand how the Microsoft Full-Text Engine for SQL Server (MSFTESQL) service works. The following excerpt of the Document table in Adventure Works shows three rows and two columns from the table: the DocumentID column and the Title column.

For this example, we will assume that a full-text index has been created on the Title column.

DocumentID	Title
1	Crank Arm and Tire Maintenance
2	Front Reflector Bracket and Reflector Assembly 3
3	Front Reflector Bracket Installation

The table fragment below depicts the contents of the full-text index created on the Title column of the Document table.

Note

Full-text indexes contain more information than what is presented in this table. The table below is provided for demonstration purposes only.

Keyword	ColId	DocId	Occ
Crank	1	1	1
Arm	1	1	2
Tire	1	1	4
Maintenance	1	1	5
Front	1	2	1
Front	1	3	1
Reflector	1	2	2
Reflector	1	2	5
Reflector	1	3	2
Bracket	1	2	3
Bracket	1	3	3
Assembly	1	2	6
3	1	2	7
Installation	1	3	4

The Keyword column contains a representation of a single token extracted at indexing time. Word breakers determine what makes up a token.
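The following Python sketch illustrates how a word breaker might turn a title into tokens with word offsets. It is not the MSFTESQL word breaker: the function name `break_words`, the regular expression, and the one-entry noise-word list are assumptions made for illustration. The sketch assumes that "and" is skipped as a noise word but still consumes an offset, which is consistent with the sample table (Tire is at offset 4, not 3).

```python
import re

# Assumption: "and" is treated as a noise word. Real noise-word lists are
# language-specific and much longer.
NOISE_WORDS = {"and"}


def break_words(text):
    """Return (keyword, occurrence) pairs using 1-based word offsets."""
    pairs = []
    for offset, word in enumerate(re.findall(r"\w+", text), start=1):
        # A noise word is not emitted, but it still consumes an offset.
        if word.lower() not in NOISE_WORDS:
            pairs.append((word, offset))
    return pairs


print(break_words("Crank Arm and Tire Maintenance"))
# [('Crank', 1), ('Arm', 2), ('Tire', 4), ('Maintenance', 5)]
```

Applying the same function to the other two titles yields the offsets listed for DocId 2 and DocId 3 in the table.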

The ColId column contains a value that corresponds to a particular table and column that is full-text indexed.

The DocId column contains a four-byte integer value that maps to a particular full-text key value in a full-text indexed table. DocId values that satisfy a search condition are passed from the MSFTESQL service to the Database Engine, where they are mapped to full-text key values from the base table being queried.
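The flow described above can be sketched in Python as a two-step lookup: find the DocId values for a keyword, then map them to full-text key values and fetch the base-table rows. This is a simplified model, not the actual engine interface. The names `INDEX_ROWS`, `DOCID_TO_KEY`, `matching_docids`, and `resolve` are hypothetical, and the sketch assumes that DocId and DocumentID happen to coincide in this sample.

```python
# A few (Keyword, ColId, DocId, Occ) rows copied from the sample index above.
INDEX_ROWS = [
    ("Bracket", 1, 2, 3),
    ("Bracket", 1, 3, 3),
    ("Assembly", 1, 2, 6),
    ("Installation", 1, 3, 4),
]

# Assumption: in this sample each DocId maps to the same DocumentID value.
DOCID_TO_KEY = {1: 1, 2: 2, 3: 3}

# Base table keyed by the full-text key column (DocumentID).
DOCUMENT_TABLE = {
    1: "Crank Arm and Tire Maintenance",
    2: "Front Reflector Bracket and Reflector Assembly 3",
    3: "Front Reflector Bracket Installation",
}


def matching_docids(keyword):
    """Step 1 (index side): DocIds whose rows contain the keyword."""
    return sorted({doc_id for kw, _col, doc_id, _occ in INDEX_ROWS
                   if kw.lower() == keyword.lower()})


def resolve(doc_ids):
    """Step 2 (database side): map DocIds to key values and fetch the rows."""
    keys = [DOCID_TO_KEY[doc_id] for doc_id in doc_ids]
    return [(key, DOCUMENT_TABLE[key]) for key in keys]


print(resolve(matching_docids("Assembly")))
# [(2, 'Front Reflector Bracket and Reflector Assembly 3')]
```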

The Occ column contains an integer value. For each DocId value, there is a list of occurrence values that correspond to the relative word offsets of the particular keyword within that DocId. Occurrence values are useful in determining phrase or proximity matches; for example, the words of a phrase have numerically adjacent occurrence values. They are also useful in computing relevance scores; for example, the number of occurrences of a keyword in a DocId may be used in scoring.
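As a rough illustration of how occurrence values support phrase matching and scoring, the Python sketch below loads the sample index rows and checks for adjacent offsets. The function names `occurrences` and `phrase_docids` are hypothetical, and the logic is a minimal model of the idea, not the algorithm the MSFTESQL service actually uses.

```python
from collections import defaultdict

# (Keyword, ColId, DocId, Occ) rows copied from the sample index above.
ROWS = [
    ("Crank", 1, 1, 1), ("Arm", 1, 1, 2), ("Tire", 1, 1, 4),
    ("Maintenance", 1, 1, 5), ("Front", 1, 2, 1), ("Front", 1, 3, 1),
    ("Reflector", 1, 2, 2), ("Reflector", 1, 2, 5), ("Reflector", 1, 3, 2),
    ("Bracket", 1, 2, 3), ("Bracket", 1, 3, 3), ("Assembly", 1, 2, 6),
    ("3", 1, 2, 7), ("Installation", 1, 3, 4),
]


def occurrences(keyword):
    """Map each DocId to the sorted list of Occ values for one keyword."""
    result = defaultdict(list)
    for kw, _col, doc_id, occ in ROWS:
        if kw.lower() == keyword.lower():
            result[doc_id].append(occ)
    return {doc_id: sorted(occs) for doc_id, occs in result.items()}


def phrase_docids(first, second):
    """DocIds in which `second` occurs at the offset right after `first`."""
    second_occ = occurrences(second)
    hits = []
    for doc_id, offsets in occurrences(first).items():
        following = set(second_occ.get(doc_id, []))
        if any(offset + 1 in following for offset in offsets):
            hits.append(doc_id)
    return sorted(hits)


print(phrase_docids("Reflector", "Bracket"))   # [2, 3]
print(phrase_docids("Reflector", "Assembly"))  # [2]
print(phrase_docids("Bracket", "Reflector"))   # []

# Occurrence counts per DocId, a possible input to relevance scoring.
print({doc_id: len(occs) for doc_id, occs in occurrences("Reflector").items()})
# {2: 2, 3: 1}
```

Note that "Bracket Reflector" does not match DocId 2 even though both words appear there: Bracket is at offset 3 and the next Reflector is at offset 5, so the offsets are not adjacent.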
