
Full-Text Index Structure

A good understanding of the structure of a full-text index will help you understand how the Microsoft Full-Text Engine for SQL Server (MSFTESQL) service works. The following excerpt of the Document table in Adventure Works shows three rows and two columns from the table: the DocumentID column and the Title column.

For this example, we will assume that a full-text index has been created on the Title column.

DocumentID  Title
1           Crank Arm and Tire Maintenance
2           Front Reflector Bracket and Reflector Assembly 3
3           Front Reflector Bracket Installation

The table fragment below depicts the contents of the full-text index created on the Title column of the Document table.

Note:
Full-text indexes contain more information than what is presented in this table. The table below is provided for demonstration purposes only.

Keyword       ColId  DocId  Occ
Crank         1      1      1
Arm           1      1      2
Tire          1      1      4
Maintenance   1      1      5
Front         1      2      1
Front         1      3      1
Reflector     1      2      2
Reflector     1      2      5
Reflector     1      3      2
Bracket       1      2      3
Bracket       1      3      3
Assembly      1      2      6
3             1      2      7
Installation  1      3      4

The Keyword column contains a representation of a single token extracted at indexing time. Word breakers determine what makes up a token.
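As a rough illustration of how tokens and their offsets could give rise to rows like those in the example index, here is a minimal Python sketch. It is not the MSFTESQL implementation: the `break_words` function is a toy word breaker, and the `NOISE_WORDS` set and the rule that a skipped noise word still uses up a word offset are assumptions, chosen because they are consistent with the gaps in the Occ values of the example (there is no Occ value 3 for DocId 1).

```python
import re

# Hypothetical noise-word list, for illustration only.
NOISE_WORDS = {"and"}

titles = {
    1: "Crank Arm and Tire Maintenance",
    2: "Front Reflector Bracket and Reflector Assembly 3",
    3: "Front Reflector Bracket Installation",
}

def break_words(text):
    """Toy word breaker: split on runs of non-alphanumeric characters."""
    return [t for t in re.split(r"[^0-9A-Za-z]+", text) if t]

def index_rows(docs, col_id=1):
    """Yield (keyword, col_id, doc_id, occ) rows; occ is the 1-based word offset."""
    for doc_id, text in docs.items():
        for occ, token in enumerate(break_words(text), start=1):
            # Assumption: a noise word is not stored but still consumes an offset.
            if token.lower() in NOISE_WORDS:
                continue
            yield (token, col_id, doc_id, occ)

rows = list(index_rows(titles))
for row in rows:
    print(row)
```

Under these assumptions the sketch produces the same 14 (Keyword, ColId, DocId, Occ) combinations as the example table, although in document order rather than grouped by keyword.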

The ColId column contains a value that corresponds to a particular table and column that is full-text indexed.

The DocId column contains values for a four-byte integer that maps to a particular full-text key value in a full-text indexed table. DocId values that satisfy a search condition are passed from the MSFTESQL service to the Database Engine, where they are mapped to full-text key values from the base table being queried.
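The hand-off described above can be pictured with a small Python sketch. Everything in it is a stand-in: `matching_doc_ids` plays the role of the full-text service, `resolve` plays the role of the Database Engine, and the `DOCID_TO_KEY` mapping simply assumes that the full-text key is the DocumentID column and that DocId values happen to equal the key values.

```python
# (keyword, col_id, doc_id, occ) rows copied from the example index.
INDEX_ROWS = [
    ("Crank", 1, 1, 1), ("Arm", 1, 1, 2), ("Tire", 1, 1, 4),
    ("Maintenance", 1, 1, 5), ("Front", 1, 2, 1), ("Front", 1, 3, 1),
    ("Reflector", 1, 2, 2), ("Reflector", 1, 2, 5), ("Reflector", 1, 3, 2),
    ("Bracket", 1, 2, 3), ("Bracket", 1, 3, 3), ("Assembly", 1, 2, 6),
    ("3", 1, 2, 7), ("Installation", 1, 3, 4),
]

# Assumption: DocId -> full-text key value (here, DocumentID).
DOCID_TO_KEY = {1: 1, 2: 2, 3: 3}

# Base table rows, keyed by the full-text key.
BASE_TABLE = {
    1: "Crank Arm and Tire Maintenance",
    2: "Front Reflector Bracket and Reflector Assembly 3",
    3: "Front Reflector Bracket Installation",
}

def matching_doc_ids(keyword):
    """Stand-in for the full-text service: DocIds whose rows contain the keyword."""
    wanted = keyword.lower()
    return sorted({doc_id for kw, _col, doc_id, _occ in INDEX_ROWS
                   if kw.lower() == wanted})

def resolve(doc_ids):
    """Stand-in for the Database Engine: map DocIds to keys, then fetch the rows."""
    return [(DOCID_TO_KEY[d], BASE_TABLE[DOCID_TO_KEY[d]]) for d in doc_ids]

hits = resolve(matching_doc_ids("bracket"))
print(hits)
```

Here `hits` contains the rows for DocumentID 2 and 3, the two titles that contain "Bracket".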

The Occ column contains an integer value. For each DocId value, there is a list of occurrence values that correspond to the relative word offsets of the particular keyword within that DocId. Occurrence values are useful in determining phrase or proximity matches; for example, the words of a phrase have numerically adjacent occurrence values. They are also useful in computing relevance scores; for example, the number of occurrences of a keyword in a DocId may be used in scoring.
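The following Python sketch shows one way occurrence values could be used to test for a phrase and to count occurrences. It is only an illustration over the example rows: the `postings` layout and the `phrase_doc_ids` and `occurrence_count` helpers are hypothetical names, and the actual matching and ranking logic of the service is not described here.

```python
from collections import defaultdict

# (keyword, col_id, doc_id, occ) rows copied from the example index.
INDEX_ROWS = [
    ("Crank", 1, 1, 1), ("Arm", 1, 1, 2), ("Tire", 1, 1, 4),
    ("Maintenance", 1, 1, 5), ("Front", 1, 2, 1), ("Front", 1, 3, 1),
    ("Reflector", 1, 2, 2), ("Reflector", 1, 2, 5), ("Reflector", 1, 3, 2),
    ("Bracket", 1, 2, 3), ("Bracket", 1, 3, 3), ("Assembly", 1, 2, 6),
    ("3", 1, 2, 7), ("Installation", 1, 3, 4),
]

# keyword -> doc_id -> list of occurrence values
postings = defaultdict(lambda: defaultdict(list))
for keyword, _col_id, doc_id, occ in INDEX_ROWS:
    postings[keyword.lower()][doc_id].append(occ)

def phrase_doc_ids(words):
    """DocIds in which the (non-empty) word list occurs at adjacent offsets."""
    words = [w.lower() for w in words]
    hits = []
    for doc_id, starts in postings[words[0]].items():
        for start in starts:
            # Word i of the phrase must sit at offset start + i in the same DocId.
            if all(start + i in postings[w].get(doc_id, [])
                   for i, w in enumerate(words)):
                hits.append(doc_id)
                break
    return sorted(hits)

def occurrence_count(word, doc_id):
    """Number of times a keyword occurs in a DocId; a possible scoring input."""
    return len(postings[word.lower()].get(doc_id, []))

print(phrase_doc_ids(["Reflector", "Bracket"]))   # [2, 3]
print(phrase_doc_ids(["Bracket", "Reflector"]))   # []
print(occurrence_count("Reflector", 2))           # 2
```

"Reflector Bracket" matches DocId 2 and 3 because the two keywords have occurrence values 2 and 3 in both. "Bracket Reflector" matches nothing: in DocId 2 the keywords sit at offsets 3 and 5, which are not adjacent.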
