Behavior Changes to Full-Text Search

This topic describes behavior changes in full-text search. Behavior changes affect how features work or interact in SQL Server 2012 as compared to earlier versions of SQL Server.

Behavior Changes in Full-Text Search in SQL Server 2012

SQL Server 2012 installs a new version of the word breakers and stemmers for US English (LCID 1033) and UK English (LCID 2057). However you can switch to the previous version of these components if you want to retain the previous behavior. For more information, see Change the Word Breaker Used for US English and UK English.

New Word Breakers and Stemmers Installed

SQL Server 2012 updates all the word breakers and stemmers used by Full-Text Search and Semantic Search. For consistency between the contents of indexes and the results of queries, we recommend that you repopulate existing full-text indexes.

  1. There are new word breakers for English. If you have to retain the previous behavior, see Change the Word Breaker Used for US English and UK English.

  2. The third-party word breakers for Danish, Polish, and Turkish that were included with previous releases of SQL Server have been replaced with Microsoft components. The new components are enabled by default.

  3. There are new word breakers for Czech and Greek. Previous releases of SQL Server Full-Text Search did not include support for these two languages.

Behavior Changes of New Word Breakers and Stemmers

The new components might return different results than the older components when you populate and query full-text indexes. The following tables demonstrate some of the differences that can be expected in English results.

If you have to retain the previous behavior of the word breakers and stemmers, see the following topics:

In some cases, the new components return more results:

Term

Results with previous word breaker and stemmer

Results with new word breaker and stemmer

cat-dog

cat

dog

cat

cat-dog

dog

cat@dog.com

cat

com

dog

cat

cat@dog.com

com

dog

12/11/2011

(where the term is a date)

12/11/2011

dd20111211

11

12

12/11/2011

2011

dd20111211

In some cases, the new components return similar results:

Term

Results with previous word breaker and stemmer

Results with new word breaker and stemmer

100$

100$

nn100$

100$

nn100usd

022

022

nn022

022

nn22

10:49AM

(where the term is a time)

10:49am

tt1049

10:49am

tt24104900

In some cases the new components return fewer results or results that may be unexpected by applications:

Term

Results with previous word breaker and stemmer

Results with new word breaker and stemmer

jěˊÿqℭžl

(where the terms are not valid English characters)

‘jěˊÿqℭžl’

je yq zl

table's

table’s

table

table’s

cat-

cat

cat-

cat

v-z(where v and z are noise words)

(no results)

v-z

$100 000 USD

$100

000

nn000

nn100$

usd

$100 000 usd

nn100000usd

beautiful U.S land

beautiful

land

u.s

us

beautiful

land

Mt. Kent and Mt Challenger

challenger

kent

mt

mt.

mt

kent

challenger

Behavior Changes in Full-Text Search in SQL Server 2008

In SQL Server 2008 and later versions, the Full-Text Engine is integrated as a database service into the relational database as part of the server query and storage engine infrastructure. The new full-text search architecture achieves the following goals:

  • Integrated storage and management—Full-text search is now integrated directly with the inherent storage and management features of SQL Server, and the MSFTESQL service no longer exists. 

    • Full-text indexes are stored inside the database filegroups, rather than in the file system. Administrative operations on a database, such as creating a backup, automatically affect its full-text indexes.

    • A full-text catalog is now a virtual object that does not belong to any filegroup; it is a logical concept that refers to a group of full-text indexes. Therefore, many catalog-management features have been deprecated, and deprecation has created breaking changes for some features. For more information, see Deprecated Database Engine Features in SQL Server 2012 and Breaking Changes to Full-Text Search.

      [!UWAGA]

      SQL Server 2005 Transact-SQL DDL statements that specify full-text catalogs work correctly.

  • Integrated query processing—The new full-text search query processor is part of the Database Engine and is fully integrated with the SQL Server Query processor. This means that, the query optimizer recognizes full-text query predicates and automatically executes them as efficiently as possible.

  • Enhanced administration and troubleshooting—Integrated full-text search provides tools to help you analyze search structures such as the full-text index, the output of a given word breaker, stopword configuration, and so forth.

  • Stopwords and stoplists have replaced noise words and noise-word files. A stoplist is a database object that facilitates manageability tasks for stopwords and improves the integrity between different server instances and environments. For more information, see Configure and Manage Stopwords and Stoplists for Full-Text Search.

  • SQL Server 2008 and later versions include new word breakers for many of the languages that exist in SQL Server 2005. Only the word breakers for English, Korean, Thai, and Chinese (all forms) remain the same. For other languages, if a full-text catalog was imported when a SQL Server 2005 database was upgraded to SQL Server 2008 or a later version, one or more languages used by the full-text indexes in full-text catalog might now be associated with new word breakers that might behave slightly differently from the imported word breakers. For more information about how to ensure consistency between queries and the full-text index content, see Upgrade Full-Text Search from SQL Server 2005.

  • A new FDHOST Launcher (MSSQLFDLauncher) service has been added. For more information, see Get Started with Full-Text Search.

  • Full-text indexing works with a FILESTREAM column in the same way that it does with a varbinary(max) column. The FILESTREAM table must have a column that contains the file name extension for each FILESTREAM BLOB. For more information, see Query with Full-Text Search,Configure and Manage Filters for Search, and sys.fulltext_document_types (Transact-SQL).

    The full-text engine indexes the contents of the FILESTREAM BLOBs. Indexing files such as images might not be useful. When a FILESTREAM BLOB is updated it is reindexed.

Zobacz także

Koncepcje

Full-Text Search (SQL Server)

Upgrade Full-Text Search from SQL Server 2005

Get Started with Full-Text Search

Inne zasoby

Full-Text Search Backward Compatibility