
Behavior Changes to Full-Text Search

Topic Status: Some information in this topic is preview and subject to change in future releases. Preview information describes new features or changes to existing features in Microsoft SQL Server 2016 Community Technology Preview 2 (CTP2).

This topic describes behavior changes in full-text search. Behavior changes affect how features work or interact in SQL Server 2016 as compared to earlier versions of SQL Server.

SQL Server 2012 installs a new version of the word breakers and stemmers for US English (LCID 1033) and UK English (LCID 2057). However, you can switch back to the previous version of these components if you want to retain the previous behavior. For more information, see Change the Word Breaker Used for US English and UK English.

New Word Breakers and Stemmers Installed

SQL Server 2012 updates all the word breakers and stemmers used by Full-Text Search and Semantic Search. For consistency between the contents of indexes and the results of queries, we recommend that you repopulate existing full-text indexes.

  1. There are new word breakers for English. If you have to retain the previous behavior, see Change the Word Breaker Used for US English and UK English.

  2. The third-party word breakers for Danish, Polish, and Turkish that were included with previous releases of SQL Server have been replaced with Microsoft components. The new components are enabled by default.

  3. There are new word breakers for Czech and Greek. Previous releases of SQL Server Full-Text Search did not include support for these two languages.
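The recommendation above to repopulate existing full-text indexes can be scripted. The following is a minimal sketch, not a definitive procedure: it uses Python only to generate the T-SQL `ALTER FULLTEXT INDEX ... START FULL POPULATION` statement for each table. The schema and table names are hypothetical placeholders, and the helper names `quote_identifier` and `repopulation_statement` are introduced here for illustration only.

```python
def quote_identifier(name: str) -> str:
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def repopulation_statement(schema: str, table: str) -> str:
    """Build the T-SQL that starts a full population of a table's full-text index."""
    target = f"{quote_identifier(schema)}.{quote_identifier(table)}"
    return f"ALTER FULLTEXT INDEX ON {target} START FULL POPULATION;"


# Hypothetical tables that carry full-text indexes; replace with your own.
indexed_tables = [("dbo", "Documents"), ("sales", "ProductReviews")]

for schema, table in indexed_tables:
    print(repopulation_statement(schema, table))
```

Review the generated statements before you run them. A full population rebuilds the index from scratch and can take considerable time on large tables.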

Behavior Changes of New Word Breakers and Stemmers

The new components might return different results than the older components when you populate and query full-text indexes. The following tables demonstrate some of the differences that can be expected in English results.

If you have to retain the previous behavior of the word breakers and stemmers, see Change the Word Breaker Used for US English and UK English.

In some cases, the new components return more results:

| Term | Results with previous word breaker and stemmer | Results with new word breaker and stemmer |
| --- | --- | --- |
| cat-dog | cat, dog | cat, cat-dog, dog |
| cat@dog.com | cat, com, dog | cat, cat@dog.com, com, dog |
| 12/11/2011 (where the term is a date) | 12/11/2011, dd20111211 | 11, 12, 12/11/2011, 2011, dd20111211 |
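To see how the word breaker installed on your own server tokenizes a term such as those in the table above, you can query the `sys.dm_fts_parser` dynamic management function, which takes a query string, an LCID, a stoplist ID, and an accent sensitivity flag. The following sketch uses Python only to build that query. The helper name `parser_query` is hypothetical, and the choices of the system stoplist (`0`) and accent-insensitive parsing (`0`) are assumptions that you might need to change.

```python
def parser_query(term: str, lcid: int = 1033) -> str:
    """Build a T-SQL query that shows how a term is tokenized.

    The term is wrapped in double quotation marks so that it is parsed as
    one phrase; single quotation marks are doubled for the T-SQL literal.
    """
    if '"' in term:
        raise ValueError("term must not contain double quotation marks")
    literal = term.replace("'", "''")
    return (
        "SELECT display_term, special_term "
        f"FROM sys.dm_fts_parser(N'\"{literal}\"', {lcid}, 0, 0);"
    )


# 1033 is US English and 2057 is UK English, as noted earlier in this topic.
for term in ("cat-dog", "cat@dog.com", "12/11/2011"):
    print(parser_query(term, lcid=1033))
```

Running the generated queries before and after an upgrade is one way to compare the output of the two versions of the components for the terms that matter to your application.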

In some cases, the new components return similar results:

| Term | Results with previous word breaker and stemmer | Results with new word breaker and stemmer |
| --- | --- | --- |
| 100$ | 100$, nn100$ | 100$, nn100usd |
| 022 | 022, nn022 | 022, nn22 |
| 10:49AM (where the term is a time) | 10:49am, tt1049 | 10:49am, tt24104900 |

In some cases, the new components return fewer results, or results that applications might not expect:

| Term | Results with previous word breaker and stemmer | Results with new word breaker and stemmer |
| --- | --- | --- |
| jěˊÿqℭžl (where the terms are not valid English characters) | ‘jěˊÿqℭžl’ | je yq zl |
| table's | table’s, table | table’s |
| cat- | cat, cat- | cat |
| v-z (where v and z are noise words) | (no results) | v-z |
| $100 000 USD | $100, 000, nn000, nn100$, usd | $100 000 usd, nn100000usd |
| beautiful U.S land | beautiful, land, u.s, us | beautiful, land |
| Mt. Kent and Mt Challenger | challenger, kent, mt, mt. | mt, kent, challenger |
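To judge whether an application is affected, it can help to list the tokens that the previous components produced and the new components no longer produce, because queries that relied on those tokens stop matching. The following sketch does this for a few rows copied from the table above. The `rows` structure and the `dropped_tokens` helper are hypothetical names used only for this illustration.

```python
# Rows copied from the table above: term -> (previous tokens, new tokens).
rows = {
    "table's": ({"table’s", "table"}, {"table’s"}),
    "cat-": ({"cat", "cat-"}, {"cat"}),
    "beautiful U.S land": (
        {"beautiful", "land", "u.s", "us"},
        {"beautiful", "land"},
    ),
    "Mt. Kent and Mt Challenger": (
        {"challenger", "kent", "mt", "mt."},
        {"mt", "kent", "challenger"},
    ),
}


def dropped_tokens(previous: set, new: set) -> list:
    """Return the tokens that only the previous components produced."""
    return sorted(previous - new)


for term, (previous, new) in rows.items():
    print(f"{term!r}: no longer produced -> {dropped_tokens(previous, new)}")
```

For example, a query for the token us matched the text beautiful U.S land under the previous components, but the new components do not produce that token for the same text.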
