Semantic Search (SQL Server)

Statistical Semantic Search provides deep insight into unstructured documents stored in SQL Server databases by extracting and indexing statistically relevant key phrases. It then uses these key phrases to identify and index documents that are similar or related.

You query these semantic indexes by using three Transact-SQL rowset functions (SEMANTICKEYPHRASETABLE, SEMANTICSIMILARITYTABLE, and SEMANTICSIMILARITYDETAILSTABLE), which return the results as structured data.

Semantic search builds upon the existing full-text search feature in SQL Server, but enables new scenarios that extend beyond keyword searches. While full-text search lets you query the words in a document, semantic search lets you query the meaning of the document. Solutions that are now possible include automatic tag extraction, related content discovery, and hierarchical navigation across similar content. For example, you can query the index of key phrases to build the taxonomy for an organization, or for a corpus of documents. Or, you can query the document similarity index to identify resumes that match a job description.

The following examples demonstrate the capabilities of Semantic Search.

Find the Key Phrases in a Document

The following query gets the key phrases that were identified in the sample document. It presents the results in descending order by the score that ranks the statistical significance of each key phrase. This query calls the SEMANTICKEYPHRASETABLE function.

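-- Assumed variable types; adjust them to match the Documents table.
DECLARE @Title NVARCHAR(255), @DocID INT

SET @Title = 'Sample Document.docx'

-- Look up the key of the sample document.
SELECT @DocID = DocumentID
    FROM Documents
    WHERE DocumentTitle = @Title

-- Return its key phrases, most significant first.
SELECT @Title AS Title, keyphrase, score
    FROM SEMANTICKEYPHRASETABLE(Documents, *, @DocID)
    ORDER BY score DESC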
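
The same function can also be called without a document key, in which case it returns key phrases for every indexed document. The following is a minimal sketch of the automatic tag extraction scenario under that assumption. It reuses the Documents table from the example above; the aliases kp, DocumentCount, and AverageScore, and the choice of the top 20 phrases, are illustrative only.

```sql
-- Sketch: the key phrases that occur in the most documents.
-- Assumes semantic indexing is already enabled on the Documents table.
SELECT TOP (20)
        kp.keyphrase,
        COUNT(*)      AS DocumentCount,   -- number of documents containing the phrase
        AVG(kp.score) AS AverageScore     -- mean significance across those documents
    FROM SEMANTICKEYPHRASETABLE(Documents, *) AS kp
    GROUP BY kp.keyphrase
    ORDER BY DocumentCount DESC, AverageScore DESC;
```

Grouping by keyphrase turns the per-document rows into corpus-wide candidate tags. Whether to rank by frequency or by score depends on the taxonomy you want to build.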

Find Similar or Related Documents

The following query gets the documents that were identified as similar or related to the sample document. It presents the results in descending order by the score that ranks the similarity of the two documents. This query calls the SEMANTICSIMILARITYTABLE function.

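-- Assumed variable types; adjust them to match the Documents table.
DECLARE @Title NVARCHAR(255), @DocID INT

SET @Title = 'Sample Document.docx'

-- Look up the key of the sample document.
SELECT @DocID = DocumentID
    FROM Documents
    WHERE DocumentTitle = @Title

-- Join the matches back to Documents to get their titles, most similar first.
SELECT @Title AS SourceTitle, DocumentTitle AS MatchedTitle,
        DocumentID, score
    FROM SEMANTICSIMILARITYTABLE(Documents, *, @DocID)
    INNER JOIN Documents ON DocumentID = matched_document_key
    ORDER BY score DESC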

Find the Key Phrases That Make Documents Similar or Related

The following query gets the key phrases that make the two sample documents similar or related to one another. It presents the results in descending order by the score that ranks the weight of each key phrase. This query calls the SEMANTICSIMILARITYDETAILSTABLE function.

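-- Assumed variable types; adjust them to match the Documents table.
DECLARE @SourceTitle NVARCHAR(255), @MatchedTitle NVARCHAR(255)
DECLARE @SourceDocID INT, @MatchedDocID INT

SET @SourceTitle = 'first.docx'
SET @MatchedTitle = 'second.docx'

-- Look up the keys of the two documents to compare.
SELECT @SourceDocID = DocumentID FROM Documents WHERE DocumentTitle = @SourceTitle
SELECT @MatchedDocID = DocumentID FROM Documents WHERE DocumentTitle = @MatchedTitle

-- Return the key phrases the two documents share, heaviest first.
SELECT @SourceTitle AS SourceTitle, @MatchedTitle AS MatchedTitle, keyphrase, score
    FROM SEMANTICSIMILARITYDETAILSTABLE(Documents, DocumentContent,
        @SourceDocID, DocumentContent, @MatchedDocID)
    ORDER BY score DESC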

Before you can index documents with Semantic Search, you have to store the documents in a SQL Server database.

The FileTable feature in SQL Server 2014 makes unstructured files and documents first-class citizens of the relational database. As a result, database developers can manipulate documents together with structured data in Transact-SQL set-based operations.

For more information about the FileTable feature, see FileTables (SQL Server). For information about the FILESTREAM feature, which is another option for storing documents in the database, see FILESTREAM (SQL Server).
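
As a rough illustration of the FileTable option, the following sketch creates a FileTable and lists the files stored in it. The table name DocumentStore and its directory name are hypothetical. The sketch assumes that FILESTREAM is already enabled on the instance and that the database has a FILESTREAM filegroup and non-transacted access configured.

```sql
-- Hypothetical FileTable; the name and directory are illustrative only.
-- A FileTable has a fixed schema, so no column list is specified.
CREATE TABLE DocumentStore AS FILETABLE
    WITH
    (
        FILETABLE_DIRECTORY = 'DocumentStore',
        FILETABLE_COLLATE_FILENAME = database_default
    );

-- Files copied into the FileTable directory then appear as rows.
SELECT name, file_type, cached_file_size
    FROM DocumentStore
    WHERE is_directory = 0;
```

Because each file is a row, the documents can be joined and filtered alongside structured data in the same Transact-SQL query.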

Install and Configure Semantic Search

Describes the prerequisites for statistical semantic search and how to install or check them.

Enable Semantic Search on Tables and Columns

Describes how to enable or disable statistical semantic indexing on selected columns that contain documents or text.

Find Key Phrases in Documents with Semantic Search

Describes how to find the key phrases in documents or text columns that are configured for statistical semantic indexing.

Find Similar and Related Documents with Semantic Search

Describes how to find similar or related documents or text values, and information about how they are similar or related, in columns that are configured for statistical semantic indexing.

Manage and Monitor Semantic Search

Describes the process of semantic indexing and the tasks related to monitoring and managing the indexes.

Semantic Search DDL, Functions, Stored Procedures, and Views

Lists the Transact-SQL statements and the SQL Server database objects added or changed to support statistical semantic search.
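
The setup tasks listed above might look roughly like the following sketch. It reuses the Documents table and DocumentContent column from the earlier examples. The catalog name ft, the key index name PK_Documents, and the type column FileExtension are assumptions that do not come from this topic, and language 1033 (English) is only an example.

```sql
-- Check prerequisites: 1 means full-text search is installed.
SELECT SERVERPROPERTY('IsFullTextInstalled') AS IsFullTextInstalled;

-- An empty result here means that no semantic language statistics
-- database has been registered yet.
SELECT * FROM sys.fulltext_semantic_language_statistics_database;

-- Hypothetical catalog name.
CREATE FULLTEXT CATALOG ft AS DEFAULT;

-- Enable full-text and semantic indexing on the document column.
-- PK_Documents must be a unique, single-column, non-nullable index.
CREATE FULLTEXT INDEX ON Documents
    (
        DocumentContent
            TYPE COLUMN FileExtension   -- assumed column holding '.docx', '.pdf', ...
            LANGUAGE 1033
            STATISTICAL_SEMANTICS
    )
    KEY INDEX PK_Documents
    ON ft
    WITH CHANGE_TRACKING AUTO;

-- Confirm which indexed columns have semantic indexing enabled.
SELECT object_id, column_id, statistical_semantics
    FROM sys.fulltext_index_columns
    WHERE object_id = OBJECT_ID('Documents');
```

The semantic functions return rows only after the index has been populated, so on a large table the results may take some time to appear.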

© 2014 Microsoft