Find Similar and Related Documents with Semantic Search

Applies to: SQL Server

Describes how to find similar or related documents or text values, and information about how they are similar or related, in columns that are configured for statistical semantic indexing.

Find similar or related documents with SEMANTICSIMILARITYTABLE

To identify similar or related documents in a specific column, query the function semanticsimilaritytable (Transact-SQL).

SEMANTICSIMILARITYTABLE returns a table of zero, one, or more rows whose content in the specified column is semantically similar to the specified document. This rowset function can be referenced in the FROM clause of a SELECT statement like a regular table name.

You cannot query across columns for similar documents. The SEMANTICSIMILARITYTABLE function only retrieves results from the same column as the source column, which is identified by the source_key argument.

For detailed information about the parameters required by the SEMANTICSIMILARITYTABLE function, and about the table of results that it returns, see semanticsimilaritytable (Transact-SQL).

Important

The columns that you target must have full-text and semantic indexing enabled.

Example: Find the top documents that are similar to another document

The following example retrieves the top 10 candidates who are similar to the candidate specified by @CandidateID from the HumanResources.JobCandidate table in the AdventureWorks2022 sample database.

SELECT TOP(10) KEY_TBL.matched_document_key AS Candidate_ID  
FROM SEMANTICSIMILARITYTABLE  
    (  
    HumanResources.JobCandidate,  
    Resume,  
    @CandidateID  
    ) AS KEY_TBL  
ORDER BY KEY_TBL.score DESC;  
GO  

Find info about how documents are similar or related with SEMANTICSIMILARITYDETAILSTABLE

To get information about the key phrases that make documents similar or related, you can query the function semanticsimilaritydetailstable (Transact-SQL).

SEMANTICSIMILARITYDETAILSTABLE returns a table of zero, one, or more rows of key phrases common across two documents (a source document and a matched document) whose content is semantically similar. This rowset function can be referenced in the FROM clause of a SELECT statement like a regular table name.

For detailed information about the parameters required by the SEMANTICSIMILARITYDETAILSTABLE function, and about the table of results that it returns, see semanticsimilaritydetailstable (Transact-SQL).

Important

The columns that you target must have full-text and semantic indexing enabled.

Example: Find the top key phrases that are similar between documents

The following example retrieves the 5 key phrases that have the highest similarity score between the specified candidates in HumanResources.JobCandidate table of the AdventureWorks2022 sample database.

SELECT TOP(5) KEY_TBL.keyphrase, KEY_TBL.score  
FROM SEMANTICSIMILARITYDETAILSTABLE  
    (  
    HumanResources.JobCandidate,  
    Resume, @CandidateID,  
    Resume, @MatchedID  
    ) AS KEY_TBL  
ORDER BY KEY_TBL.score DESC;  
GO