Create and deploy custom entity extractors in SharePoint Server 2013

Applies to: SharePoint Server 2013

Topic Last Modified: 2013-12-18

Summary: Learn to create custom entity extractors and how to use them to set up custom refiners. Create one or more custom entity extraction dictionaries and connect them to managed properties.

You create and maintain the custom entity extractor file in a system external to SharePoint 2013 before you import it into SharePoint 2013 to make the custom entity extractor available to the search system.

To use custom entities as refiners, you first create a custom entity extraction dictionary and deploy it. Then, you configure a managed property to use a custom entity extractor and run a full crawl. After that, you can configure the Refinement Web Part on the search results page to use the custom entity as a refiner.

Note:
SharePoint 2013 runs as websites in Internet Information Services (IIS), so administrators and users depend on the accessibility features that browsers provide.

Before you begin this operation, review the following information about prerequisites:

  • Create a Search service application

  • Add one or more content sources and run a full crawl

  • Configure a search results page

To create a custom entity extraction dictionary
  1. Determine which type of custom entity extraction dictionary you want to create: Word, Word Part, Word Exact, or Word Part Exact. See Overview of custom entity extractor types.

  2. Create a .csv file with the columns Key and Display form. Use a comma as the column separator. If the file contains non-ASCII characters such as diacritics, you must encode it in UTF-8. Save the file to a location that is accessible from the server on which you will run the Windows PowerShell cmdlet to deploy the custom entity extraction dictionary.

    1. In the Key column, enter the terms (single or multiple words) that you want to include as custom entities. You can use more than one line per key. Make sure there are no leading or trailing spaces around the terms.

    2. (Optional) In the Display form column, enter a refiner name. If you leave this column empty, the term that is extracted from the content is displayed as the refiner in the same case as it occurs in the content. Use the Display form column to control and standardize the way in which the refiner is displayed.
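
The file rules in step 2 can be checked before you deploy the dictionary. The following Python sketch is one possible way to do that outside SharePoint. The function names find_dictionary_problems and check_dictionary_file are hypothetical, and comparing the header names case-insensitively is an assumption, because the article does not say whether the case of the header matters.

```python
import csv
import io


def find_dictionary_problems(csv_text):
    """Return a list of rule violations found in dictionary CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text), delimiter=","))
    if not rows:
        return ["file is empty"]

    problems = []
    # Assumption: header names are compared case-insensitively.
    header = [name.strip().lower() for name in rows[0]]
    if header != ["key", "display form"]:
        problems.append("line 1: expected the columns Key and Display form")

    for line_number, row in enumerate(rows[1:], start=2):
        if not row:
            continue  # skip blank lines
        if len(row) > 2:
            problems.append(f"line {line_number}: more than two columns")
        key = row[0]
        if not key.strip():
            problems.append(f"line {line_number}: Key is empty")
        elif key != key.strip():
            problems.append(
                f"line {line_number}: Key has leading or trailing spaces"
            )
    return problems


def check_dictionary_file(path):
    """Decode the file as UTF-8 (with or without a byte order mark) and check it."""
    # A file that is not valid UTF-8 raises UnicodeDecodeError here.
    with open(path, encoding="utf-8-sig", newline="") as handle:
        return find_dictionary_problems(handle.read())
```

An empty list means that none of the checked rules was violated. It does not guarantee that the import will succeed.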

For example, an organization named Contoso has a certification system with three levels: Contoso Beginner, Contoso Professional, and Contoso Expert. Contoso wants to extract these entities and refine on all of them. Regardless of the case in which the words "Contoso", "beginner", "professional", or "expert" are written, Contoso wants the refiners to be displayed as Contoso Beginner, Contoso Professional, and Contoso Expert. For this example, the custom entity extraction dictionary file could look like this:

Key,Display form
Contoso Beginner,Contoso Beginner
Contoso B1,Contoso Beginner
Contoso Professional,Contoso Professional
Contoso prof,Contoso Professional
Contoso Expert,Contoso Expert
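
To illustrate the effect of the Display form column, the following Python sketch mimics how a matched term could be turned into a refiner label. It is a simplified model and not SharePoint's implementation. The function name refiner_label is hypothetical, and the case-insensitive lookup assumes a Word or Word Part dictionary.

```python
# Key -> Display form, taken from the Contoso example file.
CONTOSO_DICTIONARY = {
    "Contoso Beginner": "Contoso Beginner",
    "Contoso B1": "Contoso Beginner",
    "Contoso Professional": "Contoso Professional",
    "Contoso prof": "Contoso Professional",
    "Contoso Expert": "Contoso Expert",
}


def refiner_label(found_term, dictionary):
    """Return the label shown for a term found in content, or None if no key matches."""
    # Assumption: keys are compared case-insensitively (Word / Word Part types).
    lookup = {key.lower(): display for key, display in dictionary.items()}
    if found_term.lower() not in lookup:
        return None
    # An empty Display form falls back to the term exactly as it occurs in the content.
    return lookup[found_term.lower()] or found_term


print(refiner_label("CONTOSO PROF", CONTOSO_DICTIONARY))        # Contoso Professional
print(refiner_label("contoso expert", {"Contoso Expert": ""}))  # contoso expert
```

The second call shows why the Display form column is useful: without it, the label follows whatever case the content happens to use.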

To deploy the custom entity extraction dictionary, you must import it into SharePoint 2013.

To import a custom entity extraction dictionary
  1. Verify that the user account that is importing the custom entity extraction dictionary is an administrator for the Search service application.

  2. Start the SharePoint 2013 Management Shell.

    • For Windows Server 2008 R2:

      • On the Start menu, click All Programs, click Microsoft SharePoint 2013 Products, and then click SharePoint 2013 Management Shell.

    • For Windows Server 2012:

      1. On the Start screen, click SharePoint 2013 Management Shell.

        If SharePoint 2013 Management Shell is not on the Start screen:

      2. Right-click Computer, click All apps, and then click SharePoint 2013 Management Shell.

    For more information about how to interact with Windows Server 2012, see Common Management Tasks and Navigation in Windows Server 2012.

  3. At the Windows PowerShell command prompt, type the following commands:

    $searchApp = Get-SPEnterpriseSearchServiceApplication
    Import-SPEnterpriseSearchCustomExtractionDictionary -SearchApplication $searchApp -Filename <Path> -DictionaryName <Dictionary name>

    Where:

    • <Path> specifies the full UNC path of the .csv file (the custom extraction dictionary) to be imported.

    • <Dictionary name> is the name of the type of the custom extraction dictionary.

      Depending on which type of dictionary you are importing, enter one of the following:

      • Microsoft.UserDictionaries.EntityExtraction.Custom.Word.n [where n = 1, 2, 3, 4, or 5]

      • Microsoft.UserDictionaries.EntityExtraction.Custom.ExactWord.1

      • Microsoft.UserDictionaries.EntityExtraction.Custom.WordPart.n [where n = 1, 2, 3, 4, or 5]

      • Microsoft.UserDictionaries.EntityExtraction.Custom.ExactWordPart.1
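
Because the dictionary name must follow one of the four patterns above, it can help to build it programmatically instead of typing it. The following Python sketch composes the name and the text of the import command. The function names dictionary_name and import_command and the sample path in the usage note are hypothetical. The sketch only produces the command text; you still run the command in the SharePoint 2013 Management Shell.

```python
PREFIX = "Microsoft.UserDictionaries.EntityExtraction.Custom"

# Highest allowed value of n for each dictionary type, as listed above.
MAX_DICTIONARIES = {"Word": 5, "ExactWord": 1, "WordPart": 5, "ExactWordPart": 1}


def dictionary_name(kind, n=1):
    """Return the -DictionaryName value for a dictionary type and number."""
    if kind not in MAX_DICTIONARIES:
        raise ValueError(f"unknown dictionary type: {kind}")
    if not 1 <= n <= MAX_DICTIONARIES[kind]:
        raise ValueError(f"{kind} allows n from 1 to {MAX_DICTIONARIES[kind]}")
    return f"{PREFIX}.{kind}.{n}"


def import_command(csv_path, kind, n=1):
    """Return the text of the import command; $searchApp must already be set."""
    # The path is inserted as given; quote it yourself if it contains spaces.
    return (
        "Import-SPEnterpriseSearchCustomExtractionDictionary "
        "-SearchApplication $searchApp "
        f"-Filename {csv_path} -DictionaryName {dictionary_name(kind, n)}"
    )
```

For example, import_command(r"\\server\share\contoso.csv", "Word", 1) returns the command text for the first Word dictionary, and dictionary_name("ExactWord", 2) raises ValueError because only one Word Exact dictionary is allowed.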

The following procedure describes how to associate the custom entity extraction dictionary with an existing managed property from which you want to extract custom entities. Typically, this is a managed property that you expect to contain these entities, such as the managed properties Title or Body. Custom entities are extracted from the full contents of the managed property they are associated with, even if sections in those contents are tagged as <no index>.

To specify from which existing managed property custom entities should be extracted, you edit the existing managed property. For more information about managing crawled and managed properties, see Manage the search schema in SharePoint Server 2013.

To edit a managed property for custom entity extraction
  1. Verify that the user account is the administrator of the Search service application.

  2. In Central Administration, in the Application Management section, click Manage service applications.

  3. Click the Search service application.

  4. On the Search Administration page, in the Quick Launch, under Queries and Results, click Search Schema.

  5. On the Managed Properties page, find the managed property that contains the words (or word parts) that you want to extract, and with which you want to associate the custom entity extraction dictionary. You can also enter the name of the managed property in the Filter box.

  6. Point to the managed property, click the arrow and then click Edit/Map property.

  7. On the Edit Managed Property page, edit the settings under Custom entity extraction. Select the custom entity extraction dictionary that you have imported, and then click OK.

After the next full crawl has completed, the custom entity extractor is enabled. The original managed property content is saved unchanged in the search index. In addition, depending on the type of custom entity extractor you have enabled, the extracted entities are copied to one or more of the following managed properties:

  • WordCustomRefiner1, WordCustomRefiner2, WordCustomRefiner3, WordCustomRefiner4, WordCustomRefiner5

  • WordExactCustomRefiner

  • WordPartCustomRefiner1, WordPartCustomRefiner2, WordPartCustomRefiner3, WordPartCustomRefiner4, WordPartCustomRefiner5

  • WordPartExactCustomRefiner

These managed properties are automatically configured to be searchable, queryable, retrievable, sortable and refinable.
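
The managed property that receives the extracted entities follows from the dictionary name that you used at import time. The following Python sketch encodes that relationship as a lookup. The function name refiner_property is hypothetical; the names it returns are the managed properties listed above.

```python
PREFIX = "Microsoft.UserDictionaries.EntityExtraction.Custom."


def refiner_property(dictionary_name):
    """Return the managed property that holds entities for a dictionary name."""
    if not dictionary_name.startswith(PREFIX):
        raise ValueError(f"not a custom extraction dictionary: {dictionary_name}")
    kind, _, number = dictionary_name[len(PREFIX):].partition(".")

    # Word and Word Part dictionaries are numbered 1 through 5.
    if kind in ("Word", "WordPart") and number in {"1", "2", "3", "4", "5"}:
        return f"{kind}CustomRefiner{number}"
    # The exact (case-sensitive) types have a single, unnumbered property.
    if kind == "ExactWord" and number == "1":
        return "WordExactCustomRefiner"
    if kind == "ExactWordPart" and number == "1":
        return "WordPartExactCustomRefiner"
    raise ValueError(f"unsupported dictionary name: {dictionary_name}")
```

Note that the exact types swap the word order: the dictionary name uses ExactWord, while the managed property is WordExactCustomRefiner.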

You can use the extracted custom entities as refiners in the search results page. The refiners based on the custom entities are available in the Refinement Web Part.

To add a refiner based on a custom entity extractor
  1. Verify that the user account that performs this procedure is a member of the Designers SharePoint group on the Enterprise Search Center site.

  2. Browse to the page that contains the Refinement Web Part that you want to configure, click the Settings menu and then click Edit Page.

  3. Edit the Refinement Web Part. Click the Refinement Web Part Menu arrow, and then click Edit Web Part.

    1. In the Web Part tool pane, in the Properties for Search Refinement section, verify that Choose Refiners in this Web Part is selected.

    2. Click Choose Refiners.

    3. On the Refinement configuration page, in the Available refiners section, select one or more managed properties that contain the extracted entities that you want to show as refiners, and then click Add. For example, if you have deployed a word extraction dictionary, choose WordCustomRefiner1.

    4. In the Configure for section, configure how you want each refiner to appear.

  4. Click OK.

Overview of custom entity extractor types

The following table shows the types of custom extraction dictionaries that you can create, how the dictionary entries are matched with content in the search index, which dictionary name to use when you deploy the dictionary, and which managed property will contain the extracted entities.

 

| Custom entity extractor / custom entity extractor dictionary | Description | Example | Dictionary name to use in Windows PowerShell | Managed property that will contain the extracted entity |
|---|---|---|---|---|
| Word Extraction | Case-insensitive, dictionary entries matching tokenized content, maximum 5 dictionaries. | The entry "anchor" matches "anchor" and "Anchor," but not "anchorage" | Microsoft.UserDictionaries.EntityExtraction.Custom.Word.n [where n = 1, 2, 3, 4, or 5] | WordCustomRefiner1, WordCustomRefiner2, WordCustomRefiner3, WordCustomRefiner4, WordCustomRefiner5 |
| Word Part Extraction | Case-insensitive, dictionary entries matching un-tokenized content, maximum 5 dictionaries. | The entry "anchor" matches "anchor," "Anchor" and "anchorage" | Microsoft.UserDictionaries.EntityExtraction.Custom.WordPart.n [where n = 1, 2, 3, 4, or 5] | WordPartCustomRefiner1, WordPartCustomRefiner2, WordPartCustomRefiner3, WordPartCustomRefiner4, WordPartCustomRefiner5 |
| Word Exact Extraction | Case-sensitive, dictionary entries matching tokenized content, maximum 1 dictionary. | The entry "anchor" matches "anchor," but not "Anchor" or "Anchorage" | Microsoft.UserDictionaries.EntityExtraction.Custom.ExactWord.1 | WordExactCustomRefiner |
| Word Part Exact Extraction | Case-sensitive, dictionary entries matching un-tokenized content, maximum 1 dictionary. | The entry "anchor" matches "anchor" and "anchorage," but not "Anchor" | Microsoft.UserDictionaries.EntityExtraction.Custom.ExactWordPart.1 | WordPartExactCustomRefiner |
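
The four matching behaviors described above differ along two axes: case sensitivity, and whether an entry must match whole tokens or may match part of a word. The following Python sketch models those two axes so that you can try the "anchor" examples yourself. It is an approximation and not SharePoint's matcher: the function name entry_matches is hypothetical, and splitting content into tokens with the regular expression \w+ is an assumption that stands in for the real tokenizer.

```python
import re

# (matches word parts?, case-sensitive?) for each extractor type.
EXTRACTOR_TYPES = {
    "Word": (False, False),
    "Word Part": (True, False),
    "Word Exact": (False, True),
    "Word Part Exact": (True, True),
}


def entry_matches(extractor_type, entry, content):
    """Return True if a dictionary entry would match content under this model."""
    part, exact = EXTRACTOR_TYPES[extractor_type]
    if not exact:
        entry, content = entry.lower(), content.lower()
    if part:
        # Un-tokenized matching: the entry may occur anywhere, even inside a word.
        return entry in content
    # Tokenized matching: the entry's tokens must equal a run of whole tokens.
    # Assumption: \w+ approximates the tokenizer.
    entry_tokens = re.findall(r"\w+", entry)
    content_tokens = re.findall(r"\w+", content)
    size = len(entry_tokens)
    return size > 0 and any(
        content_tokens[i:i + size] == entry_tokens
        for i in range(len(content_tokens) - size + 1)
    )


for name in EXTRACTOR_TYPES:
    results = {text: entry_matches(name, "anchor", text)
               for text in ("anchor", "Anchor", "anchorage")}
    print(name, results)
```

The loop at the end prints, for each extractor type, whether the entry "anchor" matches "anchor", "Anchor", and "anchorage", which you can compare with the Example column.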

© 2014 Microsoft