Configure and use the Documentum connector (Search Server 2010)

 

Applies to: Search Server 2010

Topic Last Modified: 2013-04-18

Microsoft SharePoint 2010 Indexing Connector for Documentum enables Microsoft SharePoint Server 2010 and Microsoft Search Server 2010 products to index content that is stored in the EMC Documentum system. This article describes how to install and configure the Indexing Connector for Documentum for use with Microsoft Search Server 2010.

To download the Indexing Connector for Documentum from the Microsoft Download Center, use the following link: Microsoft SharePoint 2010 Indexing Connector for Documentum (https://go.microsoft.com/fwlink/p/?LinkId=191180&clcid=0x409).

The Indexing Connector for Documentum includes the following features:

  • Based on the SharePoint 2010 Search Connector Framework

  • 64-bit connector

  • One connector supports multiple versions of EMC Documentum Content Server

  • Indexes Documentum objects and object metadata

  • Supports Documentum security definitions and policies

  • Supports Windows PowerShell for automated configuration and administration

  • Configurable search results URL to support multiple Documentum client applications

  • Supports file and folder exclusion for crawling

The following lists describe supported and unsupported object types and properties for the Indexing Connector for Documentum.

Supported container objects and properties:

  • dm_cabinet and subtypes

  • dm_Folder and subtypes

  • r_object_type

  • object_name

  • title

  • subject

  • keywords

  • owner_name

  • r_creator_name

  • r_creation_date

  • r_modifier

  • r_modify_date

  • cabinetpath

  • folderpath

Supported document objects and properties:

  • dm_document

  • authors

  • keywords

  • r_full_content_size

  • r_creation_date

  • object_name

  • r_modify_date

  • r_modifier

  • subject

  • title

  • r_object_type

  • a_content_type

  • owner_name

  • r_version_label

  • r_lock_date

  • r_lock_owner

  • r_policy_id

  • r_current_state

  • log_entry

  • r_creator_name

  • r_access_date

  • a_storage_type

  • i_retain_until

  • ContainerPath

  • All custom properties

Unsupported object types:

  • Temp cabinets

  • Temp folders

  • Temp files

Install and configure the prerequisites for the Indexing Connector for Documentum

Use the following procedure to install and configure the prerequisites for the Indexing Connector for Documentum. The steps are listed in the order that they must be performed.

The SharePoint 2010 Indexing Connector for Documentum has the following software prerequisites:

  • One of the following SharePoint Server 2010, Search Server 2010, or FAST Search Server 2010 for SharePoint products:

    • Microsoft SharePoint Server 2010

    • Microsoft Search Server 2010

    • Microsoft Search Server 2010 Express

    • Microsoft SharePoint 2010 for Internet Sites Enterprise

    • Microsoft SharePoint 2010 for Internet Sites Standard

    • Microsoft FAST Search Server 2010 for SharePoint

    • Microsoft FAST Search Server 2010 for SharePoint Internet Sites

  • You must use DFS Server v6.5 with SP2 with DFS hotfix 1049. This server needs to be configured and connected to all repositories.

  • You must use DFS Productivity Layer v6.5 with SP2 and DFS hotfix 1049 .NET assemblies.

    The .NET assemblies are included in the DFS hotfix 1049 package. You can obtain the DFS hotfix 1049 package, which includes both a server side patch as well as a client side patch, by opening a service request on the EMC Powerlink Web site: http://powerlink.emc.com. Alternatively, you can contact your EMC customer representative.

    The Indexing Connector for Documentum uses EMC DFS (Documentum Foundation Services) as the connectivity application programming interface (API) to access Documentum repositories. Therefore, you must install and configure DFS Productivity Layer (client of DFS Server) .NET components on the Search Server 2010 crawl server where the Indexing Connector for Documentum will be installed.

To install and configure the prerequisites for the Indexing Connector for Documentum

  1. Log on using a user account that is a member of the Administrators group on the computer where the Documentum content access account is created.

  2. Create a Documentum content access account for crawling. The Indexing Connector for Documentum uses a Documentum content access account to retrieve content from the Documentum repository. This account must have the following permissions:

    • At least read permission to documents that you want to crawl.

    • At least browse permission to cabinets, folders, and records (documents with only metadata) that you want to crawl.

  3. On each crawl server, deploy DFS Productivity Layer .NET assemblies to the global assembly cache %windir%\assembly. There are four DLLs that are used by the Indexing Connector for Documentum. Verify the DLL names and versions before you deploy them into the global assembly cache. The following files are included in the DFS1049 Hotfix and when extracted to the default path are located in the following directory: %local%\emc-dfs-sdk-6.5\emc-dfs-sdk-6.5\lib\dotnet:

    • Emc.Documentum.FS.DataModel.Core.dll, version number 6.5.0.231

    • Emc.Documentum.FS.DataModel.Shared.dll, version number 6.5.0.231

    • Emc.Documentum.FS.runtime.dll, version number 6.5.0.231

    • Emc.Documentum.FS.Services.Core.dll, version number 6.5.0.231

    Note

    You can drag and drop the four DLLs into the global assembly cache (%windir%\assembly) to deploy them, but you might have to turn off User Account Control to do this.

  4. For the DFS Productivity Layer .NET assemblies to function correctly, you must update the .NET machine.config file to include WCF settings for the DFS Productivity Layer. On each crawl server, open the machine.config file located in the following directory: %windir%\Microsoft.NET\Framework64\v2.0.50727\CONFIG. The WCF settings in the following step allow a maximum of 30 megabytes (MB) per transferred Documentum content object (the document file plus its metadata). To allow larger content, increase the maxReceivedMessageSize value in the DfsDefaultService binding. By default, SharePoint search handles files up to 16 MB. To crawl files larger than 16 MB, follow the optional step.

    1. Go to %windir%\Microsoft.NET\Framework64\v2.0.50727\CONFIG, open the machine.config file, and then add the following XML snippet into the <configuration> element:

      <system.serviceModel>
        <bindings>
          <basicHttpBinding>
            <binding name="DfsAgentService" closeTimeout="00:01:00"
                     openTimeout="00:01:00" receiveTimeout="00:10:00" sendTimeout="00:01:00"
                     allowCookies="false" bypassProxyOnLocal="false" hostNameComparisonMode="StrongWildcard"
                     maxBufferSize="10000000" maxBufferPoolSize="10000000" maxReceivedMessageSize="10000000"
                     messageEncoding="Text" textEncoding="utf-8" transferMode="Buffered"
                     useDefaultWebProxy="true">
              <readerQuotas maxDepth="32" maxStringContentLength="8192" maxArrayLength="16384"
                            maxBytesPerRead="4096" maxNameTableCharCount="16384" />
              <security mode="None">
                <transport clientCredentialType="None" proxyCredentialType="None"
                           realm="" />
                <message clientCredentialType="UserName" algorithmSuite="Default" />
              </security>
            </binding>

            <binding name="DfsContextRegistryService" closeTimeout="00:01:00"
                     openTimeout="00:01:00" receiveTimeout="00:10:00" sendTimeout="00:01:00"
                     allowCookies="false" bypassProxyOnLocal="false" hostNameComparisonMode="StrongWildcard"
                     maxBufferSize="10000000" maxBufferPoolSize="10000000" maxReceivedMessageSize="10000000"
                     messageEncoding="Text" textEncoding="utf-8" transferMode="Buffered"
                     useDefaultWebProxy="true">
              <readerQuotas maxDepth="32" maxStringContentLength="8192" maxArrayLength="16384"
                            maxBytesPerRead="4096" maxNameTableCharCount="16384" />
              <security mode="None">
                <transport clientCredentialType="None" proxyCredentialType="None"
                           realm="" />
                <message clientCredentialType="UserName" algorithmSuite="Default" />
              </security>
            </binding>

            <binding name="DfsDefaultService" closeTimeout="00:01:00"
                     openTimeout="00:10:00" receiveTimeout="00:20:00" sendTimeout="00:10:00"
                     allowCookies="false" bypassProxyOnLocal="false" hostNameComparisonMode="StrongWildcard"
                     maxBufferSize="10000000" maxBufferPoolSize="10000000" maxReceivedMessageSize="30000000"
                     messageEncoding="Text" textEncoding="utf-8" transferMode="StreamedResponse"
                     useDefaultWebProxy="true">
              <readerQuotas maxDepth="32" maxStringContentLength="8192" maxArrayLength="16384"
                            maxBytesPerRead="1048576" maxNameTableCharCount="16384"/>
              <security mode="None">
                <transport clientCredentialType="None" proxyCredentialType="None" realm=""/>
                <message clientCredentialType="UserName" algorithmSuite="Default"/>
              </security>
            </binding>
          </basicHttpBinding>
        </bindings>
      </system.serviceModel>
      
    2. (Optional) To crawl files larger than 16 MB, get the Search service application and store it in a variable, retrieve the current value of MaxDownloadSize, and then set it to the maximum file size that you want to crawl.

      1. Run this procedure on the crawl server.

      2. On the Start menu, click All Programs.

      3. Click Microsoft SharePoint 2010 Products.

      4. Click SharePoint 2010 Management Shell.

      5. At the Windows PowerShell command prompt, type the following command(s):

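        # Get the Search service application and store it in a variable.
        $ssa = Get-SPEnterpriseSearchServiceApplication
        # Display the current maximum download size.
        $ssa.GetProperty("MaxDownloadSize")
        # Set the new maximum download size, and then save the change.
        $ssa.SetProperty("MaxDownloadSize", <File size larger than 16>)
        $ssa.Update()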
        

        Where:

        • <File size larger than 16> is the maximum file size that you want to allow the Indexing Connector for Documentum to crawl.

  5. The Indexing Connector for Documentum will crawl the Documentum document Access Control List (ACL) and map this list to the system ACLs. This allows users to search documents that they have permission to read in Documentum. The Indexing Connector for Documentum supports three kinds of ACL translations that you can configure in DCTMConfig.xml by using the following Windows PowerShell cmdlet: Set-SPEnterpriseSearchDCTMConnectorConfig.

    The following list provides Windows PowerShell configuration options for setting up the system ACLs:

    • No security

      The Indexing Connector for Documentum will ignore Documentum ACLs during crawl and every SharePoint user can search all crawled documents.

    • Assume same account

      When Documentum and SharePoint 2010 Products are both using Active Directory Domain Services (AD DS) or Active Directory directory service, the Indexing Connector for Documentum assumes a user or group is using the same account in both systems.

    • Translate ACL according to user mapping table

      If Documentum and SharePoint 2010 Products are not both using AD DS or Active Directory and you want to enable the security search, you have to set up a user mapping table to specify how to do the ACL translation.

  6. The user mapping table requires the following:

    • User mapping table must be in a Microsoft SQL Server 2008 or later database.

    • The OSearch14 service account must have at least read permission on the user mapping table data.

    The user mapping table contains the following columns:

    DCTMCredentialDomain

    Domain name of a Documentum account. Populate this column when the account comes from the local computer or an LDAP system, that is, when the User Source property of the Documentum account is None or LDAP. Otherwise, leave the column empty.

    DCTMCredentialRepository

    Repository name of a Documentum account. Populate this column when the account comes from a Documentum repository.

    DCTMCredentialLogonName

    Logon name of the Documentum account

    NTCredential

    Windows domain user account that searches Documentum contents in SharePoint Server

    Example: A Documentum repository user Dan Park has a logon that is linked to the Finance repository. Dan's Windows domain user account is Litwareinc\dpark. In this case, the user mapping table entry for Dan appears as the following:

    DCTMCredentialDomain

    ""

    DCTMCredentialRepository

    Finance

    DCTMCredentialLogonName

    dpark

    NTCredential

    Litwareinc\dpark

    Note

    Cells that have no value assigned cannot be NULL. Assign the empty string value ('') instead.

    For each Documentum group, there must be a corresponding NT group in the user mapping table, and both groups must contain the same users.

    Use the following script to create a user mapping table:

    CREATE TABLE <replace with your user mapping table name>
    (
    DCTMCredentialDomain nvarchar (255) NOT NULL , 
    DCTMCredentialRepository nvarchar (32) NOT NULL , 
    DCTMCredentialLogonName nvarchar (80) NOT NULL , 
    NTCredential nvarchar (255) NOT NULL , 
    CONSTRAINT PK_CredentialMapping PRIMARY KEY CLUSTERED 
    ( DCTMCredentialDomain, DCTMCredentialRepository, DCTMCredentialLogonName )
    ) 
    

    Populate the new mapping table with Documentum/NT Credential pairs as seen in the above table. Grant the OSearch14 account read access to this table.
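
As a minimal sketch of populating the mapping table, the following Windows PowerShell function builds the INSERT statement for one Documentum/NT credential pair, such as the Dan Park example above. The function name New-MappingInsertStatement and the table name dbo.DCTMUserMapping are hypothetical and are not part of the connector. The function only produces text, which you would review and then run against your own mapping database with your usual SQL Server tools.

```powershell
# Hypothetical helper (not part of the connector): builds one T-SQL INSERT
# statement for a row of the user mapping table.
function New-MappingInsertStatement {
    param(
        [Parameter(Mandatory = $true)][string]$TableName,
        [string]$Domain = '',
        [string]$Repository = '',
        [Parameter(Mandatory = $true)][string]$LogonName,
        [Parameter(Mandatory = $true)][string]$NTCredential
    )
    # Double any single quotes so each value is a valid T-SQL string literal.
    # Columns without a value become the empty string (N''), never NULL.
    $values = @($Domain, $Repository, $LogonName, $NTCredential) |
        ForEach-Object { "N'" + ($_ -replace "'", "''") + "'" }
    "INSERT INTO $TableName (DCTMCredentialDomain, DCTMCredentialRepository, DCTMCredentialLogonName, NTCredential) VALUES (" + ($values -join ', ') + ");"
}

# Row for the Dan Park example; the table name is an assumption.
New-MappingInsertStatement -TableName 'dbo.DCTMUserMapping' -Repository 'Finance' -LogonName 'dpark' -NTCredential 'Litwareinc\dpark'
```

Omitting -Domain leaves that column as the empty string rather than NULL, which is what the table definition (NOT NULL on every column) requires.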

Install and configure the Indexing Connector for Documentum

Use the following procedure to install and configure the Indexing Connector for Documentum.

To install and configure the Indexing Connector for Documentum

  1. See Add-SPShellAdmin. This article contains information that can help you to verify the permissions that are required to perform this procedure.

  2. Open the Windows PowerShell command console.

  3. On each server in the farm that is running a crawl component, run the Indexing Connector for Documentum installer, DCTMIndexConn.exe, and then follow the steps in the installation wizard.

  4. On the crawl server, use the following Windows PowerShell cmdlet to register the indexing connector with the Search service applications: New-SPEnterpriseSearchCrawlCustomConnector

  5. Use the following example for a single Search service application: New-SPEnterpriseSearchCrawlCustomConnector -SearchApplication "<name of your Search service application>" -Protocol "dctm" -ModelFilePath "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\CONFIG\SearchConnectors\Documentum\MODEL.xml" -Name "Microsoft SharePoint 2010 Indexing Connector for Documentum"

  6. Use the following example for all Search service applications on the farm: Get-SPEnterpriseSearchServiceApplication | New-SPEnterpriseSearchCrawlCustomConnector -Protocol "dctm" -ModelFilePath "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\CONFIG\SearchConnectors\Documentum\MODEL.xml" -Name "Microsoft SharePoint 2010 Indexing Connector for Documentum"

  7. On each crawl server, set configuration details using the following Windows PowerShell cmdlet: Set-SPEnterpriseSearchDCTMConnectorConfig. All the settings are stored in \Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\CONFIG\SearchConnectors\Documentum\DCTMConfig.xml. If more than one crawl server is used, all settings must be the same on each server.

    Use the following Windows PowerShell commands to display help and examples for the Indexing Connector for Documentum:

    • Get-Help Set-SPEnterpriseSearchDCTMConnectorConfig -Full shows full help.

    • Get-Help Set-SPEnterpriseSearchDCTMConnectorConfig -Examples shows only examples.

    The following table describes the important parameters of the Set-SPEnterpriseSearchDCTMConnectorConfig cmdlet.

    ACLTranslation

    Directs the behavior of the ACL translation. Use one of the following values:

      • UserMappingTable: Default value. The Indexing Connector for Documentum translates the Documentum ACL into a Windows ACL according to the user mapping table. UserMappingTableSQLServer, UserMappingTableSQLInstance, UserMappingTableName, and UnmappedAccount take effect only when ACLTranslation is set to "UserMappingTable".

      • NoSecurity: The Indexing Connector for Documentum ignores the Documentum ACL during the crawl, so all documents from Documentum are searchable by any SharePoint user. Use this option to decline enforcement of security trimming or to implement custom security trimming.

      • SameAccountName: The Indexing Connector for Documentum assumes that Documentum and SharePoint users share the same account, such as a shared account in Active Directory. If an account is not a valid NT account, the Indexing Connector for Documentum discards the permission for that account.

    UnmappedAccount

    Defines how to process Documentum accounts that have no corresponding Windows account defined in the user mapping table. Use one of the following values:

      • DiscardACE: Default value. The Indexing Connector for Documentum discards a Documentum account when no mapped Windows account is found. If any other account for the document can be mapped, the document is crawled. If none of the accounts for the document can be mapped, the document is discarded and an error message is entered in the crawl log.

      • AssumeSameAccount: Assumes that an NT account with the same name exists for the Documentum account.

    UserMappingTableSQLServer

    Host name of the computer that is running SQL Server which contains the user mapping table.

    UserMappingTableSQLInstance

    Name of the SQL Server instance that contains the user mapping table.

    UserMappingTableDBName

    Name of the SQL Server database that contains the user mapping table.

    UserMappingTableName

    Name of the user mapping table.

    DisplayURLPatternForDocument

    DisplayURL pattern for documents. Any valid URL with part of the string replaced by placeholders such as <ObjectId>, <RepositoryName> or <Format>. For example, a URL for a document accessible from Documentum Webtop might appear as follows: "http://WebtopMACHINE_NAME:PORT_NUMBER/webtop/drl/objectId/<ObjectId>/format/<Format>".

    DisplayURLPatternForContainer

    DisplayURL pattern for folders and cabinets. Any valid URL with part of the string replaced by placeholders such as <ObjectId>, <RepositoryName> or <Format>. For example, a URL for a folder or cabinet which is accessible from Documentum Webtop might appear as "http://WebtopMACHINE_NAME:PORT_NUMBER/webtop/drl/objectId/<ObjectId>".

    DFSURL

    Specify the DFS Web Services URL for each repository that is to be crawled. You can specify more than one DFS Web Services URL for each repository. Use the following format: "RepositoryName1\DFSURL1.1\DFSURL1.2\...\DFSURL1.n\\RepositoryName2\DFSURL2.1\DFSURL2.2\...\DFSURL2.n\..."

    PersistDCTMACL

    Specify whether to store the Documentum ACL in a crawled property. If "PersistDCTMACL" is set to "True", the Indexing Connector for Documentum will store the Documentum ACL information as a crawled property. The default value is "False".

    Example 1: Set to "UserMappingTable" mode.
    Set-SPEnterpriseSearchDCTMConnectorConfig -ACLTranslation "UserMappingTable" -UnmappedAccount "DiscardACE" -UserMappingTableSQLServer "<YourDatabaseServerName>" -UserMappingTableSQLInstance "<YourDatabaseInstanceName>" -UserMappingTableDBName "<YourMappingDatabaseName>"  -UserMappingTableName "<YourMappingTableName>" -DFSURL "RepositoryName1\http://MACHINENAME1:PORT1/services\\RepositoryName2\http://MACHINENAME2:PORT2/services\http://MACHINENAME3:PORT3/services" -DisplayURLPatternForDocument "http://MACHINENAME4:PORT4/webtop/component/drl?objectId={ObjectId}&format={Format}&RepositoryName={RepositoryName}" -DisplayURLPatternForContainer "http://MACHINENAME5:PORT5/webtop/component/drl?objectId={ObjectId}&RepositoryName={RepositoryName}"
    
    Example 2: Set to "NoSecurity" mode.
    Set-SPEnterpriseSearchDCTMConnectorConfig -ACLTranslation "NoSecurity" -DFSURL  "RepositoryName1\http://MACHINENAME1:PORT1/services\\RepositoryName2\http://MACHINENAME2:PORT2/services\http://MACHINENAME3:PORT3/services" -DisplayURLPatternForDocument "http://MACHINENAME4:PORT4/webtop/component/drl?objectId={ObjectId}&format={Format}&RepositoryName={RepositoryName}" -DisplayURLPatternForContainer "http://MACHINENAME5:PORT5/webtop/component/drl?objectId={ObjectId}&RepositoryName={RepositoryName}"
    
    
    Example 3: Set to "SameAccountName" mode.
    Set-SPEnterpriseSearchDCTMConnectorConfig -ACLTranslation "SameAccountName" -DFSURL "RepositoryName1\http://MACHINENAME1:PORT1/services\\RepositoryName2\http://MACHINENAME2:PORT2/services\http://MACHINENAME3:PORT3/services" -DisplayURLPatternForDocument "http://MACHINENAME4:PORT4/webtop/component/drl?objectId={ObjectId}&format={Format}&RepositoryName={RepositoryName}" -DisplayURLPatternForContainer "http://MACHINENAME5:PORT5/webtop/component/drl?objectId={ObjectId}&RepositoryName={RepositoryName}"
    
  8. After setting the configuration details, restart the OSearch14 service on each crawl server.
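
The DFSURL format used in step 7 is easy to mistype, because a single backslash separates a repository name from each of its URLs and a double backslash separates one repository from the next. The following sketch assembles the string from a hash table. The function name Join-DfsUrl is hypothetical, and sorting the repository names only makes the output predictable; the sketch assumes that the order of repositories does not matter to the connector.

```powershell
# Hypothetical helper (not part of the connector): builds the -DFSURL value
# from a hash table that maps each repository name to one or more DFS URLs.
function Join-DfsUrl {
    param(
        [Parameter(Mandatory = $true)][hashtable]$Repositories
    )
    $groups = foreach ($name in ($Repositories.Keys | Sort-Object)) {
        # Repository name first, then its URLs, joined by a single backslash.
        (@($name) + @($Repositories[$name])) -join '\'
    }
    # Repository groups are joined by a double backslash.
    @($groups) -join '\\'
}

# Same repositories and URLs as in the examples in step 7.
$dfsUrl = Join-DfsUrl -Repositories @{
    'RepositoryName1' = 'http://MACHINENAME1:PORT1/services'
    'RepositoryName2' = @('http://MACHINENAME2:PORT2/services', 'http://MACHINENAME3:PORT3/services')
}
$dfsUrl
```

The resulting string can then be passed as the -DFSURL argument of Set-SPEnterpriseSearchDCTMConnectorConfig in place of the hand-typed value.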

Create a crawl rule for the Indexing Connector for Documentum

Before a crawl, create crawl rules to include or exclude specific content in Documentum. Use the following procedure to create a crawl rule for the Indexing Connector for Documentum.

To create a crawl rule for the Indexing Connector for Documentum

  1. Verify that the user account that is performing this procedure is an administrator for the Search service application.

  2. Open SharePoint Central Administration, and then click Manage Service Applications.

  3. Click the Search service application where you want to create a crawl rule.

  4. Under Crawling, click Crawl Rules.

  5. On the Manage Crawl Rules page, click New Crawl Rule.

  6. On the Add Crawl Rule page, specify the following information to create at least one crawl rule:

    1. In the Path box, type the path of the content that you want to crawl. You can use the wildcard character "*" or regular expression syntax.

      Because Documentum uses case sensitive names for the content, select the Match case check box.

    2. In the Crawl Configuration section, select Include all items in this path, and then select Crawl complex URLs (URLs that contain a question mark - ?).

    3. In the Specify Authentication section, select Specify a different content access account, and then type the Documentum content access account and password that you specified earlier in this article in the appropriate boxes.

    4. Make sure that the Do not allow Basic Authentication check box is cleared.

  7. Click OK to finish configuration.

    Note

    • You can create multiple crawl rules for Documentum to include or exclude Documentum content.

    • You can use different crawl rules to specify different content access accounts for different Documentum content. For example, you might have two repositories, each with its own content access account. The Documentum content access account specified in a crawl rule is applied only to the Documentum content covered by the path in that crawl rule.

    The format of the path that you use to refer to a Documentum object is defined in the following table.

    For repository

    dctm://<clientapphostname>/<repository name>

    For cabinet

    dctm://<clientapphostname>/<repository name>/<cabinet name>

    For folder

    dctm://<clientapphostname>/<repository name>/<cabinet name>/<folder name>

    For document

    dctm://<clientapphostname>/<repository name>/<cabinet name>/<folder name>/…/<folder name>?DocSysID=<r_object_id>

    where <r_object_id> is the object ID of the document.

<clientapphostname> is the host name of your Documentum client application, such as Webtop or DA. The <clientapphostname> configured here should be the same as the one used in the content source. <repository name>, <cabinet name>, and <folder name> are case sensitive.
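
To illustrate the path formats in the preceding table, the following sketch builds a dctm:// path from its parts. The function name New-DctmPath and the sample host, repository, cabinet, folder, and object ID values are hypothetical. The sketch passes names through unchanged, because they are case sensitive, and it makes no assumption about how names that contain special characters should be encoded.

```powershell
# Hypothetical helper (not part of the connector): builds a dctm:// path
# for a repository, cabinet, folder, or document.
function New-DctmPath {
    param(
        [Parameter(Mandatory = $true)][string]$ClientAppHostName,
        [Parameter(Mandatory = $true)][string]$Repository,
        [string]$Cabinet,
        [string[]]$Folders = @(),
        [string]$ObjectId
    )
    $segments = @($Repository)
    if ($Cabinet) {
        # Folders are only meaningful below a cabinet.
        $segments += $Cabinet
        $segments += $Folders
    }
    $path = 'dctm://' + $ClientAppHostName + '/' + ($segments -join '/')
    # A document path ends with its r_object_id in the DocSysID parameter.
    if ($ObjectId) { $path += '?DocSysID=' + $ObjectId }
    $path
}

# Repository, folder, and document paths with made-up names.
New-DctmPath -ClientAppHostName 'webtop01' -Repository 'Finance'
New-DctmPath -ClientAppHostName 'webtop01' -Repository 'Finance' -Cabinet 'Reports' -Folders '2010', 'Q1'
New-DctmPath -ClientAppHostName 'webtop01' -Repository 'Finance' -Cabinet 'Reports' -Folders '2010' -ObjectId '0900000180001234'
```

Paths built this way can be used both in the Path box of a crawl rule and as start addresses of a content source, provided that the same host name is used in both places.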

Create a content source for the Indexing Connector for Documentum

Use the following procedure to create a content source.

To create a content source for the Indexing Connector for Documentum

  1. Verify that the user account that is performing this procedure is an administrator for the Search service application.

  2. Open SharePoint Central Administration, and then click Manage Service Applications.

  3. Click the Search service application in which you want to create a content source.

  4. On the Search Administration page, in the Quick Launch, click Content Sources.

  5. On the Manage Content Sources page, click New Content Source.

  6. On the Add Content Source page, do the following:

    1. In the Name box, type the name of the content source.

    2. In the Content Source Type section, select Custom Repository.

    3. In the Type of Repository section, select SharePoint 2010 Indexing Connector for Documentum. Use the name that you specified when you registered the Indexing Connector for Documentum with the Search service application.

    4. In the Start Addresses section, type the start addresses. A start address uses the same dctm:// path format that is used for crawl rules. You can type more than one start address for the content source, one per line.

    5. In the Crawl Schedules section, select schedules from the Full Crawl and Incremental Crawl drop-down lists, or create schedules for each kind of crawl.

    6. In the Content Source Priority section, assign a priority level to the content source according to your business requirements.

    7. Select Start full crawl of this content source to start a crawl immediately after the content source is created.

    8. Click OK to finish the configuration and accept all configured options.

    The Documentum content source is configured and the system can crawl Documentum content repositories that are specified in the content source.

Search Server 2010 supports scalable architecture for performance scale-out. You can deploy more than one crawl server and configure multiple crawlers to crawl the EMC Documentum database simultaneously.