Remove Duplicates with UI Component Sample

The Remove Duplicates sample demonstrates how to implement a data flow transformation component that has asynchronous outputs. A component with asynchronous outputs receives both an input PipelineBuffer and an output PipelineBuffer, which correspond to the input and output of the component. The input buffers contain rows provided by upstream components. The output buffer starts empty, and the component fills it during calls to the ProcessInput method, typically by using rows from the input buffer. In this sample, after all the rows have been received, they are sorted, and then the distinct rows are sent to one output and the duplicate rows to the other. This sample is not supported on Itanium-based operating systems.
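The buffering behavior described above can be sketched outside Integration Services. The following Python model is only an illustration, not the sample's C# code: the class name, method names, and the end_of_rowset flag are hypothetical stand-ins, and rows are modeled as plain tuples whose values can be compared with one another. It caches rows from each input buffer, and only after the last buffer arrives does it sort the rows and route them to a distinct output and a duplicate output.

```python
from typing import Iterable, List, Tuple

Row = Tuple  # a row is modeled as a plain tuple of column values


class RemoveDuplicatesSketch:
    """Toy model of an asynchronous transformation with two outputs."""

    def __init__(self) -> None:
        self._cached: List[Row] = []          # rows held until all input has arrived
        self.distinct_output: List[Row] = []  # first occurrence of each row
        self.duplicate_output: List[Row] = [] # every repeated occurrence

    def process_input(self, buffer: Iterable[Row], end_of_rowset: bool) -> None:
        # Called once per input buffer; nothing is emitted until the last one.
        self._cached.extend(buffer)
        if end_of_rowset:
            self._emit()

    def _emit(self) -> None:
        previous = None
        first = True
        # Sorting places equal rows next to each other, so one pass is enough.
        for row in sorted(self._cached):
            if first or row != previous:
                self.distinct_output.append(row)
            else:
                self.duplicate_output.append(row)
            previous, first = row, False
        self._cached.clear()


component = RemoveDuplicatesSketch()
component.process_input([(2, "b"), (1, "a")], end_of_rowset=False)
component.process_input([(1, "a"), (2, "b"), (3, "c")], end_of_rowset=True)
print(component.distinct_output)   # [(1, 'a'), (2, 'b'), (3, 'c')]
print(component.duplicate_output)  # [(1, 'a'), (2, 'b')]
```

Because no row can be classified until every row has been seen, the outputs cannot reuse the input buffers row for row, which is why the component needs asynchronous outputs.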

Important

The Integration Services Data Flow Programming code samples are intended to demonstrate the core functionality that you must implement to create a custom data flow component. The samples do not include full support for customization in the Advanced Editor. For example, you cannot use the Advanced Editor to add or remove inputs and outputs or to configure columns. Samples are provided for educational purposes only. They are not intended to be used in a production environment and have not been tested in a production environment. Microsoft does not provide technical support for these samples.

Running the Sample

If you already know how to locate, build, and install code samples, you can go directly to the section "Testing the Sample" to read about how to configure and run the code sample.

Prerequisites

This sample requires that the following components be installed.

  • Microsoft Visual Studio 2005
  • Microsoft SQL Server 2005 Integration Services

Location

If the code samples were installed to the default installation location, the C# version of the code sample is located in the following folder:

C:\Program Files\Microsoft SQL Server\90\Samples\Integration Services\Programming Samples\Data Flow\RemoveDuplicatesWithUI Component Sample\CS

Note

The version of the RemoveDuplicates sample that includes a custom user interface is provided in the C# programming language only.

For information about the two-step process required to install the samples, see Installing Samples. To obtain the latest version of the samples, including new samples released after the original release of SQL Server 2005, see SQL Server 2005 Samples and Sample Databases (April 2006).

Building the Sample

If you have not already generated a strong name key file in the Samples folder, use the following procedure to generate this key file. The sample projects are configured to sign assemblies at build time with this key file. You can view the signing properties on the Signing tab of the Project Properties dialog box.

To generate a strong name key file

  1. To open a Microsoft Visual Studio 2005 command prompt, click Start, point to All Programs, point to Microsoft Visual Studio 2005, point to Visual Studio Tools, and then click Visual Studio 2005 Command Prompt.

    - or -

    To open a Microsoft .NET Framework command prompt, click Start, point to All Programs, point to Microsoft .NET Framework SDK 2.0, and then click SDK Command Prompt.

  2. At the command prompt, use the change directory (CD) command to change the current folder of the Command Prompt window to the Samples folder. The key file that you create in this folder will be used by all SQL Server 2005 code samples.

    Note

    To determine the folder where samples are located, click Start, point to All Programs, point to Microsoft SQL Server 2005, point to Documentation and Tutorials, and then click Samples Directory. If the default installation location was used, the samples are located in <system_drive>:\Program Files\Microsoft SQL Server\90\Samples.

  3. At the command prompt, run the following command to generate the key file:

    sn -k SampleKey.snk
    

    Important

    For more information about the strong-name key pair, see "Security Briefs: Strong Names and Security in the .NET Framework" in the .NET Development Center on MSDN.

  4. You will need the public key token from the key file in a subsequent step. To obtain the public key token, first extract the public key from the key file to a new file by running the following command at the command prompt:

    sn -p SampleKey.snk SampleKeyPublic.snk
    

    Now display the public key token from the new file by running the following command at the command prompt:

    sn -t SampleKeyPublic.snk
    
  5. Copy the public key token to the clipboard or save it for later use.

To build the sample in Microsoft Visual Studio 2005

  1. On the File menu, point to Open, click Project, and then open RemoveDuplicatesWithUI.sln.

  2. Locate the DtsPipelineComponent attribute before the class declaration in the RemoveDuplicates.cs file, and replace the alphanumeric value of the Public Key Token in the UITypeName property of the attribute with the public key token obtained earlier from the key file.

  3. From the Build menu, click Build RemoveDuplicatesWithUI to build the solution.
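The token replacement in step 2 is a plain text substitution inside an assembly-qualified type name. The Python sketch below illustrates the shape of that edit under stated assumptions: the helper name is hypothetical, and the type name, assembly version, and token values in the example string are placeholders, not the values in the sample's source file. It relies only on the fact that a public key token is written as 16 hexadecimal characters.

```python
import re


def replace_public_key_token(ui_type_name: str, new_token: str) -> str:
    """Return ui_type_name with its PublicKeyToken value replaced by new_token."""
    if not re.fullmatch(r"[0-9a-fA-F]{16}", new_token):
        raise ValueError("a public key token is 16 hexadecimal characters")
    # A function is used as the replacement so that a token starting with a
    # digit cannot be misread as part of a group reference such as \10.
    updated, count = re.subn(
        r"(PublicKeyToken\s*=\s*)[0-9a-fA-F]{16}",
        lambda match: match.group(1) + new_token,
        ui_type_name,
    )
    if count != 1:
        raise ValueError("expected exactly one PublicKeyToken entry")
    return updated


# Hypothetical UITypeName value; the real type, assembly, and version may differ.
old_value = (
    "Example.Namespace.RemoveDuplicatesUI, RemoveDuplicatesWithUICS, "
    "Version=1.0.0.0, Culture=neutral, PublicKeyToken=0123456789abcdef"
)
print(replace_public_key_token(old_value, "fedcba9876543210"))
```

Only the 16 characters after PublicKeyToken= change; the rest of the UITypeName string stays as it is in the source file.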

Installing the Sample

This sample is provided in C# only. After successfully building the component, follow these steps in order to add it to a Data Flow task in Business Intelligence Development Studio.

To copy the component to the PipelineComponents folder

  1. Open Windows Explorer or your preferred application for working in the file system.

  2. Copy the assembly (RemoveDuplicatesWithUICS.dll) to the PipelineComponents folder, located by default at <system_drive>:\Program Files\Microsoft SQL Server\90\DTS\PipelineComponents.

To install the component into the global assembly cache (GAC) by dragging the assembly

  1. Open Windows Explorer or your preferred application for working in the file system.

  2. Drag the assembly from the PipelineComponents folder to the global assembly cache (GAC) folder, located at %windir%\assembly.

To install the component into the global assembly cache (GAC) by using gacutil.exe

  1. Open a Command Prompt window.

  2. Type the following command to run gacutil.exe and install the C# version of the component into the GAC:

    gacutil.exe -iF "c:\Program Files\Microsoft SQL Server\90\DTS\PipelineComponents\RemoveDuplicatesWithUICS.dll"

To add the component to the Toolbox

  1. Open Business Intelligence Development Studio.

  2. Right-click the Toolbox, and then click Choose Items.

  3. In the Choose Toolbox Items dialog box, click the SSIS Data Flow Items tab.

  4. Click the check box next to your component, and then click OK.

    Note

    If the component is not displayed in the list, you can click Browse to locate the component yourself. However, a component that is missing from the list might not be installed correctly.

After you finish these steps, the component is visible in the Data Flow Items tab of the Toolbox, and can be added to the Data Flow task in SSIS Designer.

Testing the Sample

After the component is added to a Data Flow task in a package and connected to a component that will provide rows to it, you can configure it as follows in SSIS Designer.

To configure the sample component in a package

  1. Select the columns to be used by the component in the component's custom editor. Only the selected columns are passed to the next component in the data flow. The contents of each column are evaluated to determine whether a row matches other rows.
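The effect of the column selection can be illustrated with a small Python sketch. It is a simplified model, not the sample's implementation: the function name and the row data are hypothetical, matching is assumed to be exact equality of the selected values, and a set of seen keys stands in for the sort that the sample performs. Rows are compared only on the selected columns, and only those columns appear in the output rows.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Row = Dict[str, Hashable]


def split_by_selected_columns(
    rows: Sequence[Row], selected: Sequence[str]
) -> Tuple[List[Row], List[Row]]:
    """Split rows into (distinct, duplicates) using only the selected columns."""
    seen = set()
    distinct: List[Row] = []
    duplicates: List[Row] = []
    for row in rows:
        # The match key is built from the selected columns only.
        key = tuple(row[name] for name in selected)
        # Unselected columns are dropped from the row that is passed on.
        projected = {name: row[name] for name in selected}
        if key in seen:
            duplicates.append(projected)
        else:
            seen.add(key)
            distinct.append(projected)
    return distinct, duplicates


rows = [
    {"id": 1, "city": "Oslo", "note": "x"},
    {"id": 2, "city": "Oslo", "note": "y"},
    {"id": 1, "city": "Oslo", "note": "z"},
]
distinct, duplicates = split_by_selected_columns(rows, ["id", "city"])
print(distinct)    # [{'id': 1, 'city': 'Oslo'}, {'id': 2, 'city': 'Oslo'}]
print(duplicates)  # [{'id': 1, 'city': 'Oslo'}]
```

Selecting fewer columns makes more rows match: with only city selected, the second and third rows above would both count as duplicates of the first.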