Data Flow Taps

You can add a data flow tap to a path that connects two data flow components in an SSIS package (data flow path) and direct the output from the data flow tap to an external file at runtime. This feature allows you to inspect data in the following stages of ETL (Extract-Transform-Load) processing:

  • After the data is extracted from a data source.

  • Before and after the data is transformed.

  • Before the data is loaded into its destination.

Note

You must deploy your SSIS packages by using the project deployment model (not the package deployment model) to take advantage of this feature.

Data flow taps are similar to data viewers in that both allow users (administrators/developers) to view data flowing through data flow paths at runtime. Developers can attach data viewers to data flow paths and view the data flowing through the paths while debugging in SQL Server Data Tools development environment. Data viewers work only in the development environment. Data flow taps allow administrators to add data flow taps to data flow paths to view data flowing through the paths in a testing/production environment.

For example, you can inspect data that the Flat File Source component passes to the OLE DB Destination component in the following data flow by adding a data flow tap to the data flow path between them.

Data Flow Path in an SSIS Package

The following sections describe how to add a data flow tap to a data flow path, list all data flow taps, and remove a data flow tap from a data flow path. You can also watch the Using Data Taps in SSIS video on TechNet Library to learn about adding/removing/listing data flow taps and best practices for using them.

Add a data flow tap

After you deploy your SSIS project using the project deployment model to the Integration Services (SSIS) Catalog, you use the catalog.add_data_tap (or) catalog.add_data_tap_by_guid stored procedures in the SSISDB database to add a data flow tap before starting an execution instance. At present, a data tap can only be added by executing these stored procedures; there is no user interface support for adding data flow taps. However, you can generate statements for creating and starting an execution instance by using the SQL Server Management Studio and then add a add_data_tap (or) add_data_tap_by_guid statement to the generated script. The following procedure provides the detailed steps to add a data tap.

  1. In SQL Server Management Studio, expand Integration Services Catalog, SSISDB, <folder>, Projects, <project>, and Packages. Right-click on <your package>, and click Execute.

    In the following example, ETL Folder is the name of the folder, ETL Project is the name of the project, and Package.dtsx is the name of the package.

    Executing a package using Management Studio

  2. On the Execute Package dialog box, click Script toolbar button to generate the script for creating and starting an execution instance for the selected package. The SQL script should be copied to a new Query Editor window.

    Execute Package - Script

  3. Click Cancel to close the Execute Package dialog box. You should see the generated SQL script in the new Query Editor window. The catalog.create_execution (SSISDB Database) stored procedure call in the script creates an execution instance of the selected package and the catalog.start_execution (SSISDB Database) stored procedure call starts the execution instance.

    You can remove any additional SQL statements from the generated script (such as set_execution_parameter_value). The script should look as shown below:

    Declare @exec_id bigint
    EXEC [SSISDB].[catalog].[ create_execution] @package_name=N'Package.dtsx', @execution_id=@exec_id OUTPUT, @folder_name=N'ETL Folder', @project_name=N'ETL Project', @use32bitruntime=False, @reference_id=Null
    EXEC [SSISDB].[catalog].[ start_execution] @exec_id
    GO
    
  4. In the Query Editor, add a data flow tap to the execution instance of the package by using either catalog.add_data_tap or catalog.add_data_tap_by_guid stored procedure before the start_execution call.

    EXEC [SSISDB].[catalog]. add_data_tap @execution_id = @exec_id, @task_package_path = '\Package\Data Flow Task', @dataflow_path_id_string = 'Paths[Flat File Source.Flat File Source Output]', @data_filename = 'output.txt'
    

    (OR)

    EXEC [SSISDB].[catalog]. add_data_tap_by_guid @execution_id = @exec_id, @dataflow_task_guid = '{6DAB81C6-722D-4DF5-BB47-82DBC201693F}', @dataflow_path_id_string = 'Paths[Flat File Source.Flat File Source Output]', @data_filename = 'output.txt'
    
    1. For the first parameter, execution_id, pass the value that was returned by the create_execution stored procedure.

    2. The second parameter for the add_data_tap stored procedure is task_package_path and for the add_data_tap_by_guid stored procedure is dataflow_task_guid.

      1. If you are using the add_data_tap stored procedure:

        In SQL Server Data Tools, in the control flow of the SSIS package, right-click the Data Flow Task, and click Properties to launch the Properties window. Note the value of the PackagePath property.

        Pass the value of the PackagePath property for the task_package_path parameter of the add_data_tap stored procedure.

        Package Path setting on an SSIS package

      2. If you are using the add_data_tap_by_guid stored procedure:

        In SQL Server Data Tools, in the control flow of the SSIS package, right-click the Data Flow Task, and click Properties to launch the Properties window (See the picture above for an example). Note the value of the ID property.

        Pass the value of ID property of the data flow task for the dataflow_task_guid parameter of the add_data_tap_by_guid stored procedure.

    3. Click the data flow path to which you want to add a data flow tap (arrow between tasks in the data flow) in the data flow of the SSIS package in SQL Server Data Tools, and note the value of the IdentificationString property in the Properties window

      Pass the value of the IdentificationString property for the dataflow_path_id_string parameter of the stored procedure.

      IdentificationString properties of a data flow pat

    4. Specify the name of the output file for the data_filename parameter. After you start the execution of the package in the next step, you can watch for files being generated in <Program Files>\Microsoft SQL Server\110\DTS\DataDumps folder.

Here is the complete SQL script that you should run against the SSISDB database. This example uses the add_data_tap stored procedure to add a data tap:

Declare @exec_id bigint
EXEC [SSISDB].[catalog].[create_execution] @package_name=N'Package.dtsx', @execution_id=@exec_id OUTPUT, @folder_name=N'ETL Folder', @project_name=N'ETL Project', @use32bitruntime=False, @reference_id=Null
EXEC [SSISDB].[catalog].add_data_tap @execution_id = @exec_id, @task_package_path = '\Package\Data Flow Task', @dataflow_path_id_string = 'Paths[Flat File Source.Flat File Source Output]', @data_filename = 'output.txt'
EXEC [SSISDB].[catalog].[start_execution] @exec_id
GO

When you execute the script, the SSIS package runs, and the data flowing through the data flow path to which data flow tap has been added is saved to the output.txt file in <Program Files>\Microsoft SQL Server\110\DTS\DataDumps. If a file with the name already exists, a new file with a suffix (for example: output[1].txt) is created.

Note

The data flow tap you add is specific to the execution instance whose ID is passed as an argument to the add_data_tap or add_data_tap_by_guid stored procedure.

Warning

Enabling verbose logging level and adding data flow taps increase the I/O operations performed by your data integration solution. Hence, we recommend that you add data flow taps only for troubleshooting purposes.

Remove a data flow tap

You can remove a data flow tap before starting the execution by using the catalog.remove_data_tap stored procedure. This stored procedure takes the ID of data flow tap as a parameter, which you can get as an output of the add_data_tap stored procedure.

DECLARE @tap_id bigint
EXEC [SSISDB].[catalog].add_data_tap @execution_id = @exec_id, @task_package_path = '\Package\Data Flow Task', @dataflow_path_id_string = 'Paths[Flat File Source.Flat File Source Output]', @data_filename = 'output.txt' @data_tap_id=@tap_id OUTPUT
EXEC [SSISDB].[catalog].remove_data_tap @tap_id

List all data flow taps

You can also list all the data flow taps by using the catalog.execution_data_taps view. The following example extracts data flow taps for a specification execution instance (ID: 54).

select * from [SSISDB].[catalog].execution_data_taps where execution_id=@exec_id

Results from listing data taps

Debugging Data Flow

Troubleshooting Tools for Package Execution

Integration Services icon (small) Stay Up to Date with Integration Services

For the latest downloads, articles, samples, and videos from Microsoft, as well as selected solutions from the community, visit the Integration Services page on MSDN:


For automatic notification of these updates, subscribe to the RSS feeds available on the page.