Creating a Source with the Script Component

You use a source component in the data flow of an Integration Services package to load data from a data source to pass on to downstream transformations and destinations. Ordinarily you connect to the data source through an existing connection manager.

For an overview of the Script component, see Extending the Data Flow with the Script Component.

The Script component and the infrastructure code that it generates for you simplify significantly the process of developing a custom data flow component. However, to understand how the Script component works, you may find it useful to read through the steps that are involved in developing a custom data flow component. See the section Developing a Custom Data Flow Component, especially the topic Developing a Custom Source Component.

Getting Started with a Source Component

When you add a Script component to the Data Flow pane of SSIS Designer, the Select Script Component Type dialog box opens and prompts you to select a Source, Destination, or Transformation script. In this dialog box, select Source.

Configuring a Source Component in Metadata-Design Mode

After choosing to create a source component, you configure the component by using the Script Transformation Editor. For more information, see Configuring the Script Component in the Script Component Editor.

A data flow source component has no inputs and supports one or more outputs. Configuring the outputs for the component is one of the steps that you must complete in metadata design mode, by using the Script Transformation Editor, before you write your custom script.

Adding Connection Managers

Ordinarily a source component uses an existing connection manager to connect to the data source from which it loads data into the data flow. On the Connection Managers page of the Script Transformation Editor, click Add to add the appropriate connection manager.

However, a connection manager is only a convenient unit that encapsulates and stores the information that it needs to connect to a data source of a particular type. You must write your own custom code to load or save your data, and possibly to open and close the connection to the data source also.

For general information about using connection managers with the Script component, see Connecting to Data Sources in the Script Component.

For more information about the Connection Managers page of the Script Transformation Editor, see Script Transformation Editor (Connection Managers Page).

Configuring Outputs and Output Columns

A source component has no inputs and supports one or more outputs. On the Inputs and Outputs page of the Script Transformation Editor, a single output has been created by default, but no output columns have been created. On this page of the editor, you may need or want to configure the following items.

  • You must add and configure output columns manually for each output. Select the Output Columns folder for each output, and then use the Add Column and Remove Column buttons to manage the output columns for each output of the source component. Later, you will refer to the output columns in your script by the names that you assign here, by using the typed accessor properties created for you in the auto-generated code.

  • You may want to create one or more additional outputs, such as a simulated error output for rows that contain unexpected values. Use the Add Output and Remove Output buttons to manage the outputs of the source component. All input rows are directed to all available outputs unless you also specify an identical non-zero value for the ExclusionGroup property of those outputs where you intend to direct each row to only one of the outputs that share the same ExclusionGroup value. The particular integer value selected to identify the ExclusionGroup is not significant.

    Note

    You can also use a non-zero ExclusionGroup property value with a single output when you do not want to output all rows. In this case, however, you must explicitly call the DirectRowTo<outputbuffer> method for each row that you want to send to the output.

  • You may want to assign a friendly name to the outputs. Later, you will refer to the outputs by their names in your script, by using the typed accessor properties created for you in the auto-generated code.

  • Ordinarily multiple outputs in the same ExclusionGroup have the same output columns. However, if you are creating a simulated error output, you may want to add more columns to store error information. For information about how the data flow engine processes error rows, see Using Error Outputs in a Data Flow Component. In the Script component, however, you must write your own code to fill the additional columns with appropriate error information. For more information, see Simulating an Error Output for the Script Component.

For more information about the Inputs and Outputs page of the Script Transformation Editor, see Script Transformation Editor (Inputs and Outputs Page).

Adding Variables

If there are any existing variables whose values you want to use in your script, you can add them in the ReadOnlyVariables and ReadWriteVariables property fields on the Script page of the Script Transformation Editor. To add multiple variables, separate the variable names by commas.

For general information about how to use variables with the Script component, see Using Variables in the Script Component.

For more information about the Script page of the Script Transformation Editor, see Script Transformation Editor (Script Page).

Scripting a Source Component in Code-Design Mode

After you configure all the metadata for your component, open the Visual Studio for Applications (VSA) IDE to code your custom script by using the Visual Basic .NET language. You can open the VSA IDE from the Script page of the Script Transformation Editor by clicking Design Script.

For important information that applies to all types of components created by using the Script component, see Coding and Debugging the Script Component.

Understanding the Auto-generated Code

When you open the VSA IDE after creating and configuring a source component, the editable ScriptMain class appears in the code editor with a stub for the CreateNewOutputRows method. The ScriptMain class is where you will write your custom code, and CreateNewOutputRows is the most important method in a source component.

If you open the VSA Project Explorer window, you can see that the Script component has also generated read-only BufferWrapper and ComponentWrapper project items. The ScriptMain class inherits from UserComponent class in the ComponentWrapper project item.

At run time, the data flow engine invokes the PrimeOutput method in the UserComponent class, which overrides the PrimeOutput method of the ScriptComponent parent class. The PrimeOutput method in turn calls the following methods:

  1. The CreateNewOutputRows method, which you override in ScriptMain to add rows from the data source to the output buffers, which are empty at first.
  2. The FinishOutputs method, which is empty by default. Override this method in ScriptMain to perform any processing required to complete the output.
  3. The private MarkOutputsAsFinished method, which calls the SetEndOfRowset method of the ScriptBuffer parent class to indicate to the data flow engine that the output is complete. You do not have to call SetEndOfRowset explicitly in your own code.

Writing Your Custom Code

To finish creating a custom source component, you may want to write script in the following methods available in the ScriptMain class.

  1. Override the AcquireConnections method to connect to the external data source. Extract the connection object, or the required connection information, from the connection manager.
  2. Override the PreExecute method to load data, if you can load all the source data at the same time. For example, you can execute a SqlCommand against an ADO.NET connection to a SQL Server database and load all the source data at the same time into a SqlDataReader. If you must load the source data one row at a time—for example, when reading a text file—you can load the data as you loop through rows in CreateNewOutputRows.
  3. Use the overridden CreateNewOutputRows method to add new rows to the empty output buffers and to fill in the values of each column in the new output rows. Use the AddRow method of each output buffer to add an empty new row, and then set the values of each column. Typically you copy values from the columns loaded from the external source.
  4. Override the PostExecute method to finish processing the data. For example, you can close the SqlDataReader that you used to load data.
  5. Override the ReleaseConnections method to disconnect from the external data source, if required.

Examples

The following examples demonstrate the custom code that is required in the ScriptMain class to create a source component.

Note

These examples use the Person.Address table in the AdventureWorks sample database and pass its first and fourth columns, the intAddressID and nvarchar(30)City columns, through the data flow. The same data is used in the source, transformation, and destination samples in this section. Additional prerequisites and assumptions are documented for each example.

ADO.NET Source Example

This example demonstrates a source component that uses an existing ADO.NET connection manager to load data from a SQL Server table into the data flow.

If you want to run this sample code, you must configure the package and the component as follows:

  1. Create an ADO.NET connection manager that uses the SqlClient provider to connect to the AdventureWorks database.

  2. Add a new Script component to the Data Flow designer surface and configure it as a source.

  3. Open the Script Transformation Editor. On the Inputs and Outputs page, rename the default output with a more descriptive name such as MyAddressOutput, and add and configure the two output columns, AddressID and City.

  4. On the Script page, click Design Script and enter the script that follows. Then close the script development environment.

  5. On the Connection Managers page, add or create the ADO.NET connection manager and give it a descriptive name such as MyADONETConnection. Then close the Script Transformation Editor.

  6. Create and configure a destination component, such as a SQL Server destination, or the sample destination component demonstrated in Creating a Destination with the Script Component, that expects the AddressID and City columns. Then connect the source component to the destination. (You can connect a source directly to a destination without any transformations.) You can create a destination table by running the following Transact-SQL command in the AdventureWorks database:

    CREATE TABLE [Person].[Address2](
        [AddressID] [int] NOT NULL,
        [City] [nvarchar](30) NOT NULL
    )
    
  7. Run the sample.

Imports System.Data.SqlClient
...
Public Class ScriptMain
    Inherits UserComponent

    Dim connMgr As IDTSConnectionManager90
    Dim sqlConn As SqlConnection
    Dim sqlReader As SqlDataReader

    Public Overrides Sub AcquireConnections(ByVal Transaction As Object)

        connMgr = Me.Connections.MyADONETConnection
        sqlConn = CType(connMgr.AcquireConnection(Nothing), SqlConnection)

    End Sub

    Public Overrides Sub PreExecute()

        Dim cmd As New SqlCommand("SELECT AddressID, City, StateProvinceID FROM Person.Address", sqlConn)
        sqlReader = cmd.ExecuteReader

    End Sub

    Public Overrides Sub CreateNewOutputRows()

        Do While sqlReader.Read
            With MyAddressOutputBuffer
                .AddRow()
                .AddressID = sqlReader.GetInt32(0)
                .City = sqlReader.GetString(1)
            End With
        Loop

    End Sub

    Public Overrides Sub PostExecute()

        sqlReader.Close()

    End Sub

    Public Overrides Sub ReleaseConnections()

        connMgr.ReleaseConnection(sqlConn)

    End Sub

End Class

Flat File Source Example

This example demonstrates a source component that uses an existing Flat File connection manager to load data from a flat file into the data flow. The flat file source data is created by exporting it from SQL Server.

If you want to run this sample code, you must configure the package and the component as follows:

  1. Use the SQL Server Import and Export Wizard to export the Person.Address table from the AdventureWorks sample database to a comma-delimited flat file. This sample uses the file name ExportedAddresses.txt.

  2. Create a Flat File connection manager that connects to the exported data file.

  3. Add a new Script component to the Data Flow designer surface and configure it as a source.

  4. Open the Script Transformation Editor. On the Inputs and Outputs page, rename the default output with a more descriptive name such as MyAddressOutput. Add and configure the two output columns, AddressID and City.

  5. On the Script page, click Design Script and enter the script that follows. Then close the script development environment.

  6. On the Connection Managers page, add or create the Flat File connection manager, using a descriptive name such as MyFlatFileSrcConnectionManager. Then close the Script Transformation Editor.

  7. Create and configure a destination component, such as a SQL Server destination, or the sample destination component demonstrated in Creating a Destination with the Script Component. Then connect the source component to the destination. (You can connect a source directly to a destination without any transformations.) You can create a destination table by running the following Transact-SQL command in the AdventureWorks database:

    CREATE TABLE [Person].[Address2](
        [AddressID] [int] NOT NULL,
        [City] [nvarchar](30) NOT NULL
    )
    
  8. Run the sample.

Imports System.IO
...
Public Class ScriptMain
    Inherits UserComponent

    Private textReader As StreamReader

    Public Overrides Sub AcquireConnections(ByVal Transaction As Object)

        Dim connMgr As IDTSConnectionManager90 = _
            Me.Connections.MyFlatFileSrcConnectionManager
        Dim exportedAddressFile As String = _
            CType(connMgr.AcquireConnection(Nothing), String)
        textReader = New StreamReader(exportedAddressFile)

    End Sub

    Public Overrides Sub CreateNewOutputRows()

        Dim nextLine As String
        Dim columns As String()

        Dim delimiters As Char()
        delimiters = ",".ToCharArray

        nextLine = textReader.ReadLine
        Do While nextLine IsNot Nothing
            columns = nextLine.Split(delimiters)
            With MyAddressOutputBuffer
                .AddRow()
                .AddressID = columns(0)
                .City = columns(3)
            End With
            nextLine = textReader.ReadLine
        Loop

    End Sub

    Public Overrides Sub ReleaseConnections()

        textReader.Close()

    End Sub

End Class

Change History

Release History

17 July 2006

Changed content:
  • Rearranged sequence of steps in examples and revised for easier understanding.

See Also

Concepts

Creating a Destination with the Script Component
Developing a Custom Source Component

Help and Information

Getting SQL Server 2005 Assistance