Working with Excel Files with the Script Task

New: 14 April 2006

SQL Server 2005 Integration Services (SSIS) provides the Excel connection manager, Excel source, and Excel destination for working with data stored in spreadsheets in the Microsoft Excel file format. The techniques described in this topic use the Script task to obtain information about available Excel databases (workbook files) and tables (worksheets and named ranges). These samples can easily be modified to work with any of the other file-based data sources supported by the Microsoft Jet OLE DB Provider.

Configuring a Package to Test the Samples

Example1: Check Whether an Excel File Exists

Example 2: Check Whether an Excel Table Exists

Example 3: Get a List of Excel Files in a Folder

Example 4: Get a List of Tables in an Excel File

Displaying the Results of the Samples

Note

If you want to create a task that you can more easily reuse across multiple packages, consider using the code in this Script task sample as the starting point for a custom task. For more information, see Developing a Custom Task.

Configuring a Package to Test the Samples

You can configure a single package to test all the samples in this topic. The samples use many of the same package variables and the same .NET Framework classes.

To configure a package for use with the examples in this topic

  1. Create a new Integration Services project in Business Intelligence Development Studio and open the default package for editing.

  2. Variables. Open the Variables window and define the following variables:

    • ExcelFile, of type String. Enter the complete path and filename to an existing Excel workbook.
    • ExcelTable, of type String. Enter the name of an existing worksheet or named range in the workbook named in the value of the ExcelFile variable. This value is case-sensitive.
    • ExcelFileExists, of type Boolean.
    • ExcelTableExists, of type Boolean.
    • ExcelFolder, of type String. Enter the complete path of a folder that contains at least one Excel workbook.
    • ExcelFiles, of type Object.
    • ExcelTables, of type Object.
  3. Imports statements. Most of the code samples require you to import one or both of the following .NET Framework namespaces at the top of your script file:

    • System.IO, for file system operations.
    • System.Data.OleDb, to open Excel files as data sources.
  4. References. The code samples that read schema information from Excel files require an additional reference in the script project to the System.Xml namespace.

Example 1 Description: Check Whether an Excel File Exists

This example determines whether the Excel workbook file specified in the ExcelFile variable exists, and then sets the Boolean value of the ExcelFileExists variable to the result. You can use this Boolean value for branching in the workflow of the package.

To configure this Script Task example

  1. Add a new Script task to the package and change its name to ExcelFileExists.

  2. Open the Script Task Editor.

  3. On the Script tab, add ExcelFile to the ReadOnlyVariables list, and add ExcelFileExists to the ReadWriteVariables list.

  4. Click Design Script to open the script editor.

  5. Add an Imports statement for the System.IO namespace at the top of the script file.

  6. Replace all the code in Sub Main with the following code.

Example 1 Code

    Dim fileToTest As String

    fileToTest = Dts.Variables("ExcelFile").Value.ToString
    If File.Exists(fileToTest) Then
      Dts.Variables("ExcelFileExists").Value = True
    Else
      Dts.Variables("ExcelFileExists").Value = False
    End If

    Dts.TaskResult = Dts.Results.Success

Example 2 Description: Check Whether an Excel Table Exists

This example determines whether the Excel worksheet or named range specified in the ExcelTable variable exists in the Excel workbook file specified in the ExcelFile variable, and then sets the Boolean value of the ExcelTableExists variable to the result. You can use this Boolean value for branching in the workflow of the package.

To configure this Script Task example

  1. Add a new Script task to the package and change its name to ExcelTableExists.

  2. Open the Script Task Editor.

  3. On the Script tab, add ExcelTable and ExcelFile to the ReadOnlyVariables list, and add ExcelTableExists to the ReadWriteVariables list.

  4. Click Design Script to open the script editor.

  5. Add a reference to the System.Xml assembly in the script project.

  6. Add Imports statements for the System.IO and System.Data.OleDb namespaces at the top of the script file.

  7. Replace all the code in Sub Main with the following code.

Example 2 Code

    Dim fileToTest As String
    Dim tableToTest As String
    Dim connectionString As String
    Dim excelConnection As OleDbConnection
    Dim excelTables As DataTable
    Dim excelTable As DataRow
    Dim currentTable As String

    fileToTest = Dts.Variables("ExcelFile").Value.ToString
    tableToTest = Dts.Variables("ExcelTable").Value.ToString

    Dts.Variables("ExcelTableExists").Value = False
    If File.Exists(fileToTest) Then
      connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
        "Data Source=" & fileToTest & _
        ";Extended Properties=Excel 8.0"
      excelConnection = New OleDbConnection(connectionString)
      excelConnection.Open()
      excelTables = excelConnection.GetSchema("Tables")
      For Each excelTable In excelTables.Rows
        currentTable = excelTable.Item("TABLE_NAME").ToString
        If currentTable = tableToTest Then
          Dts.Variables("ExcelTableExists").Value = True
        End If
      Next
    End If

    Dts.TaskResult = Dts.Results.Success

Example 3 Description: Get a List of Excel Files in a Folder

This example fills an array with the list of Excel files found in the folder specified in the value of the ExcelFolder variable, and then copies the array into the ExcelFiles variable. You can use the Foreach from Variable enumerator to iterate over the files in the array.

To configure this Script Task example

  1. Add a new Script task to the package and change its name to GetExcelFiles.

  2. Open the Script Task Editor.

  3. On the Script tab, add ExcelFolder to the ReadOnlyVariables list, and add ExcelFiles to the ReadWriteVariables list.

  4. Click Design Script to open the script editor.

  5. Add an Imports statement for the System.IO namespace at the top of the script file.

  6. Replace all the code in Sub Main with the following code.

Example 3 Code

    Const FILE_PATTERN As String = "*.xls"

    Dim excelFolder As String
    Dim excelFiles As String()

    excelFolder = Dts.Variables("ExcelFolder").Value.ToString
    excelFiles = Directory.GetFiles(excelFolder, FILE_PATTERN)

    Dts.Variables("ExcelFiles").Value = excelFiles

    Dts.TaskResult = Dts.Results.Success

Alternate Solution

Instead of using a Script task to gather a list of Excel files into an array, you can also use the ForEach File enumerator to iterate over all the Excel files in a folder. For more information, see How to: Loop through Excel Files and Tables.

Example 4 Description: Get a List of Tables in an Excel File

This example fills an array with the list of worksheets and named ranges found in the Excel workbook file specified by the value of the ExcelFile variable, and then copies the array into the ExcelTables variable. You can use the Foreach from Variable Enumerator to iterate over the tables in the array.

Note

The list of tables in an Excel workbook includes both worksheets (which have the $ suffix) and named ranges. If you have to filter the list for only worksheets or only named ranges, you may have to add additional code for this purpose.

To configure this Script Task example

  1. Add a new Script task to the package and change its name to GetExcelTables.

  2. Open the Script Task Editor.

  3. On the Script tab, add ExcelFile to the ReadOnlyVariables list, and add ExcelTables to the ReadWriteVariables list.

  4. Click Design Script to open the script editor.

  5. Add a reference to the System.Xml namespace in the script project.

  6. Add an Imports statement for the System.Data.OleDb namespace at the top of the script file.

  7. Replace all the code in Sub Main with the following code.

Example 4 Code

    Dim excelFile As String
    Dim connectionString As String
    Dim excelConnection As OleDbConnection
    Dim tablesInFile As DataTable
    Dim tableCount As Integer = 0
    Dim tableInFile As DataRow
    Dim currentTable As String
    Dim tableIndex As Integer = 0

    Dim excelTables As String()

    excelFile = Dts.Variables("ExcelFile").Value.ToString
    connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
        "Data Source=" & excelFile & _
        ";Extended Properties=Excel 8.0"
    excelConnection = New OleDbConnection(connectionString)
    excelConnection.Open()
    tablesInFile = excelConnection.GetSchema("Tables")
    tableCount = tablesInFile.Rows.Count
    ReDim excelTables(tableCount - 1)
    For Each tableInFile In tablesInFile.Rows
      currentTable = tableInFile.Item("TABLE_NAME").ToString
      excelTables(tableIndex) = currentTable
      tableIndex += 1
    Next

    Dts.Variables("ExcelTables").Value = excelTables

    Dts.TaskResult = Dts.Results.Success

Alternate Solution

Instead of using a Script task to gather a list of Excel tables into an array, you can also use the ForEach ADO.NET Schema Rowset Enumerator to iterate over all the tables (that is, worksheets and named ranges) in an Excel workbook file. For more information, see How to: Loop through Excel Files and Tables.

Displaying the Results of the Samples

If you have configured each of the examples in this topic in the same package, you can connect all the Script tasks to an additional Script task that displays the output of all the examples.

To configure a Script task to display the output of the examples in this topic

  1. Add a new Script task to the package and change its name to DisplayResults.

  2. Connect each of the four example Script tasks to one another, so that each task runs after the preceding task completes successfully, and connect the fourth example task to the DisplayResults task.

  3. Open the DisplayResults task in the Script Task Editor.

  4. On the Script tab, add all seven variables listed above in " Configuring a Package to Test the Samples" to the ReadOnlyVariables list.

  5. Click Design Script to open the script editor.

  6. Add Imports statements for the Microsoft.VisualBasic and System.Windows.Forms namespaces at the top of the script file.

  7. Replace all the code in Sub Main with the following code.

  8. Run the package and examine the results displayed in a message box.

Code to Display the Results

    Const EOL As String = ControlChars.CrLf

    Dim results As String
    Dim filesInFolder As String()
    Dim fileInFolder As String
    Dim tablesInFile As String()
    Dim tableInFile As String

    results = _
      "Final values of variables:" & EOL & _
      "ExcelFile: " & Dts.Variables("ExcelFile").Value.ToString & EOL & _
      "ExcelFileExists: " & Dts.Variables("ExcelFileExists").Value.ToString & EOL & _
      "ExcelTable: " & Dts.Variables("ExcelTable").Value.ToString & EOL & _
      "ExcelTableExists: " & Dts.Variables("ExcelTableExists").Value.ToString & EOL & _
      "ExcelFolder: " & Dts.Variables("ExcelFolder").Value.ToString & EOL & _
      EOL

    results &= "Excel files in folder: " & EOL
    filesInFolder = DirectCast(Dts.Variables("ExcelFiles").Value, String())
    For Each fileInFolder In filesInFolder
      results &= " " & fileInFolder & EOL
    Next
    results &= EOL

    results &= "Excel tables in file: " & EOL
    tablesInFile = DirectCast(Dts.Variables("ExcelTables").Value, String())
    For Each tableInFile In tablesInFile
      results &= " " & tableInFile & EOL
    Next

    MessageBox.Show(results, "Results", MessageBoxButtons.OK, MessageBoxIcon.Information)

    Dts.TaskResult = Dts.Results.Success

See Also

Other Resources

Excel Connection Manager
How to: Loop through Excel Files and Tables

Help and Information

Getting SQL Server 2005 Assistance