Export (0) Print
Expand All
Creating Audit Tables, Invoking COM Objects, and More
Exception-handling Techniques
Exploring SQL Server Triggers: Part 2
Deliver User-Friendly Reports from Your Application with SQL Server Reporting Services
SQL Server: Display Your Data Your Way with Custom Renderers for Reporting Services
Updating Data in Linked Servers, Information Schema Views, and More
XML Features in SQL Server 2000
Expand Minimize

Data Transformation Services (DTS) in SQL Server 2000

By Diane Larsen

Euan Garden, contributor

September 2000

Summary: Database administrators often import, export, and transform data in support of tasks such as data consolidation, archiving, and analysis; for application development purposes; and for database or server upgrades. Data Transformation Services (DTS) in SQL Server 2000 provides a set of graphical tools and programmable objects to help administrators and developers solve data movement problems, including the extraction, transformation, and consolidation of data from disparate sources to single or multiple destinations. Sets of tasks, workflow operations, and constraints can be collected as DTS packages that can be scheduled to run periodically or when certain events occur. This white paper introduces DTS, shows some of the components and services that can be used to create DTS solutions, illustrates the use of DTS Designer to implement DTS solutions, and introduces DTS application development.

On This Page

Introduction to DTS
Using DTS Designer
Options for Saving DTS Packages
DTS as an Application Development Platform

Introduction to DTS

Most organizations have multiple formats and locations in which data is stored. To support decision-making, improve system performance, or upgrade existing systems, data often must be moved from one data storage location to another.

Microsoft® SQL Server™ 2000 Data Transformation Services (DTS) provides a set of tools that lets you extract, transform, and consolidate data from disparate sources into single or multiple destinations. By using DTS tools, you can create custom data movement solutions tailored to the specialized needs of your organization, as shown in the following scenarios:

  • You have deployed a database application on an older version of SQL Server or another platform, such as Microsoft Access. A new version of your application requires SQL Server 2000, and requires you to change your database schema and convert some data types.

  • To copy and transform your data, you can build a DTS solution that copies database objects from the original data source into a SQL Server 2000 database, while at the same time remapping columns and changing data types. You can run this solution using DTS tools, or you can embed the solution within your application.

  • You must consolidate several key Microsoft Excel spreadsheets into a SQL Server database. Several departments create the spreadsheets at the end of the month, but there is no set schedule for completion of all the spreadsheets.

  • To consolidate the spreadsheet data, you can build a DTS solution that runs when a message is sent to a message queue. The message triggers DTS to extract data from the spreadsheet, perform any defined transformations, and load the data into a SQL Server database.

  • Your data warehouse contains historical data about your business operations, and you use Microsoft SQL Server 2000 Analysis Services to summarize the data. Your data warehouse needs to be updated nightly from your Online Transaction Processing (OLTP) database. Your OLTP system is in-use 24-hours a day, and performance is critical.

    You can build a DTS solution that uses the file transfer protocol (FTP) to move data files onto a local drive, loads the data into a fact table, and aggregates the data using Analysis Services. You can schedule the DTS solution to run every night, and you can use the new DTS logging options to track how long this process takes, allowing you to analyze performance over time.

What Is DTS?

DTS is a set of tools you can use to import, export, and transform heterogeneous data between one or more data sources, such as Microsoft SQL Server, Microsoft Excel, or Microsoft Access. Connectivity is provided through OLE DB, an open-standard for data access. ODBC (Open Database Connectivity) data sources are supported through the OLE DB Provider for ODBC.

You create a DTS solution as one or more packages. Each package may contain an organized set of tasks that define work to be performed, transformations on data and objects, workflow constraints that define task execution, and connections to data sources and destinations. DTS packages also provide services, such as logging package execution details, controlling transactions, and handling global variables.

These tools are available for creating and executing DTS packages:

  • The Import/Export Wizard is for building relatively simple DTS packages, and supports data migration and simple transformations.

  • The DTS Designer graphically implements the DTS object model, allowing you to create DTS packages with a wide range of functionality.

  • DTSRun is a command-prompt utility used to execute existing DTS packages.

  • DTSRunUI is a graphical interface to DTSRun, which also allows the passing of global variables and the generation of command lines.

  • SQLAgent is not a DTS application; however, it is used by DTS to schedule package execution.

Using the DTS object model, you also can create and run packages programmatically, build custom tasks, and build custom transformations.

What's New in DTS?

Microsoft SQL Server 2000 introduces several DTS enhancements and new features:

  • New DTS tasks include the FTP task, the Execute Package task, the Dynamic Properties task, and the Message Queue task.

  • Enhanced logging saves information for each package execution, allowing you to maintain a complete execution history and view information for each process within a task. You can generate exception files, which contain rows of data that could not be processed due to errors.

  • You can save DTS packages as Microsoft Visual Basic® files.

  • A new multiphase data pump allows advanced users to customize the operation of data transformations at various stages. Also, you can use global variables as input parameters for queries.

  • You can use parameterized source queries in DTS transformation tasks and the Execute SQL task.

  • You can use the Execute Package task to dynamically assign the values of global variables from a parent package to a child package.

Using DTS Designer

DTS Designer graphically implements the DTS object model, allowing you to graphically create DTS packages. You can use DTS Designer to:

  • Create a simple package containing one or more steps.

  • Create a package that includes complex workflows that include multiple steps using conditional logic, event-driven code, or multiple connections to data sources.

  • Edit an existing package.

The DTS Designer interface consists of a work area for building packages, toolbars containing package elements that you can drag onto the design sheet, and menus containing workflows and package management commands.

Cc917688.dtssql01(en-us,TechNet.10).gif

Figure 1: DTS Designer interface

By dragging connections and tasks onto the design sheet, and specifying the order of execution with workflows, you can easily build powerful DTS packages using DTS Designer. The following sections define tasks, workflows, connections, and transformations, and illustrate the ease of using DTS Designer to implement a DTS solution.

Tasks: Defining Steps in a Package

A DTS package usually includes one or more tasks. Each task defines a work item that may be performed during package execution. You can use tasks to:

  • Transform data

     

    dtssql02

    Transform Data task

    Use to move data between a source and destination and to optionally apply column-level transformations to the data.

     

    dtssql03

    Data Driven Query task

    Use to perform flexible, Transact-SQL–based operations on data, including stored procedures and INSERT, UPDATE, or DELETE statements.

     

    Parallel Data Pump task(1)

    Available programmatically only, the Parallel Data Pump task performs the same functions as the Transform Data and Data Driven Query tasks, but supports chaptered rowsets as defined by OLE DB 2.5 and later.

  • Copy and manage data

     

    dtssql04

    Bulk Insert task

    Use to quickly load large amounts of data into a SQL Server table or view.

     

    dtssql05

    Execute SQL task

    Use to run SQL statements during package execution. The Execute SQL task also can save data that is the result of a query.

     

    dtssql06

    Copy SQL Server Objects task

    Use to copy SQL Server objects from one installation or instance of SQL Server to another. You can copy objects such as data and tables, as well as the definitions of objects such as views and stored procedures.

     

    dtssql07

    Transfer Database task(1)

    Use to move or copy a SQL Server database from an instance of SQL Server version 7.0 or SQL Server 2000 to an instance of SQL Server 2000.

     

    dtssql08

    Transfer Error Messages task(1)

    Use to copy user-specified error messages, created by the sp_addmessage system stored procedure, from an instance of SQL Server 7.0 or SQL Server 2000 to an instance of SQL Server 2000.

     

    dtssql09

    Transfer Logins task(1)

    Use to copy logins from an instance of SQL Server 7.0 or SQL Server 2000 to an instance of SQL Server 2000.

     

    dtssql10

    Transfer Jobs task(1)

    Use to copy jobs from an instance of SQL Server 7.0 or SQL Server 2000 to an instance of SQL Server 2000.

     

    dtssql11

    Transfer Master Stored Procedures task(1)

    Use to copy stored procedures from a master database on an instance of SQL Server 7.0 or SQL Server 2000 to the master database on an instance of SQL Server 2000.

  • Run tasks as jobs from within a package

     

    dtssql12

    ActiveX Script task

    Use to write code to perform functions that are not available in the other DTS tasks.

     

    dtssql13

    Dynamic Properties task(1)

    Use to retrieve values from sources outside a DTS package at package run time and assign those values to selected package properties.

     

    dtssql14

    Execute Package task(1)

    Use to run other DTS packages from within a package.

     

    dtssql15

    Execute Process task

    Use to run an executable program or batch file.

     

    dtssql16

    File Transfer Protocol (FTP) task(1)

    Use to download data files from a remote server or an Internet location.

     

    dtssql17

    Message Queue task(1)

    Use to send and receive messages from Microsoft Message Queues.

     

    dtssql18

    Send Mail task

    Use to send an e-mail message.

     

    dtssql19

    Analysis Services Processing task(2)

    Use to perform processing of one or more objects defined in SQL Server 2000 Analysis Services.

     

    dtssql20

    Data Mining task(1,2)

    Use to create a prediction query and an output table from a data mining model object defined in SQL Server 2000 Analysis Services.

1 New in SQL Server 2000.

2 Available only when SQL Server 2000 Analysis Services is installed.

You also can create custom tasks programmatically, and then integrate them into DTS Designer using the Register Custom Task command.

To illustrate the use of tasks, here is a simple DTS Package with two tasks: a Microsoft ActiveX® Script task and a Send Mail task:

Figure 2: DTS Package with two tasks

Figure 2: DTS Package with two tasks

The ActiveX Script task can host any ActiveX Scripting engine including Microsoft Visual Basic Scripting Edition (VBScript), Microsoft JScript®, or ActiveState ActivePerl, which you can download from http://www.activestate.com . The Send Mail task may send a message indicating that the package has run. Note that there is no order to these tasks yet. When the package executes, the ActiveX Script task and the Send Mail task run concurrently.

Workflows: Setting Task Precedence

When you define a group of tasks, there is usually an order in which the tasks should be performed. When tasks have an order, each task becomes a step of a process. In DTS Designer, you manipulate tasks on the DTS Designer design sheet and use precedence constraints to control the sequence in which the tasks execute.

Precedence constraints sequentially link tasks in a package. The following table shows the types of precedence constraints you can use in DTS.

Precedence constraint

Description

dtssql22
On Completion
(blue arrow)

If you want Task 2 to wait until Task 1 completes, regardless of the outcome, link Task 1 to Task 2 with an On Completion precedence constraint.

dtssql23
On Success
(green arrow)

If you want Task 2 to wait until Task 1 has successfully completed, link Task 1 to Task 2 with an On Success precedence constraint.

dtssql24
On Failure
(red arrow)

If you want Task 2 to begin execution only if Task 1 fails to execute successfully, link Task 1 to Task 2 with an On Failure precedence constraint.

The following illustration shows the ActiveX Script task and the Send Mail task with an On Completion precedence constraint. When the Active X Script task completes, with either success or failure, the Send Mail task runs.

Figure 3: ActiveX Script task and the Send Mail task with an On Completion precedence constraint

Figure 3: ActiveX Script task and the Send Mail task with an On Completion precedence constraint

You can configure separate Send Mail tasks, one for an On Success constraint and one for an On Failure constraint. The two Send Mail tasks can send different messages based on the success or failure of the ActiveX script.

Figure 4: Mail tasks

Figure 4: Mail tasks

You also can issue multiple precedence constraints on a task. For example, the Send Mail task "Admin Notification" could have both an On Success constraint from Script #1 and an On Failure constraint from Script #2. In these situations, DTS assumes a logical "AND" relationship. Therefore, Script #1 must successfully execute and Script #2 must fail for the Admin Notification message to be sent.

Figure 5: Example of multiple precedence constraints on a task

Figure 5: Example of multiple precedence constraints on a task

Connections: Accessing and Moving Data

To successfully execute DTS tasks that copy and transform data, a DTS package must establish valid connections to its source and destination data and to any additional data sources, such as lookup tables.

When creating a package, you configure connections by selecting a connection type from a list of available OLE DB providers and ODBC drivers. The types of connections that are available are:

  • Microsoft Data Access Components (MDAC) drivers

     

    dtssql28

    Microsoft OLE DB Provider for SQL Server

     

    dtssql29

    Microsoft Data Link

     

    dtssql30

    Microsoft ODBC Driver for Oracle

  • Microsoft Jet drivers

     

    dtssql31

    dBase 5

     

    dtssql32

    Microsoft Access

     

    dtssql33

    HTML File (Source)

     

    dtssql34

    Microsoft Excel 97-2000

     

    dtssql35

    Paradox 5.X

  • Other drivers

     

    dtssql36

    Text File (Source)

     

    dtssql37

    Text File (Destination)

     

    dtssql38

    Other Connection

DTS allows you to use any OLE DB connection. The icons on the Connections toolbar provide easy access to common connections.

The following illustration shows a package with two connections. Data is being copied from an Access database (the source connection) into a SQL Server production database (the destination connection).

Cc917688.dtssql39(en-us,TechNet.10).gif

Figure 6: Example of a package with two connections

The first step in this package is an Execute SQL task, which checks to see if the destination table already exists. If so, the table is dropped and re-created. On the success of the Execute SQL task, data is copied to the SQL Server database in Step 2. If the copy operation fails, an e-mail is sent in Step 3.

The Data Pump: Transforming Data

The DTS data pump is a DTS object that drives the import, export, and transformation of data. The data pump is used during the execution of the Transform Data, Data Driven Query, and Parallel Data Pump tasks. These tasks work by creating rowsets on the source and destination connections, then creating an instance of the data pump to move rows between the source and destination. Transformations occur on each row as the row is copied.

In the following illustration, a Transform Data task is used between the Access DB task and the SQL Production DB task in Step 2. The Transform Data task is the gray arrow between the connections.

Cc917688.dtssql40(en-us,TechNet.10).gif

Figure 7: Example of a Transform Data task

To define the data gathered from the source connection, you can build a query for the transformation tasks. DTS supports parameterized queries, which allow you to define query values when the query is executed.

You can type a query into the task's Properties dialog box, or use the Data Transformation Services Query Designer, a tool for graphically building queries for DTS tasks. In the following illustration, the Query Designer is used to build a query that joins three tables in the pubs database.

Cc917688.dtssql41(en-us,TechNet.10).gif

Figure 8: Data Transformation Services Query Designer interface

In the transformation tasks, you also define any changes to be made to data. The following table describes the built-in transformations that DTS provides.

Transformation

Description

Copy Column

Use to copy data directly from source to destination columns, without any transformations applied to the data.

ActiveX Script

Use to build custom transformations. Note that since the transformation occurs on a row-by-row basis, an ActiveX script can affect the execution speed of a DTS package.

DateTime String

Use to convert a date or time in a source column to a different format in the destination column.

Lowercase String

Use to convert a source column to lowercase characters and, if necessary, to the destination data type.

Uppercase String

Use to convert a source column to all uppercase characters and, if necessary, to the destination data type.

Middle of String

Use to extract a substring from the source column, transform it, and copy the result to the destination column.

Trim String

Use to remove leading, trailing, and embedded white space from a string in the source column and copy the result to the destination column.

Read File

Use to open the contents of a file, whose name is specified in a source column, and copy the contents into a destination column.

Write File

Use to copy the contents of a source column (data column) to a file whose path is specified by a second source column (file name column).

You can also create your own custom transformations programmatically. The quickest way to build custom transformations is to use the Active Template Library (ATL) custom transformation template, which is included in the SQL Server 2000 DTS sample programs.

Data Pump Error Logging

A new method of logging transformation errors is available in SQL Server 2000. You can define three exception log files for use during package execution: an error text file, a source error rows file, and a destination error rows file.

  • General error information is written to the error text file.

  • If a transformation fails, then the source row is in error, and that row is written to the source error rows file.

  • If an insert fails, then the destination row is in error, and that row is written to the destination error rows file.

The exception log files are defined in the tasks that transform data. Each transformation task has its own log files.

Data pump phases

By default, the data pump has one phase: row transformation. That phase is what you configure when mapping column-level transformations in the Transform Data task, Data Driven Query task, and Parallel Data Pump task, without selecting a phase.

Multiple data pump phases are new in SQL Server 2000. By selecting the multiphase data pump option in SQL Server Enterprise Manager, you can access the data pump at several points during its operation and add functionality.

When copying a row of data from source to a destination, the data pump follows the basic process shown in the following illustration.

Cc917688.dtssql42(en-us,TechNet.10).gif

Figure 9: . Data pump process

After the data pump processes the last row of data, the task is finished and the data pump operation terminates.

Advanced users who want to add functionality to a package so that it supports any data pump phase can do so by:

  • Writing an ActiveX script phase function for each data pump phase to be customized. If you use ActiveX script functions to customize data pump phases, no additional code outside of the package is required.

  • Creating a COM object in Microsoft Visual C++® to customize selected data pump phases. You develop this program external to the package, and the program is called for each selected phase of the transformation. Unlike the ActiveX script method of accessing data pump phases, which uses a different function and entry point for each selected phase, this method provides a single entry point that is called by multiple data pump phases, while the data pump task executes.

Options for Saving DTS Packages

These options are available for saving DTS packages:

  • Microsoft SQL Server

    Save your DTS package to Microsoft SQL Server if you want to store packages on any instance of SQL Server on your network, keep a convenient inventory of those packages, and add and delete package versions during the package development process.

  • SQL Server 2000 Meta Data Services

    Save your DTS package to Meta Data Services if you plan to track package version, meta data, and data lineage information.

  • Structured storage file

    Save your DTS package to a structured storage file if you want to copy, move, and send a package across the network without having to store the package in a Microsoft SQL Server database.

  • Microsoft Visual Basic

    Save your DTS package that has been created by DTS Designer or the DTS Import/Export Wizard to a Microsoft Visual Basic file if you want to incorporated it into Visual Basic programs or use it as a prototype for DTS application development.

DTS as an Application Development Platform

The DTS Designer provides a wide variety of solutions to data movement tasks. DTS extends the number of solutions available by providing programmatic access to the DTS object model. Using Microsoft Visual Basic, Microsoft Visual C++, or any other application development system that supports COM, you can develop a custom DTS solution for your environment using functionality unsupported in the graphical tools.

DTS offers support for the developer in several different ways:

  • Building packages

    You can develop extremely complex packages and access the full range of functionality in the object model, without the using the DTS Designer or DTS Import/Export Wizard.

  • Extending packages

    You can add new functionality through the construction of custom tasks and transforms, customized for your business and reusable within DTS.

  • Executing packages

    Execution of DTS packages does not have to be from any of the tools provided, it is possible to execute DTS packages programmatically and display progress through COM events, allowing the construction of embedded or custom DTS execution environments.

Sample DTS programs are available to help you get started with DTS programming. The samples can be installed with SQL Server 2000.

If you develop a DTS application, you can redistribute the DTS files. For more information, see Redist.txt on the SQL Server 2000 compact disc.

For More Information

Microsoft SQL Server 2000 Books Online contains more information about DTS, using the DTS applications, and building custom solutions. For additional information, see these resources:

Was this page helpful?
(1500 characters remaining)
Thank you for your feedback
Show:
© 2014 Microsoft