Integration Services Development Enhancements
Microsoft SQL Server 2005 Integration Services (SSIS) introduces new features and enhancements that increase the power and productivity of developers, administrators, and knowledge workers who develop data transformation solutions:
Graphical tools such as SSIS Designer and the SQL Server Import and Export Wizard.
The ability to create packages programmatically and to extend the Integration Services object model by using custom tasks, sources, destinations, and transformations.
The previous architecture of SQL Server Data Transformation Services (DTS) combined data transformation, tasks, and package control flow into a single component. This made the creation of complex packages difficult. In SQL Server 2005, the Integration Services architecture separates data flow from control flow by introducing two distinct engines—the Integration Services run-time engine and the Integration Services data flow engine. This separation provides better control of package execution, increases the visibility of data transformations, and enhances the extensibility of Integration Services by simplifying the creation and implementation of custom tasks and transformations.
For more information, see Integration Services Architecture.
Integration Services Run-Time Engine
The Integration Services run-time engine stores package layout, executes packages, controls workflow between tasks, and provides run-time services such as debugging, logging, event handling, and management of connections, variables, and transactions.
For more information, see Control Flow Elements.
Integration Services Data Flow Engine
The Integration Services data flow engine satisfies the needs of enterprises whose extraction, transformation, and loading (ETL) processes require fast, flexible, extensible, and dependable data movement. It is optimized for high-performance data movement and transformation.
The data flow engine supports multiple sources, multiple transformations, and multiple destinations in one fast, flexible data flow. Integration Services includes more than 25 transformations and more than 10 sources and destinations for use in data flows. The Data Flow task, which represents the Integration Services data flow engine graphically in SSIS Designer, replaces various DTS data-oriented tasks, such as the Data Transformation task and the Data Driven Query task.
For more information, see Data Flow Elements.
In addition to the separation of data flow and control flow, the entire underlying Integration Services object model has been reengineered with extensibility in mind. A powerful integrated development environment (IDE) is combined with support for the Microsoft .NET Framework to accelerate the creation of powerful custom Integration Services tasks, transformations, and data adapters.
The custom extensions that can be developed include tasks, log providers, enumerators, connection managers, and data flow components. These custom objects can be integrated into the user interface of Business Intelligence Development Studio.
The Integration Services run-time API and data flow API let developers extend and customize almost every aspect of the object model. New or existing Integration Services packages can be loaded, modified, and executed programmatically, giving developers the ability to fully automate package maintenance and execution.
For more information, see Integration Services Programming.
For some examples of custom Integration Services development projects, see Programming Samples.
The new SSIS Designer provides an integrated developer experience for designing, creating, testing, and debugging Integration Services packages. The user interface of SSIS Designer lets you build and configure packages by using drag-and-drop methods and by selecting options in dialog boxes for each package object.
SSIS Designer includes these features:
Separate design surfaces for package control flow, data flow, and event handlers.
A hierarchical view of package content in Solution Explorer.
Containers that can be expanded and collapsed for grouping related tasks, allowing easier viewing, organization, and management of package layout.
Annotations that make package control flow, data flows, and event handlers self-documenting.
A connections area for adding connection managers to packages and for referencing data source objects.
Dialog boxes for adding custom variables, configuring logging, creating configurations, and signing packages with digital signatures.
Debugging tools that provide the ability to set breakpoints on packages, containers, and task events, and data viewers for watching data as it moves through the data flow.
A progress window that lists the start time of a package and its tasks, their execution state, and any warnings and error messages.
A graphical representation of package execution including the progress, precedence, and execution outcome of individual tasks, containers, and data flow components.
SQL Server 2005 introduces Business Intelligence Development Studio for building data transformation solutions, and SQL Server Management Studio for managing Integration Services packages. Business Intelligence Development Studio hosts SSIS Designer, the graphical tool for creating Integration Services packages, and provides all the powerful features of the Microsoft Visual Studio development environment to the Integration Services package developer.
Business Intelligence Development Studio integrates seamlessly with the Visual Studio for Applications (VSA) environment, in which the developer writes scripts for the Script task and the Script component. When debugging a package that includes both breakpoints on package events that you set in SSIS Designer, and breakpoints on lines of script code in a Script task that you set in VSA, the package can run to and from breakpoints in the package and the script.
|Breakpoints are not supported in the Script component.|
For more information, see Introducing Business Intelligence Development Studio.
For more information, see SQL Server Management Studio and Business Intelligence Development Studio.
The redesign of the SQL Server Import and Export Wizard puts the focus on copying data. This wizard is the simplest way to quickly create Integration Services packages that copy data between two data stores.
The SQL Server Import and Export Wizard includes many new features, including better support for data in flat files and real-time preview of data. Saved packages created by using the SQL Server Import and Export Wizard can be opened in Business Intelligence Development Studio and extended by using SSIS Designer.
For more information, see Creating Packages Using the SQL Server Import and Export Wizard.
Many new workflow features, including containers for looping, event handlers, and enhanced precedence constraints, give the package developer more precise control over package execution.
Workflow containers provide structure to packages and services to tasks. They support repeating control flows in packages, and they group tasks and containers into meaningful units of work. Integration Services provides the following new containers:
The Sequence container, for grouping tasks and other workflow structures into a unit of work that can be managed as one item. For more information, see Sequence Container.
The For Loop container, for grouping tasks and other workflow structures into a unit of work that is repeated by evaluating an expression. For more information, see For Loop Container.
The Foreach Loop container, for grouping tasks and other workflow structures into a unit of work that is repeated by enumerating objects. For more information, see Foreach Loop Container.
The Integration Services object model supports a nested hierarchy of containers in a package control flow. Workflow containers can also include other containers, providing support for complex package workflow.
For more information, see Integration Services Containers.
New and enhanced tasks make it simple to build packages that solve complex business problems by using only the tasks that Integration Services provides.
For more information, see Integration Services Tasks.
Integration Services includes a number of tasks that perform workflow operations, such as executing other packages, running applications, and sending e-mail messages. Integration Services also includes these new workflow tasks:
The WMI Data Reader task, for querying Windows Management Instrumentation (WMI) data.
The WMI Event Watcher task, for listening to WMI events.
Data Preparation Tasks
Several other new tasks provide data preparation functionality, such as uploading, downloading, and copying files. New data preparation tasks include the following:
The File System task, for performing operations on files and folders in the file system.
The Web Service task, for accessing Web services.
The XML task, for working with XML documents.
Business Intelligence Tasks
Integration Services now includes a number of tasks that perform business intelligence operations, such as processing analytic objects, running scripts written in the Analysis Services data definition language (DDL), and executing data mining prediction queries. Integration Services includes these new business intelligence tasks:
The Analysis Services Execute DDL task for running DDL scripts.
The Data Mining Query task for querying data mining models.
Integration Services includes two tasks that can contain scripts: the Script task, and the ActiveX Script task for legacy support. These scripts run in the package workflow to extend package functionality beyond the functionality provided by the built-in Integration Services tasks.
In SQL Server 2005, Integration Services packages can extract data from many additional types of data sources. To access these data sources, Integration Services provides the following new sources and destinations in addition to the SQL Server, OLE DB, and flat file sources and destinations:
SQL Server 2005 Compact Edition destination, for inserting and updating data into SQL Server 2005 Compact Edition databases.
DataReader source and destination, for consuming and providing data to any .NET Framework data provider.
XML source, for exacting data from XML documents.
Raw File source and destination, for reading and writing raw data to files.
Recordset destination, for creating and populating an in-memory ADODB recordset.
Data Mining Model Training, Data Mining Query, Partition Processing, and Dimension Processing destinations, for working with analytic objects such as mining models, cubes, and dimensions.
Integration Services also includes the Script Component for simplified development of custom sources and destinations.
The Integration Services data flow engine supports data flows that have multiple sources, multiple transformations, and multiple destinations. New transformations make it easy for developers to build packages that have complex data flow without writing any code. These transformations include the following:
Conditional Split and Multicast transformation, for distributing data rows to multiple downstream data flow components.
Union All, Merge, and Merge Join transformations, for combining data rows from multiple upstream data flow components.
Sort transformation, for sorting data and identifying duplicate data rows, and the Fuzzy Grouping transformation for identifying similar data rows.
Lookup and Fuzzy Lookup transformations, for extending data with values from a lookup table.
Term Extraction and Term Lookup transformations, for text mining applications.
Aggregate, Pivot, Unpivot, and Slowly Changing Dimension transformations, for common data warehousing tasks.
Percentage Sampling and Row Sampling transformations, for extracting a sample rowset.
Copy Column, Data Conversion, and Derived Column transformations, for copying and modifying column values, and the Aggregate transformation for summarizing data.
Pivot and Unpivot transformations, for creating normalized data rows from non-normalized data and vice versa.
Integration Services also includes the Script Component for simplified development of custom transformations.
For more information, see Integration Services Transformations.