Improving Incremental Loads with Change Data Capture
In SQL Server, change data capture offers an effective solution to the challenge of efficiently performing incremental loads from source tables to data marts and data warehouses.
Source tables change over time. A data mart or data warehouse that is based on those tables needs to reflect these changes. However, a process that periodically copies a snapshot of the entire source consumes too much time and resources. Alternate approaches that include timestamp columns, triggers, or complex queries often hurt performance and increase complexity. What is needed is a reliable stream of change data that is structured so that it can easily be applied by consumers to target representations of the data. Change data capture in SQL Server provides this solution.
The change data capture feature of the Database Engine captures insert, update, and delete activity applied to SQL Server tables, and makes the details of the changes available in an easily-consumed, relational format. The change tables used by change data capture contain columns that mirror the column structure of the tracked source tables, along with the metadata needed to understand the changes that have occurred on a row by row basis.
Note |
|---|
Change data capture is available only in SQL Server 2008 Enterprise, Developer, and Evaluation editions. |
For more information:Change Data Capture
An Integration Services package can easily harvest the change data in the SQL Server databases to perform efficient incremental loads to a data warehouse. However, before you can use Integration Services to load change data, an administrator must enable change data capture on the database and the tables from which you want to capture changes. For more information on how to configure change data capture on a database, see Configuring Change Data Capture.
Once an administrator has enabled change data capture on the database, you can create a package that performs an incremental load of the change data. The following diagram shows the steps for creating such a package that performs an incremental load from a single table:

As shown in the previous diagram, creating a package that performs an incremental load of changed data involves the following steps:
Working with Change Data from Multiple Tables
The process outlined in the previous diagram and steps involves an incremental load from a single table. When having to perform an incremental load from multiple tables, the overall process is the same. However, the design of the package needs to be changed to accommodate the processing of multiple tables. For more information on how to create a package that performs an incremental load from multiples tables, see Performing an Incremental Load of Multiple Tables.
Integration Services provides two samples that demonstrate how to use change data capture in packages. For more information, see the following topics:
Readme_Change Data Capture for Specified Interval Package Sample
Readme_Change Data Capture since Last Request Package Sample

Note