Share via


Lookup Transformation

The Lookup transformation performs lookups by joining data in input columns with columns in a reference dataset. The reference dataset can be an existing table or view, a new table, or the result of an SQL statement. The Lookup transformation uses an OLE DB connection manager to connect to the database that contains the data that is the source of the reference dataset. For more information, see OLE DB Connection Manager

You can configure the Lookup transformation in the following ways:

  • Specify the table or view that contains the reference dataset.
  • Generate a reference table by specifying an SQL statement.
  • Specify joins between the input and the reference dataset.
  • Add columns from the reference dataset to the transformation output.
  • Specify that part or all of the reference dataset be cached to improve performance, by using a caching SQL statement.
  • Map parameters in a cached SQL statement to input columns.
  • Specify how many megabytes of memory the transformation can use in 32-bit and 64-bit environments. The 32-bit environment imposes a limit of 3072 megabytes, whereas the 64-bit environment does not.

The Lookup transformation performs an equi-join between values in the transformation input and values in the reference dataset. Using an equi-join means that each row in the transformation input must match at least one row from the reference dataset. If there is no matching entry in the reference dataset, no join occurs and no values are returned from the reference dataset. This is an error, and the transformation fails, unless it is configured to ignore errors or redirect error rows to the error output. If there are multiple matches in the reference table, the lookup returns only the first match based on the lookup query. If multiple matches are found, no error or warning is generated unless the Lookup transformation is configured to use full precaching. If full precaching is used, a warning is generated when multiple matches are detected as the cache is filled.

The join can be a composite join, which means that you can join multiple columns in the transformation input to columns in the reference dataset. The transformation supports join columns with any data type, except for DT_R4, DT_R8, DT_TEXT, DT_NTEXT, or DT_IMAGE. For more information, see Integration Services Data Types.

Typically, values from the reference dataset are added to the transformation output. For example, the Lookup transformation can extract a product name from a table using a value from an input column, and then add the product name to the transformation output. The values from the reference table can replace column values or can be added to new columns.

The lookups performed by the Lookup transformation are case sensitive. You can avoid lookup failures caused by case differences in data by first converting the data to uppercase or lowercase using the Character Map transformation, and then including the UPPER or LOWER functions in the SQL statement that generates the reference table. For more information, see Character Map Transformation, UPPER (Transact-SQL), and LOWER (Transact-SQL).

You can also avoid lookup failures by using no caching and reading from a case-insensitive database.

This transformation has one input, one regular output, and one error output.

Caching Types

Memory for a reference dataset can be allocated in the following ways:

  • Full precaching, in which the complete reference dataset is read before the input is processed. This is the default caching type. To configure a Lookup transformation to use full precaching, make sure that all options on the Advanced tab of the Lookup Transformation Editor are cleared.
  • Partial caching, in which the Lookup transformation specifies the size of the cache that is loaded with reference data. This option is available only with connections that support keyed access. To configure a Lookup transformation to use partial caching, on the Advanced tab of the Lookup Transformation Editor, select Enable memory restriction, and then select Enable caching or Modify the SQL statement or both Enable caching and Modify the SQL statement .
  • No caching, in which the reference dataset is accessed by each row in the rowset. To configure a Lookup transformation to use no caching, on the Advanced tab of the Lookup Transformation Editor, select Enable memory restriction and clear all other options .

For more information about the Advanced tab, see Lookup Transformation Editor (Advanced Tab).

A Lookup transformation that has been configured to use partial or no caching will fail if a lookup operation matches columns that contain null values, unless you manually update the SQL statement to include an OR ISNULL(ColumnName) condition. If full precaching is used, the lookup operation succeeds.

Note

If possible, avoid using columns that may contain null values in lookup operations. If a column contains null values, configure the Lookup transformation to use an error output that directs rows that have no matching rows in the reference table to a separate transformation output. Alternatively, consider using full caching, which supports lookup operations on null values.

Integration Services and SQL Server differ in the way they compare strings. If the Lookup transformation is configured to use full precaching, Integration Services does the lookup comparison in the cache; otherwise, the lookup operation uses a parameterized SQL statement and SQL Server does the lookup comparison. This means that the Lookup transformation may return a different number of matches from the same lookup table depending on the cache type.

Caching SQL Statement

The Lookup transformation can be configured to use a caching SQL statement. The SQL statement can select a subset of the reference dataset, limiting its size. If the reference dataset is very large, using a caching statement limits the amount of memory used by the reference dataset.

The caching SQL statement can use values or parameters in its WHERE clause. The parameters are mapped to input columns and are updated by values in the input columns at run time.

Troubleshooting the Lookup Transformation

Starting in Microsoft SQL Server 2005 Service Pack 2 (SP2), you are able to log the calls that the Lookup transformation makes to external data providers. You can use this new logging capability to troubleshoot the populating of the cache from external data sources that the Lookup transformation performs. To log the calls that the Lookup transformation makes to an external data provider, enable package logging and select the Diagnostic event at the package level. For more information, see Troubleshooting Package Execution.

Configuring the Lookup Transformation

You can set properties through SSIS Designer or programmatically.

For more information about the properties that you can set in the Lookup Transformation Editor dialog box, click one of the following topics:

The Advanced Editor dialog box reflects the properties that can be set programmatically. For more information about the properties that you can set in the Advanced Editor dialog box or programmatically, click one of the following topics:

For more information about how to set properties, click one of the following topics:

See Also

Concepts

Fuzzy Lookup Transformation
Term Lookup Transformation
Creating Package Data Flow
Integration Services Transformations

Help and Information

Getting SQL Server 2005 Assistance

Change History

Release History

12 December 2006

New content:
  • Added information about how to use the UI to enable different cache types.
  • Added information about how SQL Server 2005 SP2 includes new logging messages that enable users to troubleshoot the calls that the transformation makes to external data providers.

14 April 2006

New content:
  • Described the ability to specify cache sizes for 32-bit and 64-bit environments.
  • Added information about effects of using different cache types.

5 December 2005

Changed content:
  • Clarified matching behavior.