by Jamie Thomson, SQL Server MVP and Senior Technical Consultant at
Conchango PLC. You can view Jamie’s blog at http://blogs.conchango.com/jamiethomson/.
Introduction
This article explains what SQL Server Integration Services
configurations are used for, why you should use Integration Services
configurations, and what options you have for leveraging configurations. It
will also make some simple recommendations that are based on my experiences of
building Integration Services packages in a real-world environment. An
understanding of the terms “package”, “Business Intelligence Development
Studio”, and “dtexec.exe” in the context of Integration Services is assumed.
Why Use Configurations?
Integration Services packages are used primarily to manage
the movement of data from one place to another. To do that, the package needs
to know where it is moving data from
and where it is moving data to. (I
call these places the data sinks.)
Typically Integration Services packages are built on a
different environment to where they are intended to be executed in production.
For that reason alone, it is important that the package developer has a way of
informing the package where those data sinks are located.
What Is an Integration Services Configuration?
People often refer to the .dtsConfig file that the
configuration wizard within the Business Intelligence Development Studio
produces as an Integration Services configuration, but that is not the case. A
.dtsConfig file is actually a collection
of Integration Services configurations, and that distinction is important.
A configuration is a name-value pair. The name is a property
within a package that needs to be changed, and the value is the value to be
assigned to that property. In the following example configuration taken from a
.dtsConfig file, the property is ServerName,
and its new value is SomeServer:
<Configuration ConfiguredType="Property"
Path="\Package.Connections[AdventureWorks].Properties[ServerName]"
ValueType="String">
<ConfiguredValue>SomeServer</ConfiguredValue>
</Configuration>
Keep this important point in mind: a .dtsConfig file is not
a configuration, it is a collection
of one or more configurations.
Here is another example taken from a configuration that is
stored in a SQL Server database. In this example, I have highlighted the
property to be changed and the value that will be assigned to it:
.jpg)
Figure 1: Example of a SQL Server configuration
There are several other methods of providing a configuration
using this name-value pair notation. Those methods will be explored later in
this article.
Integration Services Properties and Objects
An Integration Services package consists of many objects,
most of which fall into one of the following categories:
-
Containers (this includes all
tasks, as well as other containers such as the ForEach Loop Enumerator)
-
Variables
-
Connection Managers
-
Data Flow Components
However you cannot apply a configuration to an object; you
can only apply a configuration to a property
of an object. For example, if you wanted to change the value returned when
referencing a package variable, you would apply the configuration to the Value property of the variable, not to
the variable object itself. In the example property path below, you can see how
we are calling out a particular variable, and then the Value property of that variable:
\Package.Variables[User::MyVar].Properties[Value]
Property paths are integral to how configurations work, and
learning the syntax can be helpful, although it is not required.
Configuration Types
There five basic types of Integration Services
configurations.
XML Configuration File
This is the most commonly used configuration type. An XML
file, which by default has the .dtsConfig suffix, is created that contains one
or more configurations. XML configuration files are popular because they can be
easily deployed onto multiple environments simply by copying the file.
A disadvantage of XML configuration files is that the path
to the file must be hard-coded within the package. However this requirement can
be alleviated by using indirect configurations.
“Indirect configurations” is a term used to describe the
practice of using an operating system environment variable to redirect the
Integration Services execution engine to the location of the XML configuration
file.The term “indirect configuration”
is a slight misnomer, because the environment variable does not point at a
configuration, it points at a configuration file; but this has become the
accepted nomenclature.
A configuration stored in an XML configuration file stores
both the property path and the value outside the package.
Environment Variable Configuration
An environment variable configuration takes the value for a
property from whatever is stored in a named environment variable.
An environment variable configuration stores the property
path inside the package and the value outside the package.
Note that an environment variable configuration by itself is
not an example of an indirect
configuration.
Parent Package Configuration
Parent package configurations are used to fetch a value from
a variable in a calling package.
Just like environment variable configurations, a parent
package configuration stores the property path inside the package and the value
outside the package.
Registry Configuration
The value to be applied to a package property is stored in a
registry entry.
Just like environment variable configurations and parent
package configurations, a registry configuration stores the property path
inside the package and the value outside the package.
SQL Server Configuration
SQL Server configurations are stored in a SQL Server table.
The table can have any name you like, and can be in any database on any server
that you like.
If you are using a SQL Server configuration, then you can
and should make it an indirect configuration. As explained earlier in this
article, when using indirect configurations, an environment variable stores the
location of the configurations themselves.
A configuration stored in SQL Server stores both the
property path and the value outside the package.
Feature Comparison of Configuration Types
This table provides a high-level feature comparison of each
configuration type.
Configuration Type
|
Where is the property path stored?
|
Where is the value stored?
|
Indirect configurations possible?
|
XML Configuration File
|
External to the package
|
External to the package
|
Yes
|
Environment variable
|
In the package
|
External to the package
|
No
|
Parent package
|
In the package
|
External to the package
|
No
|
Registry
|
In the package
|
External to the package
|
No
|
SQL Server
|
External to the package
|
External to the package
|
Yes
|
Best practices for Integration Services configurations
Consider command-line options as an alternative to
configurations
You may expect that I am going to recommend that you should
always define configurations in your Integration Services packages. Although I
generally recommend that approach, Integration Services provides an important
alternative to defining configurations within packages - namely the /SET and
/CONFIGFILE options of dtexec.exe:
-
The /SET option of dtexec lets
the person running the package apply a value to some property in the package
that is being run. The property in question is defined using the same property
path syntax as discussed previously.
-
The /CONFIGFILE option of
dtexec lets the person running the package tell the package to use an XML
configuration file, even if one has not been defined in the package.
There are significant advantages to using /SET and
/CONFIGFILE. Above all, these run-time options mean that your packages can be
more portable. That is, they do not have any dependencies on the execution
environment. All of the configuration options discussed previously rely on the
existence of either a file location, an environment variable, a SQL Server
instance, a registry entry or a parent package. However, if you don’t define
any configurations in yourpackages,
then /SET and /CONFIGFILE let you take your packages to any environment and
still provide configuration values at run time.
There are also disadvantages to using the /SET and
/CONFIGFILE options. For example, the Execute Package Task does not have a
similar feature that reproduces their behavior. Therefore, if you have a
solution that uses parent and child packages, you will probably find that you
need to make use of parent package configurations. Another disadvantage of /SET
and /CONFIGFILE is that there is no similar feature in Business Intelligence
Development Studio that lets you test those options at design time.
The /SET and /CONFIGFILE options are very powerful features,
but they are most appropriate for a particular class of packages - namely
,packages that encapsulate all functionality in one place and are not expected
to be called from other packages. Of course, you can use /SET and /CONFIGFILE
alongside normal configurations, and in many cases that may be the best
approach.
Understand the limitations of configurations
Not all properties can be set by using configurations.
Consider the Transfer SQL Server Objects Task, for example:
.jpg)
Figure 2: Properties of the Transfer SQL Server
Objects task
The TablesList, ViewsList, StoredProceduresList,
UserDefinedFunctionsList, DefaultsList, UserDefinedDataTypesList, and others
are all properties of the task, but they are also collections. In both SQL
Server 2005 and SQL Server 2008 Integration Services, we cannot set
configurations for properties that are collections.
Use Configurations Sparingly
Just because it is possible to set a configuration on a
property, it may not be sensible to do so.Be prudent when planning your configurations, and only apply them where
necessary. It is rare that you need more configurations than there are connection
managers in your package.
Configure Only the ConnectionString Property for
Connection Managers
After reading that statement, you may ask: “But what if I want to set the username and
password for each connection manager? That’s two configurations already.”
That’s true, but this leads us to my next tip.If you want to put configurations on a
connection manager, then you should apply a configuration to the
ConnectionString property only. There is no need to apply a configuration for
each of the ServerName, UserName, Password, and InitialCatalogproperties. Doing so would be cumbersome, and
many connection strings require more information than those four properties can
provide. So, in the interests of flexibility and ease of administration, ignore
those four properties in favor of the ConnectionString property.
Note: http://www.connectionstrings.com
is a great resource for help in building connection strings.
Use Indirect Configurations in a Team Environment
If your Integration Services solution involves multiple
Integration Services developers, then it is sometimes preferable for each
developer to develop locally in a sandbox environment. In this situation, each
developer requires configuration definitions that are local to his or her
sandbox. In this scenario you should use a flexible configuration type such as
indirect configurations that can easily be adapted for each developer’s
computer.
Conclusion
This article has provided an overview of configurations in
Integration Services. It has explained what a configuration is and what it is
not, and the different configuration types. It has explored alternatives to
using configurations, and provided some simple tips for their use.
About the author. Jamie Thomson is a SQL Server MVP based
in London, UK. He specializes in SQL Server Integration Services and frequently
blogs in depth about Integration Services issues, techniques, and best
practices.