Data from Other RDBMS Vendors (DW)---a Technical Reference Guide for Designing Mission-Critical DW Solutions
Want more guides like this one? Go to Technical Reference Guides for Designing Mission-Critical Solutions. |
It is not uncommon for one or more other database platforms to serve as the source of both transactional (OLTP) data as well as from other data warehouses. Key distinguishing features primarily focus on the connectivity to the other database system drivers, as well as the geographical proximity of the other database (DB) server source to the data warehouse target. Other important items include volume, frequency of extract/load, complexity of data transformation, access windows to extract data.
The primary RDBMS vendors include IBM (DB2), Oracle, Sybase, and Teradata.
Best Practices
This section provides some best practice guidance and resources for more information.
Attunity drivers for Oracle and Teradata:Attunity Delivers New Oracle and Teradata Connectors for Microsoft SQL Server 20081 and Microsoft Download Center: Microsoft Connectors for Oracle and Teradata by Attunity2
ODBC FAQ 3
Using the JDBC Connectivity Layer in Oracle Warehouse Builder 4
Moving Large Amounts of Data Between Oracle and SQL Server: Findings and Observations"5
IBM (DB2)-Related: Often DB2 sources require DB2 version specific drivers for SSIS to connect to DB2 source. Consult relevant owners of DB2 source systems, obtain appropriate drivers and have them installed in all environments SSIS will be executed in (Development, Test, Production).
Case Studies and References
The following case studies can be used for reference.
SSIS and Data Sources 10 contains details on various sources for SSIS.
Detailed explanation for handling File sources are included in Chapter 4, starting on page 97 in SQL Server 2008 Integration Services Problem—Design—Solution.
SQL Server has been deployed for many Tier-1 and lower Tier applications by customers. Examples include: Barnes and Noble, Hilton Hotels, Stein Mart.
Questions and Considerations
This section provides questions and issues to consider when working with your customers.
Become familiar with other DBMS Connectivity Requirements
Determine all connectivity requirements (e.g. connection managers, OLEDB, ODBC, JDBC, and so on).
Understand source database design (Star Schema, normalized).
Understand if the client will rely on Flat Files versus ETL/ELT (e.g. SSIS).
For Flat Files:
File structure - Fixed-length
File structure - Multi-level/records with record-type indicator
File structure - Variable-length, delimited
File structure - End of Record (EOR) Delimiter
File structure - End of File (EOF) Marker
Mapping requirements to external SAN Storage
File storage platform and architecture
Can extracts be multi-threaded?
Understand Data type Mappings between source/target columns.
Determine ETL/ELT requirements with regards to:
Data volume
Data "latency" or timing (how frequently does data need to be loaded)
Transformation complexity (from simple table lookups to complex business-oriented /data enrichment transforms)
Determine geographical proximity of source and target systems. If there are issues related to network infrastructure and size of source data consider the following strategy:
Extract Source Data (from Oracle, Teradata, or another provider), to a .txt (preferably pipe (|) delimited file),.
Move this file to disc storage local to destination SQL Server.
Load data from flat file to SQL Server.
Understand characteristics of the target (is it another SQL Server instance, or a flat file, or another RDBMS or analytical platform (e.g. SSAS)?
Establish strategy for archiving source data from external sources. Often source data is only available for a predefined window of time slotted for ETL. This doesn’t provide for possible data re-loading and auditing purposes. Archive source data either by storing flat files that source data was extracted to, or by archiving landing databases containing unchanged source data is recommended practice.
Custom SSIS connection managers and tasks can be obtained from trusted third-party providers. One of the examples is SFTP connection manager. Rather than scripting access to secure FTP consider using reliable third-party components as SFTP Task.11
Appendix
Following are the full URLs for the hyperlinked text.
1 Attunity Delivers New Oracle and Teradata Connectors for Microsoft SQL Server 2008 http://www.attunity.com/press_releases.aspx?newsId=1200
2 Microsoft Download Center: Microsoft Connectors for Oracle and Teradata by Attunity http:/
3 ODBC FAQ http://www.orafaq.com/wiki/ODBC_FAQ
4 Using the JDBC Connectivity Layer in Oracle Warehouse Builder http://www.oracle.com/technetwork/articles/datawarehouse/vasiliev-owb-jdbc-183946.html
5 Moving Large Amounts of Data Between Oracle and SQL Server: Findings and Observations http://sqlcat.com/technicalnotes/archive/2008/08/09/moving-large-amounts-of-data-between-oracle-and-sql-server-findings-and-observations.aspx
6 EMC Consulting Blog http://consultingblogs.emc.com/
7 IBM DB2 Driver for ODBC and CLI Overview http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.apdv.cli.doc/doc/c0023378.htm
8 An Overview of DB2 and Java DataBase Connectivity (JDBC)" http://www.ibm.com/developerworks/data/library/techarticle/0203zikopoulos/0203zikopoulos.html
9 Connecting to DB2 from SSIS http://geekswithblogs.net/13DaysaWeek/archive/2010/05/16/connecting-to-db2-from-ssis.aspx
10 SSIS and Data Sources https://social.technet.microsoft.com/wiki/contents/articles/ssis-and-data-sources.aspx
11 SFTP Task http://www.cozyroc.com/ssis/sftp-task