Data from Other RDBMS Vendors (DW)---a Technical Reference Guide for Designing Mission-Critical DW Solutions

Want more guides like this one? Go to Technical Reference Guides for Designing Mission-Critical Solutions.

It is not uncommon for one or more other database platforms to serve as the source of both transactional (OLTP) data as well as from other data warehouses. Key distinguishing features primarily focus on the connectivity to the other database system drivers, as well as the geographical proximity of the other database (DB) server source to the data warehouse target. Other important items include volume, frequency of extract/load, complexity of data transformation, access windows to extract data.

The primary RDBMS vendors include IBM (DB2), Oracle, Sybase, and Teradata.

Best Practices

This section provides some best practice guidance and resources for more information.

Case Studies and References

The following case studies can be used for reference.

  • SSIS and Data Sources 10 contains details on various sources for SSIS.

  • Detailed explanation for handling File sources are included in Chapter 4, starting on page 97 in SQL Server 2008 Integration Services Problem—Design—Solution.

  • SQL Server has been deployed for many Tier-1 and lower Tier applications by customers. Examples include: Barnes and Noble, Hilton Hotels, Stein Mart.

Questions and Considerations

This section provides questions and issues to consider when working with your customers.

  • Become familiar with other DBMS Connectivity Requirements

  • Determine all connectivity requirements (e.g. connection managers, OLEDB, ODBC, JDBC, and so on).

  • Understand source database design (Star Schema, normalized).

  • Understand if the client will rely on Flat Files versus ETL/ELT (e.g. SSIS).

  • For Flat Files:

    • File structure - Fixed-length

    • File structure - Multi-level/records with record-type indicator

    • File structure - Variable-length, delimited

    • File structure - End of Record (EOR) Delimiter

    • File structure - End of File (EOF) Marker

    • Mapping requirements to external SAN Storage

    • File storage platform and architecture

    • Can extracts be multi-threaded?

  • Understand Data type Mappings between source/target columns.

  • Determine ETL/ELT requirements with regards to:

    • Data volume

    • Data "latency" or timing (how frequently does data need to be loaded)

    • Transformation complexity (from simple table lookups to complex business-oriented /data enrichment transforms)

  • Determine geographical proximity of source and target systems. If there are issues related to network infrastructure and size of source data consider the following strategy:

    • Extract Source Data (from Oracle, Teradata, or another provider), to a .txt (preferably pipe (|) delimited file),.

    • Move this file to disc storage local to destination SQL Server.

    • Load data from flat file to SQL Server.

  • Understand characteristics of the target (is it another SQL Server instance, or a flat file, or another RDBMS or analytical platform (e.g. SSAS)?

  • Establish strategy for archiving source data from external sources. Often source data is only available for a predefined window of time slotted for ETL. This doesn’t provide for possible data re-loading and auditing purposes. Archive source data either by storing flat files that source data was extracted to, or by archiving landing databases containing unchanged source data is recommended practice.

  • Custom SSIS connection managers and tasks can be obtained from trusted third-party providers. One of the examples is SFTP connection manager. Rather than scripting access to secure FTP consider using reliable third-party components as SFTP Task.11

Appendix

Following are the full URLs for the hyperlinked text.

1 Attunity Delivers New Oracle and Teradata Connectors for Microsoft SQL Server 2008 http://www.attunity.com/press_releases.aspx?newsId=1200

2 Microsoft Download Center: Microsoft Connectors for Oracle and Teradata by Attunity http:/

3 ODBC FAQ http://www.orafaq.com/wiki/ODBC_FAQ

4 Using the JDBC Connectivity Layer in Oracle Warehouse Builder http://www.oracle.com/technetwork/articles/datawarehouse/vasiliev-owb-jdbc-183946.html

5 Moving Large Amounts of Data Between Oracle and SQL Server: Findings and Observations http://sqlcat.com/technicalnotes/archive/2008/08/09/moving-large-amounts-of-data-between-oracle-and-sql-server-findings-and-observations.aspx

6 EMC Consulting Blog http://consultingblogs.emc.com/

7 IBM DB2 Driver for ODBC and CLI Overview http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.apdv.cli.doc/doc/c0023378.htm

8 An Overview of DB2 and Java DataBase Connectivity (JDBC)" http://www.ibm.com/developerworks/data/library/techarticle/0203zikopoulos/0203zikopoulos.html

9 Connecting to DB2 from SSIS http://geekswithblogs.net/13DaysaWeek/archive/2010/05/16/connecting-to-db2-from-ssis.aspx

10 SSIS and Data Sources https://social.technet.microsoft.com/wiki/contents/articles/ssis-and-data-sources.aspx

11 SFTP Task http://www.cozyroc.com/ssis/sftp-task