Data Interoperability Overview
How Microsoft Makes Heterogeneous Enterprises Cross-Platform Conversant
The majority of mission-critical corporate data resides in diverse data stores on a variety of computing platforms. Therefore, successful enterprise solutions must be able to function within heterogeneous environments and access data, regardless of form or location.
The foundation for developing enterprise data interoperability solutions on the Microsoft Windows® platform is the Microsoft Windows Distributed interNet Applications Architecture. Otherwise known as Windows DNA, this architecture is based on the widely used Component Object Model (COM), and specifies how to develop robust, scalable, distributed applications using the Windows platform. It also extends existing data and external applications to support Web-based applications and a wide range of client devices, maximizing the reach of applications.
The Keys: Interoperability and Reuse
Two key qualities of Windows DNA are its interoperability and its capability of reuse. Unlike traditional software development, which requires each application to be built from scratch, the COM enables developers to create complex applications using a series of small software objects (COM components). Examples of a component could include a credit card authorization procedure, or business rules for calculating shipping costs. The COM programming model speeds up the development process by enabling multiple development teams to work on different parts of an application simultaneously.
COM also offers the advantage of programming language independence. This means that Windows developers can create COM components using familiar tools and languages, such as Microsoft Visual Basic® and Visual C++®. For non-Windows programmers, including mainframe COBOL and Web publishers, COM components can be accessed from simple scripting languages, such as VBScript and Microsoft JScript®. Windows DNA simplifies development by providing access to a wide range of services and products developed using a consistent object model: COM.
An example of these services is what Microsoft refers to as "COM services for interoperability," which include network, data, application, and management services that are available as part of existing Microsoft products, such as SQL Server and Microsoft SNA Server. COM services for interoperability provide a common approach to system integration using the wide range of COM components available today.
The Framework: Four-Layer Interoperability
Microsoft has defined a four-layer framework for interoperability based on industry standards for network, data, applications, and management. Microsoft provides access to interoperability components in each of these four categories. This document focuses on the Data interoperability layer, providing an overview of the wide range of COM components available for accessing multiple data stores across an enterprise environment.
Enterprises run their daily operations by relying on multiple data sources, including database servers, legacy flat-file records, e-mail correspondence, personal productivity documents (spreadsheets, reports, or presentations), and Web-based information publishing servers. Typically, applications, end-users, and decision-makers access these data sources by employing a variety of non-standard interfaces. Data interoperability standards allow users to transparently access and modify data throughout the enterprise.
Universal Data Access is What Everyone Wants
The Microsoft data interoperability strategy, known as Universal Data Access, uses COM components to provide one consistent programming model for access to any type of data, regardless of where that data may be found in the enterprise. As an easy-to-use programming architecture that is both tool and language independent, Universal Data Access provides COM objects for high-performance access to a variety of relational and non-relational information sources.
The technologies that comprise the Universal Data Access strategy enable organizations to integrate diverse data sources, create easy-to-maintain solutions, and use their choice of best-of-breed tools, applications, and platform services. To leverage existing investments, Universal Data Access does not require expensive and time-consuming movement of data into a single data store, nor does it require commitment to a single vendor's data products. Instead, Universal Data Access is based on open industry specifications with broad industry support, and works with all major established database platforms.
The Microsoft Data Access Components (MDAC) are the key technologies that enable Universal Data Access. Through MDAC, data-driven client/server applications deployed over the Web or an LAN can easily integrate information from a variety of sources, both relational and non-relational. These technologies include Open Database Connectivity (ODBC), OLE DB, and Microsoft ActiveX® Data Objects (ADO).
Industry-Standard ODBC and OLE DB for Cross-Platform Access
ODBC is an industry standard and a component of Microsoft Windows Open Services Architecture. The ODBC interface makes it possible for applications to access data stored in almost any relational database management system. The Microsoft Open Database Connectivity data access interface continues to provide a unified way to access relational data as part of the OLE DB specification. ODBC is a widely accepted application programming interface (API) for database access, and is based on the Call-Level Interface (CLI) specifications from X/Open and ISO/IEC for database APIs. It also uses Structured Query Language (SQL) as its database access language.
ODBC is widely supported by Microsoft, third-party application development products, and end-user productivity applications. Microsoft has also implemented a number of ODBC drivers to access diverse back-end data stores. Additionally, OLE DB includes a bridge to ODBC, to enable continued support for the broad range of ODBC-relational database drivers already available today. The Microsoft OLE DB Provider for ODBC leverages existing ODBC drivers, thereby ensuring immediate access to relational databases for which an ODBC driver exists, but for which an OLE DB provider has not been written yet.
Microsoft offers a number of ODBC drivers as part of the Microsoft Data Access Components, which ships as a feature of many popular Microsoft products, including SQL Server, Microsoft Office, Microsoft BackOffice, Microsoft SNA Server, and Microsoft Visual Studio®. The following ODBC drivers are included in MDAC version 2.1:
Microsoft ODBC Driver for SQL Server
Microsoft ODBC Driver for Oracle
Microsoft ODBC Driver for Microsoft Visual FoxPro®
Microsoft ODBC Driver for Access (Jet engine)
Microsoft SNA Server 4.0 with Service Pack 2 ( Microsoft ODBC Driver for DB2)
Many third-party independent software vendors offer ODBC drivers for many back-end data sources. In addition, for Java programmers, JDBC is a technology for accessing SQL database data from a Java client program. Microsoft offers a JDBC-to-ODBC bridge that allows Java programmers to access back-end data sources using available ODBC drivers. The Microsoft JDBC-ODBC bridge is part of the core set of classes that come with the Microsoft Virtual Machine (VM) for Java. For more information, see http://www.microsoft.com/java/.
OLE DB: Providers, Consumers, and Service Components
OLE DB is a strategic system-level programming interface to move data across an organization. It is an open specification designed to build on the success of ODBC by providing an open standard for accessing all kinds of data. While ODBC was created to access relational databases, OLE DB is designed for relational and non-relational information sources alike.
OLE DB components consist of data providers (which contain and expose data), data consumers (which use data), and service components such as query processors and cursor engines (which gather and sort data). OLE DB interfaces are designed to help diverse components integrate smoothly, so that OLE DB component vendors can quickly bring high-quality products to market.
OLE DB data providers implement a set of core OLE DB interfaces that offer basic functionality, enabling other OLE DB data providers, service components, and consumer applications to interoperate in a standard, predictable manner. Data consumers can be any software program requiring access to a broad range of data, including development tools, personal productivity applications, database services products, or OLE DB service components. OLE DB service components implement functionality not natively supported by some simple OLE DB data providers.
Universal Data Access allows for the development of generic OLE DB consumer applications that access many back-end data sources in a single, uniform manner. This allows enterprises to pool resources and migrate from one back-end data store to another as efficiencies allow or business needs require.
Bridging to High-Level Programming Languages with ActiveX Data Objects (ADO)
ActiveX Data Objects are a major OLE DB data consumer. They provide a means to develop flexible and efficient data interoperability solutions using high-level programming languages such as Visual Basic. Wrapped around OLE DB, ADO are a strategic application-level programming interface to data and information, and they provide consistent, high-performance access to data while supporting a variety of development needs, including the creation of front-end database clients and middle-tier business objects that use applications, tools, languages, or Internet browsers.
ADO are designed to be the one data interface needed for single and multi-tier client/server and Web-based data-driven solution development. Their primary benefits are ease of use, high speed, low memory overhead, and a small disk footprint. ADO provide an easy-to-use interface to OLE DB, which in turn provides the underlying access to data. ADO are also easy to implement because they use a familiar metaphor, the COM automation interface, incorporated into all leading Rapid Application Development (RAD) tools, database tools, and languages (including scripting languages).
ADO uses a flatter and more flexible object model than any previous object-oriented data access technology. Any of the five top-level objects can exist independently of the other four. Unlike Data Access Objects, which require constructing a hierarchical chain of objects to be created prior to accessing data, ADO can be used to access data with just a couple lines of code. The secret of ADO's strength is due to the fact that it can connect to any OLE DB provider and still expose the same programming model, regardless of the specific features implemented by a particular provider.
How SQL Server 7.0 Accesses and Integrates Heterogeneous Data Sources
SQL Server 7.0 offers a number of tools for accessing and integrating enterprise data sources with SQL Server-stored data. Among these tools are three key features of SQL Server 7.0: Distributed Query Processor (DQP), Data Transformation Services (DTS), and Replication.
DQP enables application developers to develop heterogeneous queries that join tables in disparate databases, and DTS can simultaneously access multiple heterogeneous sources on local and remote computers. In addition, DQP supports queries against both relational and non-relational data by using OLE DB interfaces implemented in OLE DB providers. Using DQP, SQL Server administrators and developers can create linked server queries that run against multiple back-end data sources with little or no modifications required.
DTS is a data extraction, transformation, and loading tool that can use any data source accessible via an ODBC driver or OLE DB provider. It accesses many popular data sources directly via OLE DB providers for SQL Server, Access, Oracle, and DB2. Access to other data sources is accomplished via the Microsoft OLE DB Provider for ODBC. Enterprise decision-makers need mission-critical data in real time and, in many cases, this data is stored in a variety of formats and in a variety of back-end stores. Using DTS, organizations can import and export data between multiple heterogeneous sources using an OLE DB–based architecture. They can even pull data into SQL Server and use the Microsoft OLAP Server to analyze complex information or create a local data warehouse on which to develop an online transaction processing (OLTP) system for Web commerce.
Replication is a powerful tool for distributing data across an enterprise. The replication technology in SQL Server creates duplicate copies of data and moves the copies to non-SQL Server data sources, synchronizing the data types and data values automatically. The replication feature of SQL Server 7.0 is based on the "publish and subscribe" model and enables incremental data movement from SQL Server to other data sources. Subscribers can be one of many popular enterprise-wide data sources that are accessible via an ODBC driver or OLE DB provider, such as Oracle and DB2 databases. For more information on how SQL Server integrates heterogeneous data sources.
With the advent of technologies such as OLE DB and ADO, it is becoming increasingly easier to bridge the gap between the islands of data that exist in the enterprise. Developers are now able to build enterprise solutions that leverage all the investments that companies have made over the past 20 years in data storage infrastructure. Microsoft's commitment to developing technologies to simplify interoperability is focused on providing a consistent set of high performance interfaces to provide easy access to the widest range of systems.