ADO Limitations in the Programmable Web

Article
04/05/2007

This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.

Features
ADO.NET: A Bridge to the Future
Understand the newest incarnation of ADO

Dino Esposito

Before the Web entered its programming age, data access was a relative issue for most IT managers and consultants; you had to take what you got. The primary concern was choosing the most cost-effective database server, and all the modules of an involved system had to comply with the server. Client/server applications were the typical expression of this two-tier model.

As the Web became more interactive and widely used, the need for a third, middle tier became obvious. The middle tier is a logical layer in which the data-access components usually reside. The data-access component is the only piece of code that needs to know the details of the database, and the data-access component is the first piece of code you need to modify if you have to port or upgrade your server.

Then, the three-tier model of systems grew into Windows DNA—now called Microsoft .NET Server—systems. Microsoft approached data access by using a unifying model that focused on content rather than data format and storage media. This model found its concrete expression in the Universal Data Access (UDA) strategy, the inspiring principle behind the OLE DB architecture. Microsoft designed ADO to provide easy, seamless access to OLE DB functionality through Visual Basic (VB) and Active Server Pages (ASP) COM components. ADO 2.0 was the first version of the .NET framework to support OLE DB. ADO evolved to version 2.6 within a couple of years by adding and enhancing XML support until ADO could extend its object model to match the functionality of any OLE DB enhancement (e.g., ADO 2.5 matched OLE DB's introduction of Row and Stream objects with similar ADO functionality).

ADO 2.0's key features exceed those of OLE DB. The advent of middle-tier components raised the question of how to provide fresh data to the presentation layer of multi-tiered systems. How does this presentation layer access that data? How do you open connections? Or should you manipulate disconnected record arrays (i.e., some presentation-layer records for which you dropped the connection)? ADO 2.0 and later provided both options through server cursors and disconnected recordsets (i.e., COM objects serialized over the network that you can download to the client for further offline use).

Although server cursors represent a tightly coupled and connected environment, where you always maintain open channels between tiers and drop connections only when you finish, a disconnected recordset is a stateful object that you can manipulate as a whole without needing to maintain a connection. A disconnected recordset uses a static, client-side cursor and can offer a data-source snapshot. A disconnected recordset is great for all those applications that have read-only purposes and need to move data between various system tiers. Most of today's Web applications require the transmission of data between tiers. To pull this data out, you often use fast, forward-only recordsets, encode the data in XML, then send it over the network to avoid having to maintain the session state on the Web server. (For more information about forward-only recordsets, see Michael Otey, SQL Seven, "ADO Performance Tips," page 80.) Because connections are crucial resources in a distributed environment, working disconnected ensures a high degree of scalability.

The Web, however, is a double-edged sword. The Web lets you connect to and interact with any kind of online service. However, the Web also requires a certain degree of interoperability because the involved services might run on different software and hardware platforms. You can work across heterogeneous platforms, using open standards and disregarding proprietary technology—even such a broadly used proprietary technology as COM+.

Today, Windows Web data-access applications take advantage of an ADO object's rich, convenient programming interface. However, ADO objects inherently target the Windows platform. Their COM-based nature makes ADO Recordsets difficult to use in a distributed, heterogeneous environment. In addition, whereas the target platform might let you use ADO Recordsets, they don't have the most efficient architecture. ADO.NET DataSet and DataReader are more efficient; however, if you don't have ADO.NET, sometimes you can be more efficient with plain XML or text.

Microsoft has optimized ADO Recordsets for data transmission in a Web environment, but COM type conversion remains a necessary step because COM types don't always match the ADO Recordset's data types (e.g., a String type must become a BSTR type). Thus, many people use XML as a universal glue between tiers—no matter which platforms are involved. Usually, you retrieve a recordset, save it to XML, transmit the resulting data stream, and let the receiver rebuild a recordset for further use. As interoperability and scalability increase, ADO doesn't give the best possible answer because it's not XML-based—ADO.NET is.

The .NET Framework promotes a new component model that replaces COM and COM+. The introduction of .NET is the next step in Microsoft's maturing component technology. Although several key COM features don't appear in .NET, .NET is similar in some ways to COM programming. Therefore, COM programmers should have an easy adjustment to .NET development. Microsoft specifically designed ADO.NET to be the data-access layer of the .NET Framework, and ADO.NET largely takes advantage of .NET's features.

Why does .NET need a new data-access layer to replace an existing and widely used one such as ADO? Microsoft designed .NET to meet the challenge of designing modern Web systems that retain the interactive behavior of client/server and desktop applications. In addition, .NET exploits the broad connectivity and interoperability of Web protocols.

ADO Recordsets aren't usable in native mode on non-Windows platforms, resulting in limited interoperability. To work around that limitation, you convert recordsets to XML and transmit the resulting XML recordsets. In ADO.NET the process of converting data to XML and transferring it over the network has been simplified and optimized. In addition, a database-centric data vision pervades the ADO object model. ADO sees data as a set of records from a data source, instead of as unique pieces of information. Data can't exist in ADO without the structure that the data provider uses to store or render the data.

ADO.NET DataSets and ADO Recordsets

ADO.NET is ADO revisited and improved from a Web point of view. Microsoft designed ADO.NET to be exactly what its name suggests: ADO plus .NET. ADO.NET is automatically connected to the network and is devoted to making Web data access easier and more effective. Two features enable these enhancements: disconnected recordsets and native support for XML. The absence of server cursors in ADO.NET is a natural consequence of the disconnected recordset approach. In native mode, ADO.NET stores a record table as an XML document and treats schema and data as distinct and replaceable elements.

If you think that these features aren't innovative because ADO already offers them, note that ADO.NET gives you many other options as well. ADO.NET can use connected or disconnected recordsets, depending on which cursor type and location you choose for a recordset. The ADO native storage format for a recordset is the Advanced Data TableGram (ADTG) file format, which is a proprietary, binary schema that represents a recordset's in-memory image. XML is an alternative, fixed, verbose output format. In ADO.NET, you can disconnect and natively render a recordset collection by using a default, but modifiable, XML schema.

The ADO.NET object model's principal object is the DataSet. Generally, a DataSet object is a recordset collection. The ADO.NET Framework provides all the recordset's database characteristics: sorting, paging, filtered views, relations, indexing, and primary keys.

A DataSet object represents an in-memory, feature-rich data cache. A DataSet object also organizes data in tables, each of which has no connection to the original data source. You can add tables that you obtained from a query by reading a local or remote XML file or loading the table from any accessible system resource, including memory and attached devices. You can sort, index, filter, or navigate a data table as an ADO Recordset.

You can use commands to fill a DataSet object with data collections. The same DataSet object can service multiple requests from multiple connections if you supply it with data tables in a .NET collection form. (The .NET data type that features a collection is ICollection.) A DataSet object is more generic than an ADO Recordset and, in contrast, abstracts the data source. However, a DataSet object remains a data store that works in memory; it doesn't fully replace a recordset. If you need to scroll a set of records only once and produce some sort of output, you should use the DataReader object, the .NET counterpart of forward-only, read-only recordsets. The DataReader object is a highly specialized object and, thus, lighter and smaller than a recordset. A recordset, in fact, is rather bloated because it can perform a number of different tasks. Compared to ADO Recordsets, a DataReader has no housekeeping code and no code other than the one needed to outfit its functions.

The possibility of managing multiple tables as a whole and relating them to each other is new in ADO.NET. You can persist and transmit any DataSet object as XML without any extra cost because a DataSet object is natively constructed in XML. So, you don't need to translate any part of a DataSet object to obtain an XML stream unless you want to change the underlying schema.

Highlights of ADO.NET Objects

ADO and ADO.NET have two distinct object models: ADO is for server platforms based on Windows 2000 and Windows NT; ADO.NET is for .NET-enabled platforms. Microsoft expects to ship the first .NET OS, Windows XP (formerly code-named Whistler), by the end of 2001. However, its successor (code-named Blackcomb) is more likely to provide a full-fledged .NET OS.

For migration, note that you can import your existing ADO code in .NET applications, thus saving your code investments. However, the same ADO code can hardly be ported to ADO.NET without a significant redesign. The object models you find in ADO and ADO.NET are different and follow the guidelines of different design centers.

ADO.NET is only for building Web systems based on a .NET-enabled server. ADO.NET is the data-access API of .NET applications. So, you should consider ADO.NET only if you plan to upgrade your servers to .NET. Having ADO and ADO.NET cooperate in the same application doesn't make sense. Although you can use both, at least in terms of design, it's not a good idea.

ADO.NET has several primary objects: DataSet, DataTable, DataColumn, DataRow, and DataRelation. Their main features are as follows.

DataSet. This object is a collection object that can contain any number of data tables plus all the tables' constraints, indexes, and relations. All this information is in XML, and you can process, traverse, and search any or all of the data. Figure 1 illustrates a typical DataSet object's schema; in this case, the DataSet object contains two tables, one from SQL Server and one from an Oracle database. The two tables connect through a relation that associates a group of source-table rows with a group of target-table rows (e.g., a master-to-detail relationship). In addition, an XML table relates to the Oracle table in a one-to-one (1:1) model.

DataTable. This object represents all the tables you might find in a DataSet object, as Figure 2 shows. You use the Tables property to access a collection of DataTables. Likewise, the DataSet's Relations property accesses the collection of all the established dataset relations. The Xml property shows the object's native XML representation. A .NET application can load the Xml string to rebuild the dataset.

Within the ADO.NET hierarchy, the DataTable object most closely maps to the ADO Recordset object. You can create and use tables inside or outside a dataset, depending on your specific goals. You can run commands manually—you must define the table's schema first—or against managed data providers to create and fill the tables (for more information about managed data providers, see the sidebar "Managed Providers").

DataColumn. A table schema contains column-specific information, including name, type, and attributes. You just create a new DataColumn object, specify the data you need, and add the column to the table:

  Dim dc As DataColumn 
dc = New DataColumn()
dc.DataType = System.Type.GetType("System.String")
dc.ColumnName = "NameOfTheColumn"

The column list is available at any time through the DataTable's Columns collection.

DataRow. To fill a table, you can use commands' automatic data-binding features or add rows manually by creating and inserting a DataRow object into the table. Then, you populate the rows' fields with data. You can navigate the DataTable's elements through the Rows collection, implement a sequential navigation model by using Rows, or jump to individual records through search or direct positioning.

Highlights of ADO.NET Objects

ADO.NET has several primary objects: DataSet, DataTable, DataColumn, DataRow, and DataRelation. Their main features are as follows.

  Dim dc As DataColumn 
dc = New DataColumn()
dc.DataType = System.Type.GetType("System.String")
dc.ColumnName = "NameOfTheColumn"

The column list is available at any time through the DataTable's Columns collection.

Bugs, comments, suggestions | Legal | Privacy | Advertising

ADO Limitations in the Programmable Web

ADO.NET DataSets and ADO Recordsets

Highlights of ADO.NET Objects

Highlights of ADO.NET Objects

Additional resources