Integration of Windows-based Client/Server Systems with Legacy Hosts in the Enterprise

By Bill Jacobi, EBC Product Manager, Microsoft Corporation

The purpose of this white paper is to discuss the various approaches that companies have taken to integrate Microsoft Windows®-based client/server systems with legacy systems. For the purposes of this paper, the focus is on UNIX® and IBM® mainframes as the most typical legacy systems with which people need to interoperate. This isn't to say that customers don't have questions about integrating Windows with Wang®, UNISYS®, Honeywell and other legacy systems, but the focus here is on the "middle majority" of systems that customers have brought to Microsoft's attention for integration help and expertise.

The organization of this paper is designed around a set of scenarios that describe the architectural approaches used with different types of integration. The purpose is to show several approaches that developers and enterprise customers have used successfully in the past. These approaches, in turn, become the basis for the second part of the paper, which compares and contrasts the relative benefits of each approach for legacy integration. While the particular data and systems being integrated will drive the particulars of any solution, one can make some general statements about each approach.

There are several white papers from Microsoft that describe how our tools are used for client/server development1. The focus of this paper is how systems that have been built with these tools can interoperate with legacy systems.

On This Page

Why Interoperate?
Scenarios
Comparison of Approaches
Summary

Why Interoperate?

The answer to this question is twofold: (1) real business value remains in existing production systems, and the value of Microsoft software increases when these legacy systems are tied to standard desktops and standard servers; and (2) many legacy systems provide functionality that, for a variety of reasons, is not provided by Microsoft products but is nonetheless important and critical to customers.

Business Value

In some cases the data, and in other cases the processing, on legacy systems provides significant business value. It would be hard to overestimate the value of the reservation data contained in the computers used by Hyatt hotels and resorts. Without reservation information, this business simply couldn't function. IBM mainframe systems allow Hyatt customers to call from anywhere in the world and make reservations for Hyatt hotels anywhere in the world. Clearly, this data has business value to Hyatt. Another example concerns Boeing Computer Corporation. There are over 1 million parts on a commercial airliner. Over time, each of these unique parts is upgraded and enhanced. Boeing needs to track these changes to maintain and guarantee the integrity of its aircraft and does so with a host-based RDBMS (Relational Database Management System). These are just two examples of the business value of data stored in legacy systems. Payroll processing is an example in which both the personnel data and the processing that occurs on host systems are critical business functions.

On a daily basis, Microsoft conducts briefings with our largest customers and hears about the value of the data and business processes supplied by legacy systems. Clearly, one way for Microsoft to add value for our customers is to show how knowledgeable deployment of Microsoft's commodity tools and suites can make accessing legacy systems easier, less expensive, more reliable, and faster.

High-End Functionality

Microsoft recognizes the unique benefits of many legacy systems. For example, it is clear from customer testimonials that IBM's SNA (Systems Network Architecture) provides a highly reliable wide-area communications infrastructure. While a cluster of Windows NT®-based servers could conceivably provide the total MIPS2 used in an airline reservations system mainframe, the SNA communications infrastructure and its strengths have yet to be fully duplicated in the LAN/WAN world.

Another example of high-end functionality concerns supercomputers. Supercomputers that simulate the flow of particles over an airplane wing do not have equivalents in the PC space. While the attraction of PCs is the promise that more and more hard problems can be solved inexpensively, today's legacy systems provide functionality that solves real customer problems. As client-server alternatives to legacy mainframe and UNIX-based systems become cheaper, it becomes both a technical and a business decision whether to continue development on the legacy system, reengineer to a client-server solution, or, as this paper discusses, integrate client-server systems with legacy systems. The goal is to take advantage of the best of each technology and to leverage code that currently serves a business purpose.

In some cases, interoperability as a strategy may not make sense. When companies are spending unreasonable amounts of money maintaining legacy systems, and when the value of those systems does not equal their ongoing costs, it may make sense to downsize and reengineer the application to a more cost-effective system. Today, client-server methodology, tools, and techniques are generally considered mature enough to be viable alternatives to host-based computing when the business benefits of host systems don't justify their costs. However, completely reengineering a working business application may represent a significant risk. Thus, even if the best technology for designing an application today might be client-server, there may still be substantial benefits to integrating legacy applications with client-server systems rather than starting from scratch. This paper examines the different ways in which one can approach interoperating and combining the value of legacy systems with Windows-based systems.

Scenarios

The following four scenarios describe several approaches that an account may take to integrate Windows-based solutions with legacy systems. The four approaches are:

  1. Enhanced Terminal Emulation Techniques

  2. Internet Client Legacy Integration

  3. Middle Tier Server Architectures, and

  4. Host API (Application Programming Interface) Support

Enhanced Terminal Emulation techniques focus on enhancing the basic terminal emulation of Windows®-based clients that communicate with hosts. By putting scripts on the client that simulate user action, client systems can automate repetitive, slow, and cumbersome client-host dialogs. Perhaps more significantly, when Windows-based front-ends solicit the user for information, users often report two benefits: (1) legacy access appears more "user-friendly", resulting in increased productivity, and (2) applications perform faster (or appear to perform faster) since input is requested and validated locally3. User-supplied information to a Windows-based front-end program, in turn, can be formatted and sent to the host in the format the host expects. Terminal emulation techniques can be enhanced further with a technology called "screen-scraping" on Windows front-ends. This technique and other enhancements to terminal emulation are explored further below.

Internet Client Legacy Integration takes advantage of the fact that both mainframe and UNIX systems increasingly support Web browsers and Web protocols for system access. Since Web browsers and Web protocols4 have become common in the computing industry, and since browsers typically are "lightweight" in their use of system resources5, browser-based applications offer the potential to decrease the cost of the client that accesses legacy systems and to increase the sheer number of systems that are enabled for legacy access. Below, this paper talks about the different capabilities of Internet clients and what can be accomplished in Web browsers versus client applications that use Internet protocols to communicate with legacy hosts.

Middle Tier Server Architectures are based on the premise that costs can be reduced by consolidating the communications lines and client configurations that let each client individually communicate with a host. By injecting a "middle tier server" between the clients and the host, special purpose emulation boards and terminals can be eliminated on the clients, and the middle tier server can take advantage of higher speed, more robust connections to the legacy system on behalf of its clients. Thus, the savings from eliminating direct communication by each client are believed to outweigh the cost of adding a server. This will be explored further below.

Host API (Application Programming Interface) Support takes advantage of the fact that GUI-based6 PC clients can communicate with the host in intelligent ways rather than simply via terminal protocols. Microsoft and the industry have developed several special purpose protocols for interacting with databases, transaction processing monitors, and messaging systems. This section of the paper looks at the relative benefits of creating Windows®-based clients (or middle tier servers) that communicate with the host using these special purpose protocols. The goal of this approach is to eliminate the overhead inherent in supporting a general set of terminal functions and to increase efficiency by making specific calls to a database or transaction processing system, such as CICS. By eliminating the need for terminal protocols, the client application owns 100% of the user interface, which plays to the strength of PCs as interactive devices. This is the last scenario discussed below.

These approaches aren't mutually exclusive. For example, Internet Clients can use Host APIs to access functionality. There are many other "combination approaches" of these and other techniques. However, by understanding the pros and cons of each pure approach, one may intelligently extrapolate the pros and cons of combined approaches.

Enhanced Terminal Emulation Techniques

In the simplest case, Windows®-based clients are tied to UNIX/mainframe backends through client support of terminal emulation. For example, there are a variety of products that allow a Windows®-based client to emulate a 3270 terminal7, the native terminal an IBM S/370 system expects. Similarly, there are a variety of products that allow a Windows®-based client to emulate VT100, ANSI, or X-Windows terminals8, the native terminal types for UNIX systems. Vendors who offer products in these categories typically compete on the level of compatibility and capability that they provide on the client. For example, 3270 emulators differ on whether they emulate a text-only or a graphics terminal, and on the types of terminal mode that they support. Similarly, VT100 and higher VT model emulators provide increasing sophistication of terminal support along the text-graphics continuum.

Terminal emulation, the most basic form of integration, requires that 100% of the legacy program run on the backend machine, with the client only emulating terminal protocols.

Terminal emulation techniques have evolved to support more sophisticated capabilities, typically by branching in one of three directions: adding client support for (1) protocol-specific script languages, (2) programmatic screen reading and writing, known as "screen-scraping", or (3) higher level object and API interfaces.

Script Languages

The purpose behind script languages is to add a programming capability to the client which can drive the emulation interface. These languages typically send a string, wait for a response, and can branch depending on the words returned to the screen. As is common among programming languages, these script languages support variables and limited abilities to detect the environment. For example, they may be able to read variables that have been initialized through operating system SET commands. In the IBM world, one can use languages such as HLLAPI (High Level Language Application Programming Interface) or EHLLAPI (Extended HLLAPI), which are slightly more sophisticated tools than simple keystroke stackers. Generally, though, HLLAPI and EHLLAPI are considered rudimentary and are passed over in favor of the tools below.

REXX, a sophisticated script language for IBM environments, and shell scripts (sh, csh, bash, ksh, PERL)9, which are system-wide script languages for UNIX systems, offer more power than protocol-specific script languages like HLLAPI because they were originally designed as general purpose languages for system administration. As such, these languages natively support many more capabilities. Client/server systems have developed out of these languages once vendors ported the host version of the language to the client. For example, REXX clients on the Windows operating system can pass commands to the host over the terminal protocol and, in turn, receive the return codes from host commands.

A key element of these script languages is file transfer. For example, a REXX program can issue a special copy command to request that a file be downloaded from the mainframe to the user's hard drive. To support file transfer, UNIX systems typically require the user to run PC-NFS, which makes the UNIX file system appear to the user as either another local drive or a subdirectory of the user's C drive. Thus file transfer to/from the UNIX host appears like local movement of files between directories. There are many third-party UNIX shell implementations that run on Windows®-based clients10. Rather than simply reading return codes from a host-executed command, some client-side UNIX shell scripts include a syntax to allow client scripts to "pipe" information directly to host scripts and vice-versa11.

Screen-Scraping

Screen-scraping is the term for programmatically sending data to (and reading data from) specific locations on the screen in a session with a UNIX or IBM host. The advantage of screen-scraping is that it allows clients to interact with host programs which simply expect data to be filled in at field locations on the screen. Screen-scraping is usually a capability that is added to some other scripting language, like the Microsoft Visual Basic programming system. Thus, a client program, through Visual Basic, can create its own user interface, and, through screen-scraping, present host data to the user in an entirely different way than was originally conceived by the host programmer. Screen-scraping solves another problem as well: it allows interaction with host programs that don't generate return codes. REXX and many other script languages are hard-coded to look in specific locations for return codes to commands that were issued. But full-screen interactive programs, like many COBOL applications, don't generate return codes at all. Many third party companies that support screen-scraping for 3270 environments have versions for UNIX as well12.
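To make the mechanics concrete, the following C sketch automates a host dialog through the single entry point defined by the (E)HLLAPI convention described above. The function codes, the "@E" Enter-key mnemonic, and the dual use of the return-code parameter as a presentation-space position follow IBM's EHLLAPI documentation, but the exported function name, header, and linkage vary by emulator vendor, so treat those details as assumptions; the program must be linked against the vendor's HLLAPI library.

    /* Minimal screen-scraping sketch in C using the single entry point
       of the (E)HLLAPI convention. The exported name, header, and
       linkage vary by emulator vendor; the function codes and the "@E"
       Enter mnemonic below follow IBM's EHLLAPI documentation and are
       assumptions here. Link against the emulator's HLLAPI library. */
    #include <stdio.h>
    #include <string.h>
    #include <windows.h>

    extern void WINAPI hllapi(LPWORD func, LPSTR data, LPWORD len, LPWORD rc);

    #define HA_CONNECT_PS      1   /* Connect Presentation Space        */
    #define HA_DISCONNECT_PS   2   /* Disconnect Presentation Space     */
    #define HA_SENDKEY         3   /* Send Key: type into the screen    */
    #define HA_COPY_PS_TO_STR  8   /* Copy Presentation Space to String */

    int main(void)
    {
        WORD func, len, rc;
        char buf[81];

        /* Attach to host session "A". */
        func = HA_CONNECT_PS; len = 1; rc = 0;
        strcpy(buf, "A");
        hllapi(&func, buf, &len, &rc);
        if (rc != 0) { fprintf(stderr, "connect failed: %u\n", rc); return 1; }

        /* Type an account number into the current field; "@E" is the
           conventional EHLLAPI mnemonic for the Enter key. */
        func = HA_SENDKEY;
        strcpy(buf, "1234567@E");
        len = (WORD)strlen(buf); rc = 0;
        hllapi(&func, buf, &len, &rc);

        /* Read 80 bytes starting at presentation-space position 241
           (row 4, column 1 of an 80-column screen); for this function
           the rc parameter carries the position on input. */
        func = HA_COPY_PS_TO_STR; len = 80; rc = 241;
        hllapi(&func, buf, &len, &rc);
        buf[80] = '\0';
        printf("Row 4: %s\n", buf);

        func = HA_DISCONNECT_PS; len = 0; rc = 0;
        hllapi(&func, buf, &len, &rc);
        return 0;
    }

A front-end built this way reads and writes screen positions rather than return codes, which is exactly what makes it work against interactive host programs.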

Object and API Interfaces

Terminal emulator products typically come with a proprietary interface allowing a C programmer access to buffers, command arguments, board settings, and the like. These interfaces have rarely been used by application programmers because they are typically vendor-specific, arcane, and designed for developers of systems software. On the other hand, an increasing number of tools and developer products come with object-oriented interfaces that let the client invoke high-level services to reach legacy data. Typically these products abstract the underlying emulation cards and network protocols so that the developer isn't concerned with the specifics of the connection hardware.

For example, Attachmate provides a set of ODBC drivers that connect to IBM DB2® and SQL/DS™ and are not tied to specific 3270 emulation card products. Attachmate also sells a set of Visual Basic and OLE custom controls that provide high level access to legacy machines, supplying functions such as login, file transfer, and wait-for-text. Another company, Vmark Software, has created a product called HyperStar Object Messaging Middleware, which implements ODBC13 and high level services on top of many vendors' 3270 emulation cards and LAN protocols. Intersolv makes a product, Virtual Data Warehouse, which ships with a variety of middleware drivers (called SequeLink drivers) that support multiple types of connectivity to backend databases. In turn, Virtual Data Warehouse's development tools and end-user products operate similarly across environments. Wall Data ships a product called Rumba Objects, which provides OLE controls for creating 3270 or UNIX terminal windows, file transfer, application pasting, and printing. These objects come in separate versions for S/370 mainframes, AS/400s®, and UNIX systems.

There is a large variety of tools which provide varying degrees of object-orientation for legacy data access. The tools include products such as Smalltalk environments, code generators, report creators, visual design toolsets, team development tools, and "business object" creators. These object-oriented interfaces support almost exclusively the Microsoft VBX and OCX conventions, object extensions based on Visual Basic or OLE technology. Many of these products provide a call-level API to achieve similar functionality for C and C++ programmers. The best sources of information about these products are typically programmer catalogs like Fawcette's Component Objects and Companion Products14 and Hotlinx' Putting Client/Server to Work15.

Open Database Connectivity (ODBC) is one of the best examples of an API that has been layered over communications protocols to create links to legacy databases. ODBC represents an abstraction layer in which Windows®-based clients can request data, query meta-data properties (such as whether indexes exist), and issue data definition language (DDL) updates to a database. Many products, such as Starware Inc.'s StarSQL, provide extremely high performance legacy integration to Windows®-based clients through the use of ODBC and SNA Server. On the client, ODBC calls are efficiently translated into IBM's Distributed Relational Database Architecture (DRDA) formatted requests for remote DB2 data.
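As an illustration of the abstraction ODBC provides, the following is a minimal ODBC 2.x-style C sketch that connects to a legacy data source and fetches one column of a result set. The data source name "DB2PROD", the credentials, and the table are placeholders; a driver such as StarSQL, layered over SNA Server, would service these same calls by generating DRDA-formatted requests.

    /* Minimal ODBC 2.x-style sketch: connect to a legacy data source
       and fetch one column. "DB2PROD", the credentials, and the table
       are placeholders. The driver behind the data source name hides
       the transport, which may be DRDA over SNA Server. */
    #include <stdio.h>
    #include <windows.h>
    #include <sql.h>
    #include <sqlext.h>

    int main(void)
    {
        HENV henv; HDBC hdbc; HSTMT hstmt;
        RETCODE rc;
        char name[64];
        SDWORD cb;

        SQLAllocEnv(&henv);
        SQLAllocConnect(henv, &hdbc);
        rc = SQLConnect(hdbc, (UCHAR *)"DB2PROD", SQL_NTS,
                        (UCHAR *)"user", SQL_NTS,
                        (UCHAR *)"pass", SQL_NTS);
        if (rc != SQL_SUCCESS && rc != SQL_SUCCESS_WITH_INFO) {
            fprintf(stderr, "connect failed\n");
            return 1;
        }

        SQLAllocStmt(hdbc, &hstmt);
        SQLExecDirect(hstmt,
            (UCHAR *)"SELECT CUSTNAME FROM ORDERS WHERE REGION = 'WEST'",
            SQL_NTS);

        /* Walk the result set one row at a time. */
        while (SQLFetch(hstmt) == SQL_SUCCESS) {
            SQLGetData(hstmt, 1, SQL_C_CHAR, name, sizeof(name), &cb);
            printf("%s\n", name);
        }

        SQLFreeStmt(hstmt, SQL_DROP);
        SQLDisconnect(hdbc);
        SQLFreeConnect(hdbc);
        SQLFreeEnv(henv);
        return 0;
    }

Note that nothing in this code names SNA, DRDA, or an emulation card; pointing the data source name at a different driver retargets the same program.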

Internet Client Legacy Integration

Internet client support means running one of the Windows browsers or, increasingly, an application (such as a word processor or desktop database) that supports HTML rendering and HTTP linking. Because browsers represent a display and navigational interface over TCP/IP, many legacy UNIX and mainframe systems can support browsers by supporting TCP/IP and by generating HTML text. TCP/IP has always been the native communication protocol for UNIX systems, and IBM has supported TCP/IP on both MVS and VM for many years now. Only since the first quarter of 1996 has IBM supported HTTP server functions, making the mainframe a Web server. The key technology component for supporting legacy business applications via a browser is not the Web server function, which is simply a matter of listening for a particular type of TCP/IP conversation and returning HTML text over that link. The important piece of technology is the application that converts legacy data into HTML as content for the Web server.

Legacy system access in an Internet client can mean many things. I'll talk about three types of access: (1) Microsoft's Internet Database Connector and SQL Server Web Assistant, (2) ActiveX Controls and Java™ Programs, and (3) CGI and ISAPI technologies.

Microsoft Internet Database Connector and SQL Server Web Assistant

While it is certainly straightforward for a server to maintain many static HTML pages, only recently have Web servers provided access to legacy databases from a browser. Microsoft's Internet Information Server (IIS) product, a Web server that ships with Windows NT Server or Workstation, supports connections to legacy data through a component called the Internet Database Connector (IDC). The IDC supports connections to any backend database by letting the user select the desired database from a list16. This is a list of ODBC databases created by the installation of ODBC drivers on the user's PC. Microsoft's SNA Server product provides ODBC drivers for legacy systems, such as IBM DB2. Using a browser, the user accesses and fills in an HTML form, whose contents are converted into an SQL query by IIS. IIS, in turn, uses SNA Server to access the legacy system database.17

Prior to the introduction of the IDC in Windows NT Server, developers of Web servers were required to write CGI scripts and Visual Basic or C programs to access databases. This process worked as follows: the user filled out an HTML form and clicked a Submit button. The server would write the data to an INI file that resided on the server. Next, the server would execute a Visual Basic (or C) program18, passing in the name of the INI file which had just been created containing the query parameters. The Visual Basic (or C) application would then explicitly open the database, execute the query, close the database, and return the HTML result set to the server, which, in turn, passed it to the client. In other words, the developer had to inject a full-fledged data access program on the server for any data access. This has been viewed as relatively inefficient, and has been the technical basis for Microsoft's development of a server-side technology called ISAPI (Internet Server API).
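For comparison, the skeleton below shows what the CGI side of this older approach looks like in C: the Web server launches the program for each request, hands it the form data (here via the QUERY_STRING environment variable of a GET request rather than an INI file), and relays whatever the program writes to standard output back to the browser. The database call itself is elided; the per-request process launch is the inefficiency that IDC and ISAPI were designed to remove.

    /* Skeleton of a CGI data-access program in C. The Web server sets
       QUERY_STRING from the submitted form and relays whatever this
       program writes to stdout back to the browser. The database call
       is elided; in practice it would be ODBC or a vendor library. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        const char *query = getenv("QUERY_STRING");

        /* The HTTP header must precede the body, separated by a blank line. */
        printf("Content-Type: text/html\r\n\r\n");

        printf("<HTML><HEAD><TITLE>Query Result</TITLE></HEAD><BODY>\n");
        if (query == NULL || *query == '\0') {
            printf("<P>No query parameters received.</P>\n");
        } else {
            /* A real program would parse the name=value pairs here, open
               the database, run the query, and emit rows as an HTML table. */
            printf("<P>Received: %s</P>\n", query);
        }
        printf("</BODY></HTML>\n");
        return 0;
    }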

The SQL Server Web Assistant is a step-by-step "wizard" that prompts the SQL Server administrator to define a set of data (via a query or stored procedure) whose results will be published on the Windows NT Web server (IIS). The administrator can choose whether the data should be republished every time it changes, or at scheduled intervals. Microsoft's license agreement for the SQL Server Enterprise Edition allows an unlimited number of users to access the data.

ActiveX Controls and Java Programs

ActiveX controls refer to both server- and client-side objects that act like pre-created building blocks of code. These "building blocks" can be assembled without prior knowledge of the preceding or subsequent ActiveX objects or applications. On the server side, the ActiveX Server Framework (based on ISAPI, as mentioned above) is a set of extensions used by Windows NT Web servers to provide "plug-in", modular support for technologies such as encryption, compression, authentication, logging, and transactions. ISAPI is really just a set of programming interfaces for providing common functions needed by Web servers.

On the client side, ActiveX controls are similar building block objects that can be inserted into COM-aware client applications. ActiveX objects provide capabilities such as document viewers, data-aware grids, multimedia players, and other independent bodies of code that discretely add functions to an application. Because ActiveX controls represent a technical evolution of the OLE custom control specification, ActiveX controls can, in fact, be anything that previously existed as an OLE custom control (sometimes known by its file extension, OCX) or a Visual Basic custom control (likewise known as a VBX). The development tools market supports a rich set of such data-aware and terminal emulation controls. These controls, in turn, typically make calls to the ODBC, 3270, or NFS interfaces discussed previously. Because of the existence of a variety of ODBC drivers for legacy systems, many companies promote their ActiveX controls as ways to facilitate legacy system integration.

Java™ is an object-oriented programming language designed for creating "applets" and full-fledged applications for the Internet. Microsoft is supporting both fully compiled Java™ applications and JavaScript™, an embedded set of script commands that can be run from a Web page. (JavaScript™ and Java™ do not actually share a common language; the similarity in names was created by the marketing department at Netscape.) Microsoft is extending Java™ to both activate and create ActiveX objects, providing a rich object layer and an industry of pre-written controls for Java™ developers. Jakarta, the Microsoft code name for its Java™ development environment, will support the same high end development and team features found in Visual C++® Professional Edition. Currently, Sun Microsystems is evolving Java™'s data access capabilities, and Java™ is expected sometime in 1996 to pick up an "ODBC bridge" (sometimes called JDBC) through which Java™ applications will have access to legacy data. Microsoft is writing the reference implementation of Java™ for Windows and thus is working closely with Sun Microsystems on Java™'s latest capabilities.

CGI and ISAPI Technologies

The Common Gateway Interface (CGI) provides a means through which a Web server can invoke a Visual Basic or C program. With the advent of Microsoft's IDC described above and similar technology from other vendors, CGI is being used less and less frequently for data access in favor of more "automated" solutions. Where CGI is used for data access, including legacy data access, it is used in much the same way a developer would use C or C++. That is, CGI doesn't have any specific database-aware capabilities that help the client-server programmer; rather, the programmer programs directly to the libraries that come with the emulator or network product. Historically, CGI was the first way for Web applications to reach out to legacy systems.

ISAPI is an API for extending the server functionality of Internet Information Server. Besides the common functions mentioned above (compression, authentication, translation, etc.), one can use ISAPI to create server extensions that act like native system services. For example, a Microsoft consultant wrote an ISAPI filter that looks in a particular directory for Word files. Whenever it finds them, his ISAPI program (known as a filter) automatically converts them to HTML and publishes them on a Web page. A development shop that wanted to tightly integrate Web applications from the Web server to a legacy system would write ISAPI applications that interact with the host either over terminal emulation or via SNA Server. SNA Server, in turn, provides programming libraries that automate many common tasks in legacy access (such as logon support).
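The following is a minimal sketch of an ISAPI extension DLL in C. The two exported entry points, GetExtensionVersion and HttpExtensionProc, are the ones the server requires; the body here simply writes a canned HTML response, where a real legacy-access extension would instead call SNA Server libraries or an emulation API to fetch the data. Because the DLL runs inside the server's process, there is no per-request process launch as with CGI.

    /* Minimal ISAPI extension sketch. IIS loads this DLL into its own
       process and calls HttpExtensionProc for each request, avoiding
       the per-request process launch of CGI. A real legacy-access
       extension would call SNA Server or emulator libraries here. */
    #include <windows.h>
    #include <httpext.h>

    BOOL WINAPI GetExtensionVersion(HSE_VERSION_INFO *pVer)
    {
        pVer->dwExtensionVersion = MAKELONG(HSE_VERSION_MINOR,
                                            HSE_VERSION_MAJOR);
        lstrcpyn(pVer->lpszExtensionDesc, "Legacy access sketch",
                 HSE_MAX_EXT_DLL_NAME_LEN);
        return TRUE;
    }

    DWORD WINAPI HttpExtensionProc(EXTENSION_CONTROL_BLOCK *pECB)
    {
        static const char body[] =
            "<HTML><BODY><P>Result from the legacy system would "
            "appear here.</P></BODY></HTML>";
        DWORD len = sizeof(body) - 1;

        /* Send the status line and headers, then write the HTML body
           back through the server's connection to the browser. */
        pECB->ServerSupportFunction(pECB->ConnID,
                                    HSE_REQ_SEND_RESPONSE_HEADER,
                                    "200 OK", NULL,
                                    (LPDWORD)"Content-Type: text/html\r\n\r\n");
        pECB->WriteClient(pECB->ConnID, (LPVOID)body, &len, 0);
        return HSE_STATUS_SUCCESS;
    }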

Middle-Tier Server Support

An ISAPI application as described above represents a "middle-tier" server solution. Increasingly, developers are recognizing the benefits of partitioning applications into multiple tiers. The insight driving this evolution is that program maintenance is made more difficult by the convoluted, interrelated nature of so-called monolithic code. In traditional host-based code, the application logic, data access, screen design, and data validation all occur in the same stream of code. Independent "plug-in" components, by contrast, modularize the code and reduce maintenance. If programs are modular (meaning they provide discrete functionality using agreed-upon, standard interfaces), then maintenance to one module does not affect the remaining body of code. The drive towards modularity and clear delineation of tiers supporting business rules, data validation, screen design, and so on is leading towards a component-based computing model, in which self-contained components perform their functions without external dependencies. Ultimately, when the industry has brought about a very rich set of components, writing software may in some cases more closely resemble building structures from different types of Lego blocks than today's skilled use of language semantics and logic.

Scenario 1, in which each Windows®-based client accesses legacy data through some form of terminal emulation, is an expensive architecture from the perspective of host resources. The legacy system has to manage all of the connections to the system, even currently idle connections. The legacy system is responsible for all of the screen painting of each client. Further increasing cost, terminal support occurs via polling of terminal control units, and polling architectures are notoriously inefficient.

Let's consider a "downsized" two-tier example as a prelude to a discussion of the efficiencies of a three-tier model. In the two-tier case, consider Microsoft Access clients that access data from SQL Server. A typical occurrence in database applications is the need to add new queries that end-users or managers can run. In the two-tier model, adding new queries means updating the software that is running on every PC client, because this is where the queries are stored. This is not necessarily an advancement over the host model: it is expensive for an organization to write, test, and deploy software to multiple clients.

A recognition of the deficiencies of the two-tier and host-based models has led to the development of applications in which business rules (such as the list of predefined queries) reside in a separate location, independent of both the client and the server. In Visual Basic, these business rules are typically created as OLE Servers, which are small programs (not unlike DLLs) that run on the server in a separate process from the database system. The client calls the OLE Server to access business rules. In this particular case, the OLE Server may return a list of all precreated stored procedures for a database. When a new user-defined or IS-defined stored procedure is created, the user automatically sees the new predefined query because the OLE Server always returns the current list of stored procedures, as the sketch below illustrates. The beauty of abstracting what the user sees through an intermediate "business rules" tier is that none of the client code needs to be changed, yet the client can be fundamentally extended.
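The following sketch illustrates the business-rules idea in C rather than Visual Basic: instead of hard-coding queries into every client, a middle-tier component can enumerate the stored procedures available on the server at run time using ODBC's SQLProcedures catalog call. The data source name "SALESDB" and the credentials are placeholders; the paper's Visual Basic OLE Server would wrap this same logic behind a method that clients call.

    /* Sketch of the "business rules" idea above: rather than hard-code
       queries into every client, a middle-tier component enumerates the
       stored procedures available on the server at run time. Shown here
       with ODBC's SQLProcedures catalog call in C; the paper's example
       would wrap the same logic in a Visual Basic OLE Server. "SALESDB"
       is a placeholder data source name. */
    #include <stdio.h>
    #include <windows.h>
    #include <sql.h>
    #include <sqlext.h>

    int main(void)
    {
        HENV henv; HDBC hdbc; HSTMT hstmt;
        char procname[129];
        SDWORD cb;

        SQLAllocEnv(&henv);
        SQLAllocConnect(henv, &hdbc);
        SQLConnect(hdbc, (UCHAR *)"SALESDB", SQL_NTS,
                   (UCHAR *)"user", SQL_NTS, (UCHAR *)"pass", SQL_NTS);
        SQLAllocStmt(hdbc, &hstmt);

        /* Ask the catalog for all stored procedures. A newly created
           IS-defined procedure shows up here automatically, with no
           change to client code. */
        SQLProcedures(hstmt, NULL, 0, NULL, 0, (UCHAR *)"%", SQL_NTS);

        while (SQLFetch(hstmt) == SQL_SUCCESS) {
            /* Column 3 of the SQLProcedures result set is PROCEDURE_NAME. */
            SQLGetData(hstmt, 3, SQL_C_CHAR, procname, sizeof(procname), &cb);
            printf("%s\n", procname);
        }

        SQLFreeStmt(hstmt, SQL_DROP);
        SQLDisconnect(hdbc);
        SQLFreeConnect(hdbc);
        SQLFreeEnv(henv);
        return 0;
    }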

The downside to the flexibility and decreased maintenance of the three-tier model is that a three-tier system initially takes more effort to develop. Programmers who are good at writing Visual Basic, Microsoft Access, or C code may not immediately want to learn how to write OLE servers, though OLE servers can be created very simply in languages like Visual Basic without any external calls to Windows itself. Nonetheless, learning any new skill represents competition for the developer's limited time.

To access legacy data, Microsoft SNA Server acts as a middle tier. Windows®-based clients that communicate over TCP/IP to SNA Server share a small set of SNA connections to the mainframe. Transaction processing monitors similarly manage the creation and destruction of sessions to UNIX and mainframe databases. Many mainframe shops have recognized the benefits of issuing CICS transactions without requiring clients to use a 3270 emulation architecture. This has led to products like UniKix Technologies' UniKix 4.1, which provides the ability to either port or run CICS applications on UNIX machines, or to use UNIX machines as a central broker of CICS transactions for PC clients. IBM has ported CICS (client and server APIs) to the Macintosh®, Windows NT, Windows 3.x, OS/2®, and AIX, and is currently adding a set of C++ class libraries to these platforms to make CICS more readily accessible to desktop applications.
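To give a flavor of what a programmatic (non-3270) CICS invocation looks like, here is a sketch modeled on the External Call Interface (ECI) of the CICS clients mentioned above. The header name, structure fields, and constants are assumptions drawn from IBM's CICS Client documentation and vary by product version and platform; the program name "ACCTPGM" and the COMMAREA layout are placeholders.

    /* Sketch of invoking a CICS transaction program through the
       External Call Interface (ECI) of the CICS clients mentioned
       above. The header name, field names, and constants below are
       assumptions based on IBM's CICS Client documentation and vary
       by product version and platform. */
    #include <stdio.h>
    #include <string.h>
    #include <cics_eci.h>   /* assumed header name */

    int main(void)
    {
        ECI_PARMS eci;
        char commarea[100];

        memset(&eci, 0, sizeof(eci));
        memset(commarea, 0, sizeof(commarea));
        strcpy(commarea, "ACCT=1234567");         /* input to the program */

        eci.eci_version         = ECI_VERSION_1;
        eci.eci_call_type       = ECI_SYNC;       /* block until it returns */
        strcpy(eci.eci_program_name, "ACCTPGM");  /* placeholder program    */
        eci.eci_commarea        = commarea;
        eci.eci_commarea_length = sizeof(commarea);
        eci.eci_extend_mode     = ECI_NO_EXTEND;  /* one logical unit of work */

        if (CICS_ExternalCall(&eci) != ECI_NO_ERROR) {
            fprintf(stderr, "ECI call failed\n");
            return 1;
        }
        /* The host program updates the COMMAREA in place. */
        printf("Returned: %s\n", commarea);
        return 0;
    }

Note that no screens are painted and no terminal session exists; the client exchanges only the transaction's input and output area with the host.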

In the last quarter of 1995, Microsoft endorsed the middle tier server approach with the purchase of Netwise Corporation and its TransAccess product line. TransAccess provides the ability for client programs written in Visual Basic, C, or COBOL to make calls to a Windows NT Server which, in turn, uses COM interfaces to create a wrapper for invoking CICS transactions. This approach minimizes host system resource requirements because the host has few connections and is only concerned with running transactions, while clients gain a very efficient network and programming interface to the legacy data. As the TransAccess product supports legacy systems through a middle tier, it is not unreasonable to expect that in the future clients will have direct OLE access to products like CICS. As described above, there are tradeoffs in creating two-, three-, and N-tiered solutions.

Host API Support

The final scenario, which some development shops may find appealing, is writing custom applications that make calls to the most efficient shared APIs on both client and server. For example, for new applications, IBM's Advanced Program-to-Program Communication (APPC) protocol is the fastest and least restrictive protocol for moving data between clients and servers, and IBM has provided APPC documentation and call libraries for Windows®-based clients. Similarly, a UNIX shop concerned with the highest possible performance between Windows®-based clients and a UNIX RDBMS may choose to write client programs in C that make calls to so-called raw sockets on the PC, defining the most efficient packet size for the UNIX host with which it is communicating. Sockets and APPC represent two similarly low-level interprocess communications mechanisms. Some development shops may be attracted to a suite of protocols like the Distributed Computing Environment (DCE), which specifies many more common services between client and server (directory, security, time, etc.). Although DCE services are low level, some companies see the beauty of DCE in the fact that it provides a large set of protocols that theoretically would run the same way in another vendor's DCE environment. For development shops and enterprise customers that are concerned either with the utmost performance or with the ability to control and tune the most specialized aspects of client-server, programming to common APIs may be the best way to go.
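As an illustration of this lowest-level style, here is a minimal Windows Sockets client in C that connects to a hypothetical service on a UNIX host and exchanges a single request and reply. The host name, port number, and wire format are placeholders; a production client would tune its buffer and packet sizes to the host application it talks to.

    /* Minimal Windows Sockets client: connect to a service on a UNIX
       host, send one request, read one reply. Host name, port number,
       and wire format are placeholders; a production client would tune
       buffer and packet sizes to the host. Link with wsock32.lib. */
    #include <stdio.h>
    #include <string.h>
    #include <winsock.h>

    int main(void)
    {
        WSADATA wsa;
        SOCKET s;
        struct sockaddr_in addr;
        struct hostent *he;
        char reply[512];
        int n;

        if (WSAStartup(MAKEWORD(1, 1), &wsa) != 0) return 1;

        he = gethostbyname("unixhost.example.com");   /* placeholder host */
        if (he == NULL) { WSACleanup(); return 1; }

        s = socket(AF_INET, SOCK_STREAM, 0);
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port   = htons(7001);                /* placeholder port */
        memcpy(&addr.sin_addr, he->h_addr, he->h_length);

        if (connect(s, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
            WSACleanup(); return 1;
        }

        /* A made-up one-line request protocol; the real format would be
           whatever the host application defines. */
        send(s, "GET ACCT 1234567\n", 17, 0);
        n = recv(s, reply, sizeof(reply) - 1, 0);
        if (n > 0) { reply[n] = '\0'; printf("%s\n", reply); }

        closesocket(s);
        WSACleanup();
        return 0;
    }

Everything above the socket calls, including the request format, framing, and error recovery, is the developer's responsibility, which is precisely the control (and the burden) this scenario offers.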

Comparison of Approaches

There are three key tradeoffs that managers in enterprise accounts must consider when choosing the best scenario for integrating Windows desktops with legacy systems: (1) system resource utilization, (2) developer effort, and (3) system flexibility.

While terminal emulation techniques are in many ways the easiest, and the refinements to terminal emulation can be made in a stepwise fashion, ultimately this approach places the heaviest burden on host resources, which must manage connections, processing, and client screen painting. Thus, terminal emulation may be a good approach if there are excess cycles on mainframes or if the retirement of some mainframe applications leaves an underutilized system. Screen-scraping represents a "low-tech" but very effective way to tie Windows®-based clients to the legacy system and doesn't place especially tough burdens on developers, who can use this technology from languages like Visual Basic. Ultimately, screen-scraping is not that flexible because the developer is limited to only the functions that currently exist on the legacy system. Additionally, if the legacy system requires many screens between the functions being automated on a Windows®-based client, the client has no choice but to step through each intermediate screen, since the terminal session is the only interaction the program has with the host. For these reasons, screen-scraping is usually viewed as an inexpensive, tactical method of legacy integration.

Internet Client Legacy Integration is the fastest moving technology of all those described in this paper. For this reason, it is the hardest area in which to make generalizations that will still be true in three months. Generally speaking, most browsers today are only capable of issuing HTTP requests and reading the corresponding HTML. This forms a very limited basis for legacy system integration, except where legacy systems use special purpose code to render legacy data as HTML. IBM and many other companies are writing mainframe applications with the capability of outputting HTML, which allows this scenario to work. In the near future, browsers will be able to host controls and full-fledged applications that directly tie the browser to the legacy system, either in emulation mode or using APIs such as ODBC. In this advanced-browser world, it's not clear that there is a large distinction between an application that runs in the OS and one that runs in the browser. The distinction will be blurred even further as Windows operating systems natively support browser functions.

From the point of view of system resource utilization, browser-based applications offload the display and terminal protocol requirements from host systems and, in that sense, are efficient. The developer effort in this scenario lies in rewriting legacy applications to create HTML, as the browser becomes a very light client. It is not necessarily straightforward to rewrite legacy applications and change their output to HTML; in fact, most large companies are taking resources away from legacy development, not increasing them. Even if host applications are rewritten to output data in HTML, this solution doesn't necessarily increase the flexibility of the system. Adding a new query to a host-based application that outputs HTML may still require writing host-based COBOL queries whose results are rendered in HTML. Thus, from the perspective of flexibility, it's not clear that making an application Internet-aware is automatically the right decision. On the other hand, if a corporation wants to make its application available to the public over the Internet, almost any customer can become a client via the browser. This advantage is not available with any of the other techniques.

Middle-tier (or N-tier) server architectures represent an interesting trend. From the perspective of system resources, middle and N-tier solutions allow for the most efficient partitioning of an application, so that the pieces run in the places that make the most sense. For example, putting zip code tables onto the client may save enormous amounts of network traffic if every screen of an application requires validating zip codes. Multiplied across a reasonable number of users, application partitioning shows its greatest benefits. The amount of developer effort required to create N-tier solutions has been debated in the computer industry. In the past, writing server objects was viewed as difficult enough (unless the server objects were simply stored procedures) that this approach was not widely adopted even though the benefits of application partitioning were well known. With a new set of GUI RAD tools, such as Visual Basic and PowerBuilder™, that can create OLE servers, this approach has been gaining momentum. Since component-based development offers significant advantages in modularity and code independence, this approach ranks very highly in terms of system flexibility.

A particularly attractive development in the middle tier server approach is the trend towards building more legacy integration directly into the middle tier. For example, SNA Server's middle tier support of IBM mainframes makes this a cost-effective solution. Similarly, the TransAccess product appears to make it quite straightforward to bring CICS to Windows®-based clients.

The Host API approach has both positive and negative elements. From the perspective of system resource utilization, this approach potentially incurs the least overhead. If the legacy system provides a client-side library for directly accessing host resources, this approach requires the least "infrastructure". This may be the case with database applications that run on the host and can take advantage of ODBC on the client. On the other hand, most corporate developers do not have the skill set to program to APIs at the level of IBM SNA and APPC. Similarly, even good C programmers have trouble writing network programs. Thus, the Host API approach is very dependent on which host APIs are being written to. Except specifically for database applications, this approach is probably not the best for the typical IT shop. Nonetheless, this approach provides the ultimate in system flexibility. By writing code for all seven layers of the OSI model, a development shop has complete control over every aspect of the software. This may be important in writing code that operates life-critical systems such as the Space Shuttle or weapons systems. Generally, however, the industry has been moving steadily towards providing increasing layers of abstraction, freeing the typical programmer from having to know system details.

Summary

This paper has discussed legacy integration through the use of four architectural scenarios. Each scenario characterizes a way to facilitate legacy integration, and each represents a set of business and technical tradeoffs that must be carefully weighed. While this paper compared the approaches from the perspective of system resource utilization, developer effort, and system flexibility, these criteria are ways to look at the architectural costs and benefits of each approach. Implementing any of these approaches requires awareness of additional issues, including administration, manageability, and overall cost of ownership. These aspects weren't specifically addressed in this paper because they are highly dependent on specific implementations. On the other hand, the most successful enterprise development efforts have "baked in" software components to promote manageability, performance monitoring, and administration. None of the approaches in this paper is inherently more or less manageable, administrable, or costly over a project's life cycle. However, attention to manageability issues is increasingly important as projects increase in scope, complexity, and targeted user population.

Another important issue relevant across all of these scenarios is the use of business process modeling and software development methodology. A variety of methodologies have been developed both to capture the business processes in an organization and to manage the development process itself. It is crucial that companies understand the impact of management on the software development process: the number one reason cited for software project failures is the lack of effective management.

The promise of legacy integration is that the business benefits derived from existing production systems can be married to and leveraged with today's tools and platforms. Leveraging legacy applications with a new set of tools promises faster response time, lower costs, easier maintainability, and greater end-user productivity. In the final analysis, legacy system integration is increasingly viewed as a competitive weapon that can provide closer ties to customers, faster response time, and lower total costs of business.

©1996 Microsoft Corporation. All rights reserved.

Microsoft, Visual Basic, Visual C++, Windows and Windows NT are registered trademarks and ActiveX is a trademark of Microsoft Corporation.

Other trademarks or tradenames mentioned herein are the property of their respective owners.

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT.

1 See https://www.microsoft.com/vbasic, www.microsoft.com/devonly, and www.microsoft.com/ie

2 Millions of instructions per second.

3 Data entry and validation on a PC will appear to have consistent response time. Several studies have shown that, within a set of expectations, consistent response time is perceived to result in better overall performance than variable response time, even when the average of the variable response times is lower than in the consistent case.

4 HTTP (Hypertext Transport Protocol) and HTML (Hypertext Markup Language)

5 There is increasing debate in the industry about how lightweight browsers will ultimately become. As browsers gain more functionality, they place increasing demands on the resources of the systems they run on. Browsers are being designed to be modular in the sense that their demands on system resources are proportionate to the functionality actually used. That is, the design of browsers aims to avoid loading functionality until it is required.

6 Graphical User Interface.

7 This is a generic use of the term "3270 terminal" and refers to 3278s, 3279s, 3179Gs, etc.

8 There are X-Windows Server and Client products by Frontier Technologies, Inc, Intergraph Corporation, Digital Equipment Corporation, and others. To see a list of products, please consult www.microsoft.com/ntserver as well as www.microsoft.com/infosource.

9 In recent years, PERL has emerged as the premier scripting language on UNIX platforms, consolidating the features of many standalone UNIX commands into a single language that has become widely accepted across the UNIX community. As a language for system administration, it has been ported to many other systems, including Windows NT. Reference https://www.perl.hip.com

10 Hamilton C Shell and Mortice Kern Systems (MKS) PC-UNIX. Please see references above for additional company information.

11 "Pipes" are a UNIX feature (which is supported to a limited extent in DOS and Windows NT) that allows one program's text output to be sent to another program for subsequent processing. Pipes are a common interprocess communication mechanism.

12 Wall Data (makers of Rumba) and Attachmate Corporation

13 Open Database Connectivity, a standard for allowing cross-platform clients to use a common programming interface to connect to different vendors' backend databases. ODBC is discussed in greater detail below.

14 The Spring/Summer 1996 Catalog contains 240 pages of component and Visual Basic add-ons. To have this catalog sent, one can reach Fawcette Technical Publications at 800-848-5523, or https://www.windx.com

15 To order this catalog, call 919-783-9184, or access https://db.hotlinx.com

16 The administrator of Windows NT Server installs ODBC drivers, any of which can be a valid database for the Internet Database Connector (IDC). In configuring the IDC, the Administrator chooses any ODBC driver from the list of installed "data sources".

17 Under the covers, the ODBC call is converted into IBM's DRDA (Distributed Relational Database Architecture) format, the native distributed database protocol for DB2 and SQL/DS.

18 Or a PERL script