Replication

Article
01/28/2010

Updated : July 19, 2001

John Lynn

Chapter 9 from SQL Server System Administration, published by New Riders Publishing

Easy replication of information has always been an intriguing capability among system administrators as a means of providing essentials such as backups for disaster recovery and distributed information. Replication systems for databases have particularly taken off in recent years as people discover new ways of using automated copies of all sorts of data. Everything from electronic mail systems and document management to data warehouse and decision support systems readily benefit from a good replication scheme. Over the years, replication has become available in vendor products ranging from simple desktop replication (like what we see now in Microsoft Access 97) to high-capacity multisite hot backup systems.

In most database server systems, replication was packaged as a separate, extra-cost option; Microsoft SQL Server started including replication as part of the product with release 6.0, and most Relational Database Management System (RDBMS) vendors have seen fit to follow suit.

When SQL Server database replication was introduced in SQL Server 6.0 and enhanced in release 6.5, replication was defined in terms of publishing, distributing, and subscribing; this paradigm continues in SQL Server 7. Before delving into the enhancements of SQL Server 7 replication, review the simple model used by SQL Server 6.5:

A publication consists of one or more articles.
Articles are simply database tables with SQL SELECT logic optionally applied to select a subset of rows and columns to publish.
When a destination server agrees to subscribe to a publication, an initial synchronization step ensures that the destination database matches the publication's schema column and data types and that the data is in sync to start out with.
After the initial synchronization step, changes that are made to table rows and columns that are part of a publication are detected and saved in a distribution database to be applied to the destination subscriber.
After a set number of transactions or at a set time schedule, the changes are read from the distribution database, and commands are executed to cause the destination (subscriber) to be brought up-to-date with the source.

Armed with this understanding of how things were in SQL Server 6.5, you're ready to see how SQL Server 7 builds upon this design. The main topics to be covered in this chapter include the following:

Top features
Setup and configuration
Replication agents
Snapshot replication
Transactional replication
The Immediate Updating Subscribers option of snapshot and transactional replication
Merge replication
Replication and the new SQL Server security roles
Subscribing

Microsoft SQL Server Replication Is Not Asbestos

For the most part, SQL Server 6.5 replication was a pain to work with, especially as the number of transactions to be replicated grew. It would get fussy and unstable and sometimes stop working altogether without warning, prompting manual intervention and creative detective work to get things moving again.

In addition, the replication model used by SQL Server 6.5 did not easily serve all business needs. The goal of SQL Server 6.5 replication was that of transactional consistency: that changes made to tables in a publication would eventually be put into place at the subscriber. The subscriber was definitely at the receiving end. The situation got complicated when the people using the subscriber wanted to update their pieces of the publication and have the changes reflected back to the source, which is a perfectly reasonable and desirable thing to want.

In this chapter, you'll see how SQL Server 7 addresses this and many other common business needs. Replication in SQL Server 7 promises to be much more stable, easy to set up and use, and easier to troubleshoot.
Monitoring replication
Advanced replication topics

First, take a minute to examine some new features of SQL Server 7 replication.

Top Features
Setup and Configuration
Replication Agents
Snapshot Replication
Transactional Replication
The Immediate Updating Subscribers Option of Snapshot and Transactional Replication
Merge Replication
Replication and the New SQL Server Security Roles
Subscribing
Monitoring Replication
Advanced Replication Topics
Conclusions

Top Features

Although the list of new features introduced by SQL Server 7 replication is long, several features really stand out:

Wizards
New replication types
Update replication
Anonymous subscribers
Publishing the Execution of Transact-SQL stored procedures
Scripting the replication topology
New SQL Server agents for replication tasks
Better support for heterogeneous replication
Replication ActiveX controls

The following subsections introduce these features. Each will be covered in greater detail later in the chapter.

Wizards Make It Easier

A common theme throughout all facets of administering replication with SQL Server 7 is the presence of the step-by-step wizard interface to reduce even the most complex setup tasks to a manageable level. This will be apparent throughout this chapter.

The outcome of these wizards is the execution of stored procedures that do the actual work of creating the replication, so it's possible to manage replication without using the wizards. In addition, because replication setup is now completely scriptable, scripts can be generated automatically from an existing configuration. This will be covered in the section "Scripting the Replication Topology."

New Replication Types

SQL Server 7 replication offers 3 types of replication, each designed to solve a particular set of challenges:

Snapshot
Transactional
Merge

Snapshot replication performs a complete refresh of the destination at scheduled intervals. Snapshot replication was available in SQL Server 6.5 as a "Scheduled Table Refresh," a sort of repeating automatic synchronization.

Transactional replication detects changes made at the source and applies the same changes in the same order at the destination. SQL Server 6.5 also provided this replication type.

Merge replication intelligently analyzes changes made at both the source and destination and applies the changes to whichever target is appropriate. It uses a conflict algorithm provided by SQL Server 7, but if you prefer, you can tailor your own conflict-detection scheme with either stored procedures or COM components.

Each of these replication types is discussed in greater detail later in the chapter.

Update Replication

It sounds like a new type of replication, but update replication refers to the concept of enabling subscribers to modify data and have their modifications applied at the publishing server, where they are in turn sent to all other subscribers. While some shops have rigged homemade solutions in the past to get the same effect, update replication now comes in the box.

All three replication types support update replication:

Snapshot and transactional replication optionally support update replication by using a method called Immediate Updating Subscribers.

This option allows subscriber tables to be updated. The replication uses the Two-Phase Commit (2PC) protocol to ensure that the update at the subscriber end is applied to both the subscriber and the publisher, in an all-or-nothing transaction. If the transaction succeeds, the record has been updated in both places and will be part of the next replication to all other subscribers of the affected publication. Note that this option generally requires good communication links be-tween the subscriber and publisher because an update is handled synchronously in this case.
Merge replication is a new type of replication that is designed to enable updates by subscribers by default, even if there is a large number of occasionally connected subscribers. Each time a replication takes place, an exchange occurs between the publisher and the subscriber; updates that have occurred since the last replication are loaded down to the subscriber and up to the publisher. When both the publisher and the subscriber have updated the same column in the same record (a field), a conflict occurs. The conflict resolver is then called to settle the conflict, and either the updated field from the subscriber is applied to the publisher, or the updated field from the publisher is applied to the subscriber.
The default resolver uses a priority-based scheme, but it can be overridden with a custom resolver that more closely matches specialized needs.

As you learn about each replication type, look at the manner in which update replication is accomplished for that type.

Anonymous Subscribers

A routine part of SQL Server replication setup is enabling subscribers so that the subscriber is able to start using a publication. Administration can become a burden when the number of subscribers climbs into the double or triple digits, which is certainly feasible with SQL Server 7 desktop edition users and Internet subscribers. Enabling a publication to allow anonymous subscribers is an easy solution to this problem. Anonymous subscribers are covered in depth in the section "Anonymous Subscriptions," later in this chapter.

Publishing the Execution of Transact-SQL Stored Procedures

A significant new capability of replication is the capability to replicate the execution of Transact-SQL stored procedures. If you include a stored procedure as an article of a Transactional publication, the execution of that stored procedure will be replicated, not the changes made by the execution of the stored procedure. If the stored procedure causes 100,000 records in a published table to be updated, the 100,000 updates are not what's replicated; only the executed procedure is replicated after the procedure finishes successfully.

Scripting the Replication Topology

Once replication is set up on one server, it is often desirable to set up additional servers in the same way. In the past, there was no way to perform this setup without taking copious notes as you clicked your way through the setup using the graphical interface of Enterprise Manager. SQL Server 7 provides welcome relief from this drudgery with its capability to script replication topologies (to capture an existing replication setup as Transact-SQL scripts). The scripts can be used as cookie-cutter templates to set up additional servers or to restore a replication configuration as part of a business disaster recovery.

Exploiting New SQL Server Agents for Replication Tasks

Past versions of SQL Server used the SQL Executive task scheduler to manage all the job executions that carried out replication tasks. The SQL Executive performed everything from the initial synchronization to scheduled refreshes. Unfortunately, it served as a single point of failure that could cause many reliability problems. SQL Server 7 uses discrete components known as agents for each part of replication. Each step of replication is assigned its own agent, and each agent keeps a history of actions performed. Overall, the separation of tasks into specialized agents provides for better replication monitoring, troubleshooting, and performance. The different types of agents are mentioned throughout this chapter.

Better Support for Heterogeneous Replication

SQL Server 6.5 introduced the idea of replicating to non-SQL Server subscribers, such as Microsoft Access, but it never seemed to catch on because setup and operation of heterogeneous replication in SQL Server 6.5 was tricky and problematic. Because of the OLE DB support that's built into SQL Server 7, heterogeneous replication is as much a usable option as is replication to other SQL Server databases, and this is reflected in the wizards used to configure publishers and subscribers. SQL Server 7 also introduces the capability to publish from heterogeneous sources to SQL Server by interfacing with the Data Transformation Services (DTS).

Replication ActiveX Controls

SQL Server 7 packages several key elements from replication into ActiveX controls that you can use to enable your own applications to perform replication tasks. You can, for example, start a replication update to pull data to a sales force automation package written in Visual Basic, without having to interface with the publishing server in the traditional ways (using SQL-DMO objects, for example). The ActiveX controls actually externalize the functionality that SQL Server components use, enabling them to perform replication tasks themselves.

Setup and Configuration

Before a server can be used to publish database articles, it must be configured as a publisher. Each publisher also needs a distributor, a server where replicated data is temporarily stored and used by replication agents. The place to store this data is called the distribution database. The server with the distribution database is considered the distributor.

In most cases, the publisher is located on the same server as the distributor, but it doesn't have to be. In fact, sometimes you wouldn't want to use the same server for both publishing and distribution, especially if many publications are to be managed. The task of collecting and tracking transactions may have a negative effect on replication performance; in such cases, it makes sense to specify a remote distributor, using the Configure Publishing and Replication Wizard.

You can tell at a glance whether a server has been set up for replication just by looking at the unexpanded tree under the server in Enterprise Manager (see Figure 9.1). When you enable replication, or more precisely, when you enable a server to be a replication distributor, the Replication Monitor icon appears in the console tree.

You can configure a server by using the Configure Publishing and Distribution Wizard as described in the next section.

Figure 9.1: Notice that this server does not have the Replication Monitor icon, which means this server has not been configured as a distributor for replication.

Figure 9.1: Notice that this server does not have the Replication Monitor icon, which means this server has not been configured as a distributor for replication.

The Configure Publishing and Distribution Wizard

This wizard is used to configure a server as a Publisher or as a distributor to use with the publisher. If one server is to be both the distributor and the publisher, the wizard also creates a distribution database on the server.

To use the wizard, follow these steps:

Click the taskpad sequence Replicate Data and then Configure Replication; or open the Tools pull-down menu and select Wizards.
From there, expand the Replication tree to display all the replication wizards and select Configure Publishing and Distribution Wizard.

The wizard creates a distribution database and sets some preliminary conditions:

Enable publishing from the server
Allow other publishers to use the server as a distributor
Enable subscriptions from specific servers

The wizard suggests very safe defaults, such as using the same server for both publishing and distributing and allowing the server to subscribe to its own publications. Most steps in the sequence enable you to access detailed dialog boxes containing options for customizing parameters. As is typical of the wizard interface style, all selections and settings are displayed for your review. Look them over carefully before you click the Finish button.

The Enterprise Manager interface takes on a few additional entries after you use the Configure Publishing and Distribution Wizard. For one, the Replication Monitor becomes available in the Enterprise Manager tree, displaying its list of publishers, agents, and alerts. In addition, the new distribution database appears in the Databases section (see Figure 9.2).

After you configure the publisher, you must create publications of the type needed to solve the business problem at hand and then subscribe to these publications. Before you investigate the various types of replication in detail, however, read the next section, which describes the general steps for creating any type of publication.

Setup Steps Common to All Types of Replication

Creating a publication sounds complex, and in fact, it is. But before you go off looking for the three-day Microsoft class on SQL Server replication configuration, take a look at the wizards waiting to help you. Microsoft transformed a complex set of activities into a pleasant step-by-step interaction with some wizards, making replication setup a relatively painless task.

You use the Create Publication Wizard to create publications of all types:

Open the Tools menu, select Replication, and choose Create and Manage Publications. In the dialog box that appears, select the database to be published from (see Figure 9.3).
Click the Create Publication… button to start the Create Publication Wizard.
If the database you're publishing from already contains publications, the wizard offers to use one of those as a template for the publication being created (to automatically answer the questions that follow).
If this is the first publication in the database, the wizard asks you to select the publication type. Select Snapshot, Transactional, or Merge Publication.
Specify the type of subscribers that will subscribe to the publication (indicate whether there will be only SQL Server subscribers or possibly some non-SQL Server subscribers).
Select articles to add to the publication by selecting tables, and for Transactional and Snapshot publications, stored procedures.

Revealing the Distribution Database

If the distribution database is not listed, right-click on the server in Enterprise Manager and select Edit SQL Server Registration…. In the dialog box that appears, make sure that Show System Databases and System Objects is checked.

Figure 9.2: The Replication Monitor icon indicates that the server is configured for replication and is being used as a distributor.

You'll be asked to choose a publication name and description.

Then you can either accept a rather large number of defaults, or you can refuse the defaults and enter a kind of extended dialog with the wizard, in which each option is presented in turn. For the purpose of this exercise, accept the defaults. Note that all the default properties appear in the scrollable window (see Figure 9.4), where you can easily select them for a handy copy/paste operation to document the properties of this publication.
In the last wizard screen, click the Finish button to create the publication.

Figure 9.3: In the Create and Manage Publications dialog box, select a database.

Figure 9.4: Leaving the "No" radio button checked to answer the question, "Do you want to customize this publication further?" enables you to accept defaults for many properties of the publication.

To review and change a publication that's already been created, access the Enterprise Manager console tree, select the desired database, expand the tree, and select the Publications node.

Under Publications, select the desired publication, and then right-click and select Properties. Figure 9.5 shows the properties of a publication called Northwind Employees.

The tabs in this dialog box not only summarize all the attributes of the publication, but they contain options that enable you to control and change such attributes as the following:

Articles included in the publication (you can add more if needed).
Agent status for the SQL agents involved in the publication.
Subscribers to the publication.
Replication scripts for the publication.

Note that making changes in the publication Properties dialog box may require you to resynchronize or reinitialize the publication. In addition, you cannot make structural changes to the publication that remove columns or that filter rows if any subscriptions are already using the publication. The next section describes the agents that perform all the various tasks of replication.

Where's My Publication?

You may have to right-click and select Refresh if you just created a publication but the Publications node does not show up under the database.

Figure 9.5: The publication Properties dialog box shows all the properties of a publication in an easy-to-understand format.

Replication Agents

Four agents carry out all the tasks of SQL Server replication:

The Snapshot Agent
The Log Reader Agent
The Distribution Agent
The Merge Agent

These agents are specialized jobs of the new SQL Server Agent, which is described in full in Chapter 5, "The SQL Agent."

Each type of replication—snapshot, transactional, and merge—combines the efforts of one or more of these agents to actually carry out the replication. Each agent has a domain of specialization, as described in the following subsections.

Snapshot Agent

The Snapshot Agent prepares snapshots for initial synchronization or for snapshot replication. A snapshot is a point-in-time image of a publication's schema and data. Snapshots are used to initialize a publication so that the replication has a starting point where the destination is equal to the source. This initial sync step is performed for both transactional and merge replication. In snapshot replication, the total set of data is re-sent each time the replication is run, so the Snapshot Agent prepares a new snapshot for each replication cycle.

Log Reader Agent

The Log Reader Agent keeps an eye on published databases' transaction logs, looking for changes that affect published tables. As it finds transactions that affect these tables, the Log Reader Agent copies the transactions into the distribution database, where they'll be picked up later by the Distribution Agent to be applied to subscribers.

The Log Reader Agent checks the transaction log either periodically according to a schedule that was set when the publication was created or on an ongoing, continuous basis.

Distribution Agent

The Distribution Agent performs the actual movement of data: copying data and schemas for snapshot replication, applying transactional changes for transactional replications, and so on. The Distribution Agent executes on the distributor for push subscription and executes on the subscriber for pull subscriptions. Remember that the distributor is the designated server that houses the distribution database. The Distri-bution Agent looks at the distribution database for replication commands that are to be carried out.

Merge Agent

The Merge Agent, used exclusively in merge replication, has several jobs to do. For one, the Merge Agent is used to deliver the initial synchronizing snapshot when a merge publication is created. The Merge Agent merges updates and detects conflicts and also calls the resolver to work out conflicts that are detected when both the publisher and the subscriber update the same record. The Merge Agent executes on the distributor for push subscriptions and executes on the subscriber for pull subscriptions.

Now that you've covered the basic elements needed by all replication types, it's time to look at each type in detail.

Snapshot Replication

The simplest type of replication, snapshot replication, simply refreshes the subscriber on a scheduled basis. A refresh is the same as the initial synchronization that takes place with a new subscriber. Each time the snapshot replication runs, an exact copy of the publication's schema and data is created at the subscriber's databases. The complete set of schema and data to be applied at each subscriber is called a synchronization set. There is a synchronization set for each article in a snapshot publication. The synchronization set is created each time a snapshot publication executes, so that referential integrity between tables in the set is maintained.

The upcoming subsections further explore snapshot replication:

Applications for Snapshot Replication
How Snapshot Replication Works
Snapshot Replication Setup

This same basic type of coverage—and in the same basic format—is offered for each of the replication types. Being armed with specific knowledge about each replication type in detail enables you to better evaluate the solution available to any given replication scenario.

Applications for Snapshot Replication

Because snapshot replication is a complete refresh each time, it can be used as part of a low-cost warm backup setup, perhaps configured to snapshot during periods of low network activity, such as overnight or weekends. It's also ideal for applications that need to be periodically refreshed (such as monthly totals in a decision support environment, data-driven Web sites, and other occasionally connected subscribers). With the advent of embedded SQL Server applications on the desktop and mobile units, it's attractive to imagine using snapshot replication to reload these units with updated application code at connection time.

Snapshot replication serves the well-connected subscriber as well. With the Immediate Updating Subscriber option, updates can be made at the publisher and the subscriber simultaneously, and then later, the new snapshot is delivered, updates and all. Snapshot replication generally requires less processing overhead on the publisher and distributor because it's not a transaction-based continuous update operation.

In addition, snapshot replication is a viable way of replicating databases that are routinely populated by using nonlogged bcp or select into, when transactions are not present to drive the replication.

Schema Changes Go Undetected

Changes to schema that vary from the original schema and that occur after the publication is created are, unfortunately, not replicated to the subscribers. So if you add a column to a published table, for example, the new column won't show up at the subscriber. A workaround is to create another publication containing the added column, and then subscribe to it.

Snapshot Replication: How It Works

Each snapshot publication is assigned two agents:

Snapshot Agent
Distribution Agent

The Snapshot Agent prepares files that are needed to replicate the publication's schema and data, and the Distribution Agent carries out the replicating tasks. The Snapshot Agent creates schema files that represent the design of the tables in the publication, as well as data files that hold the actual records from the published articles.

The schema files are the SQL Data Definition Language (DDL) commands required to create the table in the destination database. The data files are created either in bcp format (.bcp files, see Chapter 8) if the subscriber is a SQL Server database, or in text format (.txt files) for subscribers using other database systems such as Oracle or Microsoft Access. If there is an index on any of the articles, an index file (.idx) is also created, which holds the command required to re-create the index.

The schema files and the data files are referred to as the synchronization set for each article in the publication. You can find these files in the directory: Mssql7\Repldata\_unc\ServerName_DatabaseName_PublicationName\DateTimeStamp\, where ServerName is the SQL Server machine name, DatabaseName is the database where the publication is located, PublicationName is the name of the publication, and DateTimeStamp is a 14-digit representation of the date and time in the format YYYYMMDDHHMMSS (see Figure 9.6).

Figure 9.6: The format of the directories created by the Snapshot Agent to hold synchronization sets; notice the naming convention that SQL Server uses when creating these directories.

After creating the synchronization sets, the Snapshot Agent notes where the synchronization sets are located and records that information in a table called MSrepl_Commands located in the distribution database. When a subscriber comes along to pull down the publication, or when the publication is pushed to a subscriber, the Distribution Agent looks at this table to find the schema, data, and commands needed to execute a snapshot successfully.

Snapshot Replication Setup

Setting up a snapshot replication is simple. Just follow the steps in the Create Publication Wizard, as outlined in the section "Setup Steps Common to All Types of Replication" (earlier in this chapter).

You must decide whether to allow Immediate Updating Subscribers when you're setting up the publication. There are two restrictions on using Immediate Updating Subscribers:

Only SQL Server 7 subscribers can be Immediate Updating Subscribers.
Subscribers should not update timestamp columns because doing so would interfere with the mechanism that detects conflicts between the subscriber and publisher.

The schedule that controls when the Snapshot Agent creates the synchronization set is usually set by default in the Create Publication Wizard; however, the schedule can be changed. To edit the schedule after the publication is in place, do the following:

In Enterprise Manager, choose Replication Monitor and select Agents.
Click on Snapshot Agents. A set of publications for which the Snapshot Agent is responsible appears in the right pane.
Right-click on the desired publication and select Agent Properties from the context menu.
In the Job Properties dialog box that appears, select the Schedules tab.
Edit the schedule as desired. Figure 9.7 shows all screens used in the sequence, with a schedule that can be edited.

The Directory Naming Convention

The UNC directory is used for LAN-connected, WAN-connected, or otherwise NetBIOS-connected subscribers. Internet subscribers reach the distributor strictly via FTP connection, so the ftp directory is used for those subscribers instead.

Internet-configured publications are described later in this chapter, in the section "Advanced Replication Topics."

Figure 9.7: You have a great deal of flexibility when changing an agent's schedule.

Note that the schedules for both the Snapshot Agent and the Distribution Agent can be tailored. The Snapshot Agent's schedule refers to when the synchronization set is created and refreshed, not when the publication actually is sent out. The Distribution Agent's schedule becomes important only when someone subscribes to the publication, because the type of subscription determines when the Distribution Agent executes.

The Distribution Agent assigned to the publication starts at the scheduled time and reads the commands the Snapshot Agent created. The Distribution Agent then connects to the subscriber and moves the data and schema commands to the subscriber.

Because these are snapshot publications, a DROP command is executed first, before the tables are re-created on the subscriber.

Transactional Replication

This is the "classic" style of replication introduced in SQL Server 6.0 and enhanced in SQL Server 6.5. Transactional replication is log-based; commands affecting published tables are saved and then executed at the subscriber by the Distribution Agent at replication time, either according to a schedule or after a number of transactions have occurred.

Applications for Transactional Replication

Transactional replication is an effective means of getting near real-time updates to subscribers in a well-connected environment; it's been designed to apply updates to subscribers within a minute or so after they occur in the source database, with guaranteed consistency. Transactional replication is also useful in the occasionally connected environment because only the changes to published tables are replicated to the subscribers (instead of the complete refresh that's sent with Snapshot replication), which means reduced network traffic. If 160 rows were inserted in a two-gigabyte published table today, 160 INSERT commands would be replicated and executed at the subscriber when the replication takes place.

Transactional Replication: How It Works

The following agents handle each transactional publication:

Snapshot Agent
Log Reader Agent
Distribution Agent

When a transaction publication is created, the Snapshot Agent performs its task of creating a synchronization set (exactly the same as it does in snapshot replication). This is used to initialize new subscribers to a state consistent with the publisher before the incremental changes can be sent out.

Transaction-based replication works by detecting changes in published tables and forwarding those changes to subscribers in the form of SQL transactions (insert, update, and delete statements).

Transactional replication uses the Log Reader Agent in addition to the Snapshot Agent. The Log Reader Agent is responsible for checking the transaction log of the database where the published table resides. This is the very same transaction log that's used in the general processing of database changes that have nothing to do with replication.

The Log Reader Agent watches the transaction log, looking for insert, update, or delete statements that affect tables marked for replication. The Log Reader Agent copies any such statements into the distribution database, where they remain until the Distribution Agent applies them to the subscriber database in exactly the same sequence in which they were executed at the source database.

The Distribution Agent runs on the publisher (for push subscriptions) and the subscriber (for pull subscriptions).

The Distribution Agent can be run either on a fixed schedule or continuously.

Transactional Replication Setup

Follow the steps in the section "Setup Steps Common to All Types of Replication," by using the Create Publication Wizard. One item of interest for transactional publications is that the wizard offers to publish the execution of stored procedures as part of the publication's articles. In the Specify Articles step, you can select a check box to include stored procedures in the article. This causes a dialog box to appear, in which you can tailor the replication of the execution of the stored procedure (see Figure 9.8).

Figure 9.8: Replicating the execution of a stored procedure in a transactional publication is just as easy as adding tables to the publication. (Notice the pencil and pad icon that represents stored procedures.)

The Immediate Updating Subscribers Option of Snapshot and Transactional Replication

SQL Server 7 introduces a way to enable subscribers of snapshot and transactional publications to update the very articles that come to them by way of replication: an option called Immediate Updating Subscribers. This updating occurs at the subscriber end, so there's no need to connect over the WAN to the central site when a business branch desires to make a change to replicated data.

Records that are updated at the subscriber end in a publication that has Immediate Updating Subscribers enabled are also updated at the source using a Two-Phase Commit (2PC) transaction that occurs completely behind the scenes and is managed by the Microsoft Distributed Transaction Coordinator (MSDTC). The update is guaranteed to either succeed at both the subscriber and the publisher or fail altogether.

For snapshot replication, when the next snapshot is replicated to all the subscribers, the updated record is included in the snapshot. For transactional replication, the record updated by the subscriber is identified as a replicated transaction to be distributed to all the subscribers of the affected publication, with the exception of the subscriber doing the updating. SQL Server uses a loopback mechanism to identify the server making the update so that it can be excluded from receiving the transaction again.

There is one potential problem with Immediate Updating Subscribers: trying to update or delete a record that no longer exists at the publisher. Triggers at the publisher detect this and cause the transaction to fail. This situation is handled elegantly, however, by merge replication, which is described in the next section.

Merge Replication

The newest addition to the set of replication types in SQL Server 7 is merge replication. Borrowing concepts from JET database replication introduced with Access 95 and enhanced with JET 3.5 and Access 97, merge replication works to resolve conflicts among updates to the same record by both the publisher and one or more subscribers so that they end up with the same value.

When the same record is updated at both the publisher and subscriber table, either the publisher's record is updated at the subscriber, or the subscriber's record is updated at the publisher, depending on how the conflict was resolved. Only the last state of a changed record is considered at replication time; even if a particular column in a particular record was changed 100 times between replications, only the last value is involved in the merge.

Information about resolved conflicts is kept so that the conflict history can be reviewed later. SQL Server 7 includes a simple conflict resolution algorithm, but you can augment or replace this with your own resolver.

Applications for Merge Replication

Merge replication offers a great deal of autonomy to subscribers. Because updates flow in both directions at merge time, the benefit of updating subscribers is realized without the strict atomicity of a transaction-based subscriber update (which is required by the Immediate Updating Subscriber option of snapshot or transactional replication).

Another difference from the Immediate Updating Subscriber method is that with merge replication, all changes are somewhat negotiable. In cases in which a conflict would normally cause a rollback, a resolver is called instead to mediate. This can make administrating distributed updatable replicas an easy task, because everything is automated.

The conflicts you've been reading about are, by default, column-based. A conflict occurs when the data in the same column and the same row is updated in two (or more) different tables. This means that updating other columns in the same row in two different places does not cause a conflict; instead the information is simply merged together.

For example, a large travel agency with many office locations likes to keep customer profiles online so the customer's travel and hotel preferences are readily available. Suppose a business traveler far from home walks into a local branch to make a reservation, and she changes her airplane seating preference from "aisle" to "window" after thinking about the boring flight earlier that day. The branch office updates and saves the customer's record. At the same time, the traveler's administrative assistant back home speaks to the local travel agency branch office to update the traveler's email address, and that change is also applied and saved. The Merge Agent runs at each location, and every branch ends up with an updated record that includes both the new seating preference and the new email address. The two updates were not considered to be a conflict, because they weren't updating the same record/column (field). Therefore, the result is a merge.

Note that this is the default behavior of a merge replication; you can turn column-based conflict off so that anytime two servers update more than one column in a row, it is considered a conflict.

Merge Replication: How It Works

The Merge Agent takes the initial snapshot prepared by the Snapshot Agent, applies it to subscribers, and then applies incremental changes to either the source or the destination, based on the resolution of conflicts when both source and destination records are updated. To help resolve conflicts, several schema changes are made to each table as it is prepared to participate in a merge publication:

Each row must be assigned a globally unique value by adding or using a new column of the SQL type uniqueidentifier. If the row already has a uniqueidentifier column with the ROWGUIDCOL attribute, it's used to uniquely identify the record among all networked computers in the world (see the upcoming sidebar "Unique Records in the Universe"). If SQL Server adds a column to contain the globally unique column, it names the column rowguid.
Tracking triggers are added to detect changes to each row, column, or both row and column of the table that's being added to a merge publication. These Transact-SQL triggers capture the changes being made and record the change in merge tables, which are also added when the publication is created.

Figure 9.9 shows the tables that are added to a database after a merge publication is created. In this example, a publication was created in the SQL Server pubs sample database, and a single article was created that includes only the authors table.

After the Merge Agent resolves conflicts involving two rows in which the same column was updated by both publisher and subscriber, one of the following results takes place:

The winning rows are saved in the actual table (on either the publisher or the subscriber).
The losing rows at the publisher are saved in a table called Conflict_<TableName>, where <Table_Name> is the actual table name, for example, conflict_authors.
Any rows that were deleted are saved in a table called Msmerge_delete_conflicts at the publisher.
Any inserted rows are inserted in both the source and destination.

Figure 9.9: Tables with the names MSxxxx that are added to the pubs database after defining a merge publication are used for the various tracking activities of merge replication.

Unique Records in the Universe

Armed with the new uniqueidentifier data type and the column attribute ROWGUIDCOL, you're set to create as many globally unique values in a table as desired:

CREATE TABLE GUID_bucket
(guid uniqueidentifier ROWGUIDCOL,
ITEM int)

To add some records to this little GUID_bucket, execute the following INSERT several times. Note the use of the NEWID() function to generate the value for the guid column:

insert into GUID_bucket values(newid(), 100)
select * from guid_bucket
guid	 ITEM    
––––––––––––––––––––––––––––––––––––     ––––––––––
06FD8E8E-3E82-11D2-A68A-788540000000	 100
06FD8E8F-3E82-11D2-A68A-788540000000	 100
06FD8E90-3E82-11D2-A68A-788540000000	 100

The little column called guid in the preceding example is added to a table that's being prepared for merge replication.

Merge Replication Setup

Follow the steps in the section "Setup Steps Common to All Types of Replication," by using the Create Publication Wizard. The setup for a merge publication has no surprises; it's very similar to setting up any other type of publication.

When a merge publication is in place, you can view and change some attributes of the publication and even the individual articles by going to the publishing database in Enterprise Manager and selecting the Publications tree node. Right-click on the desired publication and select Properties. Then go to the Articles tab to see which articles were included in this publication. Select the build button (…) to the right of an article, and another tabbed dialog box shows the options for that particular article (see Figure 9.10). The three tabs contain the various options for creating and maintaining the article. Figure 9.10 shows the General tab, which offers options for overriding the default behavior that specifies what causes a conflict; you could check the radio button labeled Consider Changes Made to the Same Row As a Conflict, for instance, to cause an update to any column in two of the same rows to be considered a conflict that needs to be resolved.

You can also use the Resolver tab to specify a SQL-stored procedure to be called upon to handle conflict resolution. SQL Server will use your stored procedure to resolve conflicts that occur in updates to that particular article only. This stored procedure executes at the publisher. The parameters to the stored procedure include the following:

@tablename. Name of the table for which a conflict is being resolved.

@rowguid. uniqueidentifier that identifies the row having the conflict.

Figure 9.10 You can view the article options, install an overriding conflict resolver, specify what constitutes a conflict, and more.

@source. Name of the server from which a conflicting change is being propagated.

@source_db. Name of the database from which a conflicting change is being propagated.

@source_security_mode. Security mode for connecting to a source server.

@source_login. Login name used when connecting to the source server.

@source_password. Password used when connecting to the source server.

@dest. Name of the server to which changes are propagated.

@dest_db. Name of the database to which changes are propagated.

@dest_security_mode. Security mode for connecting to the destination server.

@dest_login. Login name to use in connecting to the destination server.

@dest_password. Password to use in connecting to the destination server.

Remember that, by default, a conflict indicates that the same field (row/column) is being updated at two servers. The resolver stored procedure should look at the record identified by @rowguid in the table at both servers, decide what to do, and then return a single row to be put into place at both servers.

You determine what happens to resolve the situation. For example, you might merge the columns from both records into one composite record, or add numeric columns together to overlay binary fields in logical operation, or whatever is desired.

The stored procedure is created either in the publisher database or in the master database that's marked as a system object so that it can be found.

If you have access to COM expertise, you can use an in-process DLL COM module that implements the ICustomResolver interface as a custom resolver. SQL Server 7 ships with an example of this in the directory Mssql7/Devtools/Samples/sqlrepl/resolver. To select a COM resolver, select the resolver from the list of registered resolvers that appears under Use This Custom Resolver (refer to Figure 9.10).

You have now covered all the replication types. In the following sections, you'll examine replication security issues and learn how to go about subscribing to publications, monitoring replication, and more.

Replication and the New SQL Server Security Roles

SQL Server 7 replication has several built-in SQL Server security roles. You must be a member of one of these roles in order to perform the specific activities of replication setup. Most of the tasks require the server role sysadmin, but some tasks are database-specific and can be performed with the db_owner role. Table 9.1 lists replication tasks and the corresponding role or roles required.

Table 9.1 Required Roles for Replication Tasks

Task	Role/Roles Required
Configuring distribution	sysadmin
Enabling databases for replication	sysadmin
Configuring publishing	sysadmin
Creating publications	sysadmin or db_owner
Creating push subscriptions	sysadmin or db_owner
Creating pull subscriptions	Any login in the publication's Publication Access List
Dropping subscriptions or creating login	sysadmin or db_owner
Monitoring replication	sysadmin
Updating default publication access list	sysadmin
Updating a custom publication access list	sysadmin or db_owner
Agent logging in to publisher	Login must be in the Publication Access List of the referenced publication
Agent logging in to distributor	Login must be in the Publication Access List of the referenced publication
Log Reader Agent logging in to publisher	db_owner
Agents logging in to subscriber	db_owner

SQL Server replication adds another component to security: the Publication Access List (PAL). This is similar to the Windows NT Access Control List (ACL).

When the server is first configured for replication, a default PAL is established, which is the default security mechanism that limits access to pull and updatable subscriptions. You can create a custom PAL to be specified to further control a particular publication.

To view and change the default PAL for a server, follow these steps:

Choose Tools, Replication and select Configure Publishing Subscribers and Distribution….
Select the Publication Access List tab to add and remove users.

To view and change the custom PAL for a publication, follow these steps:

In the database that contains the publication you want to control, right-click the desired publication and select Properties from the context menu.
Select the Publication Access List tab.
Click the radio button labeled Use the Following Custom Access Control List for This Publication.
The list of users will be initialized with the list from the server's default PAL. Add or remove users from this list as desired.

Note that an anonymous subscriber can connect to a publication without being checked against either PAL.

Subscribing

As in SQL Server 6.5, there are two types of subscriptions: push and pull. A publication can be both pushed to some subscribers and pulled from other subscribers.

One change from SQL Server 6.5 to 7.0 is that now subscribers are only supposed to receive publications, and not the individual articles of publications. As the SQL Server Books Online states, the user interface (Enterprise Manager) does not support subscribing to individual articles, although that action is still supported by SQL Server 7. Microsoft suggests that you migrate your subscriptions to the publication level, where each publication is composed of one or more articles.

Except in the case of publications that are enabled to accept anonymous subscribers, only enabled subscribers are permitted to subscribe. Subscribers are first added when the server is configured for publication. But you can enable additional subscribers later. To do so, follow these steps:

In Enterprise Manager, select the publishing server to add subscribers to.
Select Tools, Replication, Configure Publishing, Subscribers, and Distribution.
In the dialog box that appears, select the Subscribers tab and click New Subscriber….
Select SQL Server or another type of subscriber (see Figure 9.11).

Figure 9.11: From this dialog box, you can add new SQL Server, ODBC, or OLE-DB-linked servers as subscribers.
Note that by using the build (…) button next to the subscriber's name on the Subscribers tab, you can access another dialog box to set several options regarding how to deal with that particular subscriber (options such as login attributes for replication agents connecting to the subscriber, default schedules, and so on).

Each type of subscriber has its own distinct advantages. Publications can be pushed to or pulled from subscribers. In addition, a publication can be configured to be pulled from known or anonymous subscribers. The next sections describe these concepts.

Push Subscriptions

A push subscription is created at the publisher and pushed to the subscribers (instead of leaving the publication available to be selected by a subscriber creating a pull subscription).

You create a push subscription by using the Push Subscription Wizard. To locate the wizard, right-click on the desired publication under the Publications node of a database name in SQL Enterprise Manager.

The value of push subscriptions is ease of administration: You can set up the publication and push it to the desired subscribers—even a large number of subscribers—all in one step. In fact, by using the new Replication Scripting facility in SQL Server 7, you can save the whole scenario in a script file to be used to create publications and their subscribers. All you need to re-create the configuration is an ISQL command line. Packaged software products that use replication might also want to use this feature in their install procedures.

Pushing a subscription is the easiest way to publish to heterogeneous data sources such as Oracle, Microsoft Access, and other ODBC destinations.

Pull Subscriptions

The subscriber initiates a pull subscription when the subscriber is ready to receive updates. Pull subscriptions are well-suited for large numbers of subscribers because the subscriber creates and administers the replication instead of the publisher. Subscribers using the Internet, for instance, would pull subscriptions from a publisher.

With pull subscriptions, the work of designing and creating publications can be done at the publisher ahead of time, before any subscribers need to use the publication. When a publication is created, a synchronization set is also created so that the publication is ready to initialize any new subscribers that come along.

A big advantage of using pull subscriptions is that the agents that perform the replication tasks run at the subscriber, not at the publisher. This removes the processing from the publisher and allows the subscriber to set the schedule (which is another reason to use pull subscriptions when the number of subscribers is large).

A subscriber creates a pull subscription by using the Pull Subscription Wizard. Only SQL Server subscribers are able to use this wizard because it's a component of SQL Server Enterprise Manager. There is a way, however, to perform a pull subscription from a non-SQL Server subscriber by using the embedded Distribution control.

Anonymous Subscriptions

An anonymous subscriber is a pull subscription from a subscriber that has not been specifically enabled to receive publications from this publisher. You make the decision of whether to use anonymous subscribers when you first create the publication.

Anonymous subscribers are well-suited for publications that are available to be pulled from Internet subscribers because there could never be any way to know in advance all the subscribers that would need to be enabled.

When creating a publication by using the Create Publication Wizard, you have the opportunity to enable anonymous subscribers. You can't add or enable subscribers in that wizard, however; the only choice is whether or not to allow anonymous subscribers (see Figure 9.12). To add subscribers, you must use the Configure Publishing, Subscribers, and Distribution Wizard. In addition, you can modify existing publications to allow anonymous subscribers by using the publication's Properties dialog box.

Monitoring Replication

One of the biggest advantages that SQL Server 7 has over previous releases is that it makes it much easier to get a detailed status of exactly what is happening, what was happening, and even what will happen with regard to replication.

Much of this is due to the use of SQL Server Agent jobs to manage replication activities. As described in Chapter 5, "The SQL Agent," the SQL Server Agent is industrial strength as compared to the SQL Executive of previous releases. The SQL Server Agent is used as the basis upon which the replication agents are built.

Figure 9.12: Subscribers are either anonymous or not, which must be specified at the time the publication is created; a publication can be made anonymous later via the publication's Properties dialog box.

There are several places to check on the progress and history of a publication, and preconfigured SQL Server Agent alerts let the server notify you when there is a problem.

Monitoring Replication Agents

In Enterprise Manager, the Replication Monitor is available if the server is enabled as a distributor and the current user is a member of the sysadmin fixed server role. By opening the Replication Monitor, you can view publishers, agents, and replication alerts.

Under Publishers, you'll find a list of publishers that use this server as a distributor, along with a list of all their publications. Select a publication here to see a list of all the agents responsible for carrying out the replication tasks for that publication—usually the Snapshot Agent is listed first, followed by the agents involved in the publication: Log Reader Agent, Distribution Agent, or Merge Agent, as shown in Figure 9.13.

If instead of Publishers you select Agents under Replication Monitor, the four agents of replication are displayed. Select one of the agents, and a list of the publications handled by that agent appears in the right pane of Enterprise Manager. Figure 9.14 shows all of this in one screen.

Figure 9.13: The right pane shows the agents responsible for carrying out the steps of the publication called pubs_Authors; the left pane shows the agents' current status for the selected publication.

Figure 9.14: The right pane shows the publications for which the agent currently selected in the left pane is responsible.

Select a particular publication in the right pane, and right-click to reveal Agent History and Agent Properties. Use the Agent History option to show the summary results of every execution for that specific publication and, by using the Session Details button, every step of an execution for the particular agent servicing the particular publication. Figure 9.15 shows the steps of one particular execution of the Merge Agent for a publication called pub_Authors.

Another great place to get information about replication agents is under the SQL Server Agent node. Select the Jobs section of the console tree. The right pane displays a list of SQL Server Agent jobs, many of which are related to replication tasks. Those with a category name that starts with 'REPL' are replication jobs associated with the replication agents.

To refresh the jobs shown in the right pane, right-click on Jobs and select Refresh. Right-click on a selected job and select Job History for a detailed account of when the job ran, the results of the job, and optional details of the job showing every step. Figure 9.16 shows the job history for a Snapshot Agent.

Replication Alerts

SQL Server 7 provides a few alerts that can be used to route replication-related problems to the appropriate people so that manual intervention can be used to correct the problem. As described in Chapter 5, the alert facility provided by SQL Server Agent is a powerful component that you can use in many creative ways.

Figure 9.15: You can examine each step of one particular execution of the Merge Agent in this session detail window.

Figure 9.16: All replication agents appear under SQL Server Agent/Jobs. Select an agent to get a detailed job history.

The alerts that are preconfigured with SQL Server 7 include some of the more catastrophic error conditions. These alerts are SQL Server event alerts, which means that they are invoked upon occurrence of a particular error condition with a specific error code.

It's easy to add your own alerts to the Replication Alerts section of Replication Monitor. Just right-click on Replication Alerts and select New Alert. Fill out the dialog box that appears, describing your new alert. Select the error number you want the alert to react to and specify the actions of the alert.

Troubleshooting

It's not too difficult to spot a critical problem that affects a replication component. The agents that handle replication show problem status visually in Enterprise Manager (see Figure 9.17).

To locate the problem, look to the agents for answers. From the Replication Monitor, first locate any error indicators (the big white X in a red ball), and then expand the console tree.

Select Publishers to view a list of all publications handled by this distributor; failing publications are marked with the error indicator. Select a failing publication by clicking on it once. The right pane displays all the agents involved with the particular publication, and the failing agent is highlighted with the error indicator. In Figure 9.18, for instance, you can see that there is a problem with the Snapshot Agent for the publication Northwind_Region in the Northwind database.

Figure 9.17: It's easy to spot trouble, as you can see in this visual display of an error reported in Enterprise Manager.

Right-click on the publication to access a context menu that conveniently has a selection at the top labeled Error Details. Click on that option to see what error caused the replication to fail.

You can also analyze failed replication jobs by looking at the SQL Server Agent's log (because the replication agents are job instances of SQL Server Agent). In SQL Server 7, the SQL Server Agent has its own log that is separate from the SQL Server error log. In Enterprise Manager, go to SQL Server Agent on the main console tree, right-click, and select Display Error Log… to analyze the current SQL Server Agent error log. The current log is located in the directory ..\Mssql7\Log\ and is named SQLAGENT.OUT; history files in the same directory are named SQLAGENT.<x>, where <x> is a number from 1 to 9, with 1 being the most recent and 9 the oldest. You can view these history files with any text editor.

To augment the information in the SQL Agent logs, right-click the SQL Server Agent and select Properties to display the SQL Server Agent properties dialog box. Then check the box labeled Include Execution Trace Messages in the Error Log section (see Figure 9.19).

Looking at all of the various logs is the best way to get a quick ballpark idea of what the problem is.

Figure 9.18: Replication Monitor takes you right to the problem area. Select Error Details from the context menu to see what error caused the replication to fail.

Figure 9.19: Use the Include Execution Trace Messages check box to include execution trace messages for SQL Server Agent.

Resolving Conflicts with Merge Replication

SQL Server 7 Books Online has some good advice for working with database tables that use IDENTITY property and need to be replicated by using merge replication. The advice is to use the NOT FOR REPLICATION attribute of the table's IDENTITY column and then use different IDENTITY ranges for publishers and subscribers if at all possible.

IDENTITY columns are table columns that are automatically populated with the next incremental value each time a record is inserted. There is no need to increment the identity, however, when the process doing the inserting turns out to be a Distribution or Merge Agent, which is what the NOT FOR REPLICATION attribute indicates. Check SQL Server 7 Books Online carefully for cautions about using IDENTITY columns with merge replication and for details about how to avoid conflicts.

When conflicts occur, they must be resolved by using either the default conflict resolver or one that you provide. See the section "Merge Replication Setup," earlier in this chapter, for more information on setting up a custom resolver.

Advanced Replication Topics

In this section, you'll examine features of SQL Server replication that aren't particularly advanced by themselves; rather, these features are advanced when compared to what's been available in past versions of SQL Server as far as exploiting the workings of replication are concerned. The goal of these features is to make the lives of database administrators easier by providing the facilities to solve more problems.

Configuring Replications for the Internet

Making a snapshot, transaction, or merge publication available to subscribers connecting over the Internet or a corporate intranet is as simple as marking a check box in a publication's Properties dialog box when you're creating the publication. Both push and pull subscriptions can be enabled to use the Internet in this way.

For a snapshot publication, the subscriber connects to an FTP server at the publisher to retrieve the synchronization sets that initialize the publication. After that, the Merge Agent or Distribution Agent connects over the Internet to deliver updates at replication time.

Scripting the Replication Topology

After you have configured replication, it is often convenient to save the configuration so that everything can be easily set up again if a disaster strikes. The best way to do this is to script the replication topology and save the scripts. To create scripts, follow these steps:

In Enterprise Manager, select a server.
Choose Tools, Replication and select Generate Replication Scripts.
The General tab of the dialog box that appears offers two check boxes: Distributor Properties and Publications in These Selected Databases…. To script this server's replication setup in its entirety, check both boxes (see Figure 9.20).

Figure 9.20: To script a replication configuration, select the publications to script and check the Distributor Properties check box.

Pay particular attention to the options under Script the Commands To, which are used to script either the commands to create the publication or the commands to drop the items being scripted. In most cases, scripting the commands that create things is the appropriate choice, although there might be a reason to script in order to destroy a publication or disable a server for replication. In fact, in previous releases of Microsoft SQL Server, disabling and dismantling replication was not an easy feat to achieve without referencing the Microsoft Support Knowledge Base article that described how to do it.

Replication ActiveX Controls

Microsoft SQL Server 7 includes two ActiveX controls: the SQL Distribution control and the SQL Merge control. These controls can be used from any environment that's able to use ActiveX controls—such as Visual Basic, Microsoft Office, and so on.

Because they're not visual controls, scripting environments, such as Active Server Pages and Windows Script Host, can use the Replication controls, too.

In general, there are three methods available in each control:

The Initialize method
The Start method
The Terminate method

Normally, only Initialize and Start are needed, so this makes a very clean interface for a complex capability.

To use the controls, set all the required parameters by setting properties of an instance of the control in your application, and then call Initialize. If no errors result, call Start, and the agent will start, connect, and perform replication.

The control raises one event, the Status event, which periodically fires to report replication completion status. To receive the event, your application must declare the instance of the Replication ActiveX control by using the WithEvents attribute.

Several properties can be set in order to connect to publishers and subscribers using specific network protocols and specific security. When using these controls, an application has total control over how replication instances are handled.

A sample Visual Basic application ships with SQL Server 7. That application is located in the directory\Mssql7\Devtools\Samples\ Sqlrepl\Replctrl\Vb.

An ideal use for the SQL Distribution control is to serve as a distributor in an application that needs to act like a pull subscriber. Usually, the subscription must be pushed when publishing from SQL Server 7 to heterogeneous data sources such as Microsoft Access, because a Microsoft Access database doesn't have the "smarts" to perform a pull operation. The SQL Distribution control can provide these "smarts," eliminating the administrative work needed to create push subscriptions to these heterogeneous desktop databases.

Replication to and from Heterogeneous Data Sources

Microsoft SQL Server 6.5 replication introduced the capability to publish to heterogeneous data sources such as Oracle, Microsoft Access, and other ODBC destinations. This was accomplished by using a push subscription and ODBC to push to these data sources. It was an interesting novelty that never gained much popularity. In SQL Server 7, this feature has been enhanced greatly, enabling you to do the following:

Publish to ODBC/OLE DB data sources
Use Data Transformation Services (DTS) packages as a source of replication
Use other data sources as the publishing source

The next sections examine each of these capabilities.

Publish to ODBC/OLE DB Data Sources

Publishing to heterogeneous data sources is a significant capability that greatly extends the range of subscribers in a multivendor shop having databases such as Microsoft Access, Oracle, and even DRDA sources (including AS/400 and DB2 Universal Database). As long as you use the ODBC drivers specifically provided by Microsoft for this support, replicating to these destinations is fully supported and encouraged in SQL Server 7.

Publishing to heterogeneous data sources is typically accomplished via a push subscription, unless the destination is an application that initiates a pull subscription using the SQL Distribution control.

The following steps outline the process for publishing to heterogeneous subscribers:

Create an ODBC data source by using the Control Panel at the publishing source. When creating the ODBC data source, make it a System DSN.
Create a new subscriber to push the subscription to. Choose Tools, Replication, and then select Configuring Publishing, Subscribers, and Distribution. Select the Subscribers tab and click on the New Subscriber… button. A set of selections appears, one of which is ODBC Data Source (see Figure 9.21). Select the data source you just added.
Create a publication by using the Create Publication Wizard and enable the publication to replicate to non-SQL Server subscribers (see Figure 9.22). This ensures that the initializing sync set used in the initial snapshot will be in a format that can be used by non-SQL Server subscribers (usually character-mode format).
Push the new publication to the ODBC subscriber. Select the published database, expand the Publications node, and select the publication to be pushed to the new ODBC subscriber you just created. Right-click the publication and select Push New Subscription to start up the Push Subscription Wizard.

Figure 9.21: From the Publisher and Distributor Properties dialog box, select Subscribers to add an ODBC subscriber.
The wizard will ask you to select a subscriber from a list of enabled subscribers. Select the new ODBC data source subscriber and set the schedule for initial synchronization and replication.

Use Data Transformation Services (DTS) Packages As a Source of Replication

You can easily import data to SQL Server 7 from a variety of sources by using the new Data Transformation Services (DTS). This feature is described in greater detail in Chapter 8.

Imports created by using DTS can be used as a source of data in a snapshot publication. As an application of this capability, imagine running a nightly refresh of a SQL Server 7 data warehouse that's fed from an OS/390 DB2 system. Having this level of integration included in the box and ready to set up without a lot of development and maintenance is quite a premium.

Figure 9.22: Enable non-SQL Server subscribers when you create the publication.

To use a DTS package to perform a snapshot-style refresh, perform the following steps:

Start the DTS Import Wizard by choosing Tools, Data Transformation Services and selecting Import into SQL.
Go through the steps in DTS to select the data source, the SQL Server destination database where the snapshot will be loaded, the transformations to be applied, and so on.

At the step in the wizard that's titled "Save, Schedule, and Replicate Package," check off the options by filling in the check boxes as outlined here:

Option	Response
Run Immediately	Optional.
Create Replication Publication	Check it.
Save Package on SQL Server	Check it.
Schedule DTS Package for Later Execution	Optional. If you check it, use the build button (…) to set the schedule.

Save the DTS package with a useful name and description.
Complete the wizard by clicking Finish. The package will then run if you checked Run Immediately.

Use Other Data Sources As the Publishing Source

SQL Server 7 offers something completely new that's bound to attract the attention of other vendors seeking to integrate into the Microsoft SQL Server 7 replication environment: the ability to create publications and use the distribution capabilities of SQL Server 7 from another database system. This new capability is made possible by two services from SQL Server 7:

SQL-DMO (Distributed Management Objects), which performs the administrative chores of setting up the publisher, publication, and subscribers.
new Replication Distributor Interface, which stores replications in a SQL Server 7 distribution database where they are picked up and distributed for replication.

The following steps outline the process for using other data sources:

Create a Visual Basic program to use SQL-DMO to perform the administrative work of setting up a publication.
Use the Replication Distribution Interface, a COM interface, to insert replicated transactions into the distribution database.

SQL Server Books Online has examples for both of these tasks; the Replication Distribution Interface example is in C++.

Both of these are nontrivial projects that really are not suitable for most database administrators to tackle, but hopefully some interesting products on the horizon will dovetail into the SQL Server replication environment.

Conclusions

Microsoft has stated that they want to take a leadership role in database replication, and this is certainly reflected in SQL Server 7, both now and in plans for future releases. The next release of SQL Server 7, for example, will be able to interface with the Microsoft Message Queue Manager so that replication can continue even when servers are not attached. Replicated transactions will simply be queued until they can be delivered.

With a solid replication technology in place both now and in the future, system administrators are released from the juggling act of performing manual data duplication and are free to spend time in other ways.

About the Author

John Lynn has 15 years of experience in data processing and system analysis, is certified as an MCSD (Microsoft Certified Solution Developer), and has written many magazine articles about Microsoft application development and BackOffice technology (such as Visual Basic and SQL Server).

We at Microsoft Corporation hope that the information in this work is valuable to you. Your use of the information contained in this work, however, is at your sole risk. All information in this work is provided "as -is", without any warranty, whether express or implied, of its accuracy, completeness, fitness for a particular purpose, title or non-infringement, and none of the third-party products or information mentioned in the work are authored, recommended, supported or guaranteed by Microsoft Corporation. Microsoft Corporation shall not be liable for any damages you may sustain by using this information, whether direct, indirect, special, incidental or consequential, even if it has been advised of the possibility of such damages. All prices for products mentioned in this document are subject to change without notice.

International rights = English only.

Click to order

Replication

On This Page

Top Features

Wizards Make It Easier

New Replication Types

Update Replication

Anonymous Subscribers

Publishing the Execution of Transact-SQL Stored Procedures

Scripting the Replication Topology

Exploiting New SQL Server Agents for Replication Tasks

Better Support for Heterogeneous Replication

Replication ActiveX Controls

Setup and Configuration

The Configure Publishing and Distribution Wizard

Setup Steps Common to All Types of Replication

Revealing the Distribution Database

Where's My Publication?

Replication Agents

Snapshot Agent

Log Reader Agent

Distribution Agent

Merge Agent

Snapshot Replication

Applications for Snapshot Replication

Schema Changes Go Undetected

Snapshot Replication: How It Works

Snapshot Replication Setup

The Directory Naming Convention

Transactional Replication

Applications for Transactional Replication

Transactional Replication: How It Works

Transactional Replication Setup

The Immediate Updating Subscribers Option of Snapshot and Transactional Replication

Merge Replication

Applications for Merge Replication

Merge Replication: How It Works

Unique Records in the Universe

Merge Replication Setup

Replication and the New SQL Server Security Roles

Subscribing

Push Subscriptions

Pull Subscriptions

Anonymous Subscriptions

Monitoring Replication

Monitoring Replication Agents

Replication Alerts

Troubleshooting

Resolving Conflicts with Merge Replication

Advanced Replication Topics

Configuring Replications for the Internet

Scripting the Replication Topology

Replication ActiveX Controls

Replication to and from Heterogeneous Data Sources

Conclusions

About the Author

Additional resources