Understanding MS DTC Resources in Windows Server 2008 Failover Clusters

Applies To: Windows Server 2008

For Windows Server 2003 and Windows Server 2008, failover clusters handle recovery of the Microsoft Distributed Transaction Coordinator (MS DTC) when a node in the cluster fails for any reason. Failover clusters use two mechanisms to perform this recovery: failover and failback.

How failover works

When the node that is designated as the owner of the MS DTC resource fails, the Cluster service automatically designates another node in the cluster as owner of the resource. Then, the MS DTC restarts automatically on that node as the transaction manager for the cluster. During this operation, the MS DTC resource and its dependencies are moved automatically to the new owner. This automatic process of moving the MS DTC resource and its dependencies to another node in the cluster is called failover.
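The failover sequence described above can be sketched as a toy model. This is not the Cluster service API. The class name, the resource group contents, and the rule for picking the new owner (first surviving node in name order) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyCluster:
    """Hypothetical model of a cluster that hosts one MS DTC resource group."""
    nodes: dict   # node name -> True if the node is up
    owner: str    # node that currently owns the MS DTC resource
    # Illustrative group: the MS DTC resource plus assumed dependencies.
    group: tuple = ("MS DTC", "network name", "shared disk")

    def fail_node(self, name):
        """Mark a node as down; fail over if it owned the MS DTC resource."""
        self.nodes[name] = False
        if name == self.owner:
            self.failover()

    def failover(self):
        """Move the whole group to a surviving node (placeholder policy)."""
        survivors = sorted(n for n, up in self.nodes.items() if up)
        if not survivors:
            raise RuntimeError("no surviving node can own the MS DTC resource")
        # The resource and its dependencies move together, so the model
        # tracks a single owner for the entire group.
        self.owner = survivors[0]
```

In this sketch, failing a node that does not own the resource changes nothing, which mirrors the text: failover is triggered only by failure of the owner.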

The newly restarted MS DTC reads the MS DTC log file on the shared disk to determine the outcome of pending transactions and recently completed transactions. Resource managers reconnect to the MS DTC, and then they perform recovery to determine the outcome of in-doubt transactions. Applications reconnect to the MS DTC so that they can initiate new transactions.
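As a rough illustration of the recovery step, the sketch below replays a simplified log to decide transaction outcomes. The record layout (transaction ID plus event name) is invented for this example and is not the MS DTC log format. The rule that a transaction without a logged decision is treated as aborted is an assumption (a presumed-abort convention).

```python
def recover_outcomes(log_records):
    """Replay (txid, event) records and return {txid: 'commit' | 'abort'}.

    Hypothetical events: 'prepared', 'commit', 'abort'.
    """
    outcomes = {}
    for txid, event in log_records:
        if event in ("commit", "abort"):
            # A logged decision is final for that transaction.
            outcomes[txid] = event
        else:
            # No decision seen yet: assume abort unless one is logged later.
            outcomes.setdefault(txid, "abort")
    return outcomes


def resolve_in_doubt(outcomes, in_doubt_txids):
    """Answer a reconnecting resource manager's in-doubt transactions."""
    # Transactions unknown to the log are also treated as aborted (assumption).
    return {txid: outcomes.get(txid, "abort") for txid in in_doubt_txids}
```

A resource manager that reconnects after failover would, in this model, pass its list of in-doubt transaction IDs to resolve_in_doubt and apply the returned outcomes.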

For example, in the following illustration, the DTC that is acting as the transaction manager is active on System B. The application and resource manager on System A call the DTC proxy. The DTC proxy on System A forwards all MS DTC calls to the DTC on System B.

If System B fails, the DTC on System A takes over. It reads the DTC log file on the shared disk, performs recovery, and then serves as the transaction manager for the entire cluster.
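The forwarding behavior in this example can be modeled with two small classes. The names TransactionManager and DtcProxy, and the locate_active callback, are hypothetical and do not correspond to MS DTC interfaces. The sketch only shows that callers talk to a local proxy while the work is done by whichever node currently hosts the transaction manager.

```python
class TransactionManager:
    """Hypothetical stand-in for the DTC running on one node."""

    def __init__(self, node):
        self.node = node
        self.handled = []

    def begin_transaction(self, txid):
        self.handled.append(txid)
        return f"{txid} coordinated on {self.node}"


class DtcProxy:
    """Hypothetical proxy: forwards every call to the active manager."""

    def __init__(self, local_node, locate_active):
        self.local_node = local_node
        # locate_active is an assumed callback that returns the manager
        # on the node that currently owns the MS DTC resource.
        self._locate_active = locate_active

    def begin_transaction(self, txid):
        return self._locate_active().begin_transaction(txid)
```

Because the proxy looks up the active manager on every call, an application on System A needs no code change when ownership moves from System B to System A.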

To choose an alternate node for failover, Windows Server 2003 and Windows Server 2008 follow a failover policy that is defined by default. Under the default policy, the failover node is selected randomly. If you want more control over which node is chosen for failover, you can use Cluster Administrator to alter the policy.
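The difference between the default policy and an altered one can be sketched as a single selection function. The function name and the preferred_order parameter are assumptions for illustration. They model one possible way an administrator might constrain the choice and are not settings taken from Cluster Administrator.

```python
import random


def choose_failover_node(surviving_nodes, preferred_order=None, rng=random):
    """Pick the node that takes over the MS DTC resource.

    surviving_nodes: collection of node names that are still up.
    preferred_order: optional, hypothetical list of nodes in priority order.
    rng: object with a choice() method, injectable for repeatable tests.
    """
    if not surviving_nodes:
        raise ValueError("no surviving node is available for failover")
    if preferred_order:
        for node in preferred_order:
            if node in surviving_nodes:
                return node  # first preferred node that is still up
    # Default policy as described in the text: random selection.
    return rng.choice(sorted(surviving_nodes))
```

Passing a seeded random.Random instance as rng makes the default branch repeatable, which is useful when testing code built on this sketch.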

How failback works

When a failed MS DTC owner in the cluster restarts, you might want the MS DTC resource and its dependencies to be returned to the preferred node, which is usually the node on which they resided before the failover. Failback is the process of moving an MS DTC resource and its dependencies to the preferred node.

By default, the MS DTC resource and its dependencies remain on the node to which they were moved during failover. However, you can use Cluster Administrator to configure Windows Server 2003 and Windows Server 2008 to perform failback automatically.
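The failback decision reduces to a small rule, sketched below. The function and parameter names are hypothetical. The auto_failback flag stands in for the configuration made in Cluster Administrator and defaults to off, matching the default behavior described above.

```python
def owner_after_restart(restarted_node, current_owner, preferred_node,
                        auto_failback=False):
    """Return the node that should own the MS DTC resource group
    after restarted_node comes back online."""
    if (auto_failback
            and restarted_node == preferred_node
            and current_owner != preferred_node):
        # Failback: the group returns to the preferred node.
        return preferred_node
    # Default: the group stays where failover placed it.
    return current_owner
```

With auto_failback left at its default, the restart of the preferred node leaves ownership unchanged. With it enabled, ownership returns to the preferred node only when that node is the one that restarted.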