Writing MS Cluster Server (MSCS) Resource Dynamic-Link Libraries (DLLs)

02/20/2014

Archived content. No warranty is made as to technical accuracy. Content may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist.

A Windows NT 4.0 White Paper

Abstract

Microsoft® Cluster Server (MSCS) allows multiple Microsoft Windows NT® operating system-based servers to be connected together, making them appear to network clients as a single, highly available system. This paper provides a high-level overview of the processes involved in writing well-behaved cluster applications for MSCS. This document also describes how application and service developers can take full advantage of MSCS by writing resource dynamic-link libraries (DLLs), debugging their applications, and installing their applications and services in a cluster environment.

Introduction

Microsoft® Cluster Server (MSCS) allows multiple Microsoft Windows NT® operating system-based servers to be connected together, making them appear to network clients as a single, highly available system. From the system administrator's viewpoint, MSCS provides the additional advantage of easy administration and scalability, and the MSCS architecture provides a standard infrastructure for scalable, cluster-aware applications in future versions.

The purpose of this document is to provide a high-level overview of the processes involved in writing well-behaved applications that can take advantage of the Microsoft Cluster Server capabilities. This document describes how you can take full advantage of MSCS by writing resource dynamic-link libraries (DLLs), debugging your applications and services, and then installing them in a cluster environment.

Note: This White Paper assumes that you have successfully installed the Microsoft Cluster Server software in a cluster environment and also have the Microsoft Platform Software Development Kit (SDK) and the build environment working. If you are having trouble setting up the development environment, please refer to your SDK documentation.

Clustering and High Availability

Microsoft Cluster Server allows applications and services to run more efficiently on Windows NT Server by directing client requests based on resource availability and server load. (In the first release of MSCS, load balancing is done manually; future releases will provide automatic load balancing.) If one of the systems—or nodes—in the cluster is unavailable or has failed due to hardware or software problems, its workload is handled by other systems in the cluster until the failed systems are brought back online.

Note that Microsoft Cluster Server is designed to provide high availability, rather than true fault tolerance. The phrase "fault tolerant" is generally used to describe technology that offers a higher level of resilience and recovery. Fault-tolerant servers typically use a high degree of hardware or data redundancy, combined with specialized software, to provide near-instantaneous recovery from any single hardware or software fault. These solutions cost significantly more than a clustering solution because you must pay for redundant hardware that waits idly for a fault from which to recover. Microsoft Cluster Server provides a very good high-availability solution using standard, inexpensive hardware, while maximizing computing resources.

The Shared-Nothing Model

Microsoft Cluster Server version 1.0 is a two-node cluster that is based on the shared-nothing clustering model. The shared-nothing model dictates that while several nodes in the cluster may have access to a device or resource, the resource is owned and managed by only one system at a time. (In an MSCS cluster, a resource is defined as any physical or logical component that can be brought online and taken offline, managed in a cluster, hosted by only one node at a time, and moved between nodes.)

Each node has its own memory, system disk, operating system, and subset of the cluster's resources. If a node fails, the other node takes ownership of the failed node's resources (this process is known as failover). Microsoft Cluster Server then registers the network address for the resource on the new node so that client traffic is routed to the system that is available and now owns the resource. When the failed node is later brought back online, MSCS can be configured to redistribute resources and client requests appropriately (this process is known as failback).
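
The ownership rules behind failover and failback can be sketched as a small state model. This is an illustration only, not MSCS code: ModelResource, ModelCluster, and both functions are hypothetical names, and the model assumes failback has been configured to return each resource to a preferred node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NODE_COUNT 2  /* MSCS 1.0 is a two-node cluster */

/* Hypothetical model types; none of these names come from the MSCS SDK. */
typedef struct {
    const char *name;
    int preferred_node;  /* node that should own the resource when both are up */
    int owner_node;      /* the single node that owns the resource right now */
} ModelResource;

typedef struct {
    bool node_up[NODE_COUNT];
    ModelResource *resources;
    size_t resource_count;
} ModelCluster;

/* Failover: the surviving node takes ownership of the failed node's resources. */
void model_node_failed(ModelCluster *c, int failed)
{
    assert(failed >= 0 && failed < NODE_COUNT);
    int survivor = 1 - failed;
    c->node_up[failed] = false;
    if (!c->node_up[survivor])
        return;  /* no node is left to host anything */
    for (size_t i = 0; i < c->resource_count; i++)
        if (c->resources[i].owner_node == failed)
            c->resources[i].owner_node = survivor;
}

/* Failback: when a node returns, resources that prefer it move back to it. */
void model_node_restored(ModelCluster *c, int restored)
{
    assert(restored >= 0 && restored < NODE_COUNT);
    c->node_up[restored] = true;
    for (size_t i = 0; i < c->resource_count; i++)
        if (c->resources[i].preferred_node == restored)
            c->resources[i].owner_node = restored;
}
```

At every step each resource has exactly one owner, which is the shared-nothing property; the model says nothing about how clients reconnect after the move.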

Note: When a node fails, any clients are disconnected. For the failover to be truly transparent, client applications must be written to reconnect in the event of node failure.

A generic MSCS cluster setup is shown in Figure 1, below.

Figure 1. Standard two-node MSCS configuration

The following section provides an introduction to MSCS architecture.

Microsoft Cluster Server Architecture

Microsoft Cluster Server comprises three key components:

The Cluster Service
The Resource Monitor
Resource and Cluster Administrator extension DLLs

The Cluster Service

The Cluster Service (which is composed of the Event Processor, the Failover Manager/Resource Manager, the Global Update Manager, and so forth) is the core component of MSCS and runs as a high-priority system service. The Cluster Service controls cluster activities and performs such tasks as coordinating event notification, facilitating communication between cluster components, handling failover operations, and managing the configuration. Each cluster node runs its own Cluster Service.

The Resource Monitor

The Resource Monitor is an interface between the Cluster Service and the cluster resources, and runs as an independent process. The Cluster Service uses the Resource Monitor to communicate with the resource DLLs. The DLL handles all communication with the resource, thus shielding the Cluster Service from resources that misbehave or stop functioning. Multiple copies of the Resource Monitor can be running on a single node, thereby providing a means by which unpredictable resources can be isolated from other resources.

The Resource DLL

The third key Microsoft Cluster Server component is the resource DLL. The Resource Monitor and resource DLL communicate using the Resource API, which is a collection of entry points, callback functions, and related structures and macros used to manage resources. Applications that implement their own resource DLLs to communicate with the Cluster Service and that use the Cluster API to request and update cluster information are defined as cluster-aware applications. Applications and services that do not use the Cluster or Resource APIs and cluster control code functions are unaware of clustering and have no knowledge that MSCS is running. These cluster-unaware applications are generally managed as generic applications or services.

Both cluster-aware and cluster-unaware applications run on a cluster node and can be managed as cluster resources. However, only cluster-aware applications can take advantage of features offered by Cluster Server through the Cluster API. For example, cluster-aware applications can:

Report status upon request to the Resource Monitor.
Respond to requests to be brought online or to be taken offline gracefully.
Respond more accurately to IsAlive and LooksAlive requests.

MSCS includes two tools that perform basic cluster management: Cluster Administrator (CluAdmin.exe) and a command-line management tool (Cluster.exe). You are encouraged to write your own custom management tools if needed; however, any lengthy discussion of managing cluster-unaware applications or developing cluster management tools is beyond the scope of this paper.

Figure 2 shows how the Cluster Service, Resource Monitor, and resource DLLs interact with each other on a single node running the Windows NT Server, Enterprise Edition operating system, cluster management applications, and both cluster-aware and cluster-unaware applications.

Figure 2. MSCS components on a single node running Windows NT Server

Note that cluster-aware applications should also implement Cluster Administrator extension DLLs, which contain implementations of interfaces from the Cluster Administrator extension API. A Cluster Administrator extension DLL allows an application to be configured into the Cluster Administrator tool (CluAdmin.exe). Implementing custom resource and Cluster Administrator extension DLLs allows for specialized management of the application and its related resources, and enables the system administrator to install and configure the application more easily.

Next, this paper describes resources and resource DLLs in greater detail, and describes the reasons for writing a custom resource DLL.

Resources and Resource DLLs

To the Cluster Service, a resource is any physical or logical component that can be managed. Examples of resources are disks, network names, IP addresses, databases, Web sites, application programs, and any other entity that can be brought online and taken offline. Resources are organized by type. Resource types include physical hardware (such as disk drives) and logical items (such as IP addresses, file shares, and generic applications).

Every resource uses a resource DLL, a largely passive translation layer between the Resource Monitor and the resource. The Resource Monitor calls the entry point functions of the resource DLL to check the status of the resource and to bring the resource online and offline. The resource DLL is responsible for communicating with its resource through any convenient IPC mechanism to implement these methods.

Note that applications or services that do not provide their own resource DLLs can still be configured into the cluster environment—MSCS includes a generic resource DLL for just this purpose, and the Cluster Service treats these applications or services as generic, cluster-unaware applications or services. However, if an application or service needs to take full advantage of a clustered environment, it must implement a custom resource DLL that can interact with the Cluster Service and take full advantage of the rich features provided by Microsoft Cluster Server.

Why Write a Custom DLL?

A frequently asked question is why an application or service developer should write a resource DLL when Microsoft Cluster Server already ships with a resource DLL for generic applications and services. The primary reason is that the resource DLLs included with MSCS are very basic, "vanilla" DLLs that provide nothing more than rudimentary failover/failback capability. You can use these DLLs to test your applications or services to determine if they benefit from being in a clustered environment. However, if the applications or services do benefit, Microsoft recommends that you implement custom resource DLLs so that your applications and services can take full advantage of the clustering software.

For example, if an application has opened a number of files and the Cluster Service decides to move the application to another node for any number of reasons, the application has no way of cleaning up the first node before it is moved to the second node. An application-specific resource DLL can provide the functionality to shut down cleanly. A service would benefit from better control of resource polling (see the explanation of the LooksAlive and IsAlive routines later in this document). Also, some resources require that a specific set of parameters or properties (common and private) be passed or entered to bring the resource online, take it offline, and so forth. Such parameters and properties are added by implementing a Cluster Administrator extension that is tightly integrated with the resource type (which, in turn, is integrated with the resource DLL). Applications and services that do not implement resource DLLs must limit their parameters and properties to whatever the generic extensions provide.

Finally, the most important reason to provide an application-specific resource DLL is to support Active/Active failover and failback capability. This allows two separate instances of a resource type to be running on different nodes, each working with different data sets residing on a disk on the shared SCSI bus. (This means that although the data isn't shared, the resource type is active on both nodes. By definition, there can only be one instance of a resource; however, there can be multiple instances of a resource type.) If the instance of the resource type fails on one node, that instance is moved or "failed over" to the next available node. For example, say your resource type is a database manager application. You can run a copy of the database manager resource type on each node. You can then define a particular database (db1) as a resource. With Active/Active capability, you can move db1 from node 1 to node 2 by telling the database manager on each node to release and acquire the database, as appropriate. This communication could not happen with generic application or service resource types.

The Quorum Resource

In addition to the applications, services, and other resources described above, Microsoft Cluster Server allows for a special kind of resource, known as a quorum resource, which plays a crucial role in the operation of the cluster. A quorum resource must offer persistent arbitration mechanisms (that is, it must allow a single node to gain control of it and then must defend that node's control), and it must provide physical storage that can be accessed by any node in the cluster (although only one node can access this physical storage at any given time). The quorum resource maintains access to the most current version of the cluster database, and if a failure occurs, stores changes to the cluster database. Microsoft Cluster Server includes a standard quorum resource DLL for cluster-specific resources. While it is possible to implement a custom quorum resource, procedures and guidelines for writing a custom quorum resource DLL are beyond the scope of this paper.

Next, this paper describes the programming tools you will need to create a cluster-aware application, and then provides guidelines for creating a new resource type and for writing a custom resource DLL.

Creating a Cluster-Aware Application

In Microsoft Cluster Server version 1.0, a cluster-aware application is one that uses the Cluster APIs, cluster control code functions, and Resource API (implemented in a resource DLL) to communicate with the MSCS cluster software and to take advantage of clustering capabilities.

The Cluster APIs allow cluster-aware applications, legacy applications, and services to interact with the cluster software.
The cluster control codes are 32-bit values that describe an operation performed on a network, network interface, resource, resource type, group, or node.
The Resource API defines functions, structures, and macros that allow the Cluster Service to communicate with a resource.

Next, these components are described in more detail.

The Cluster APIs

The Cluster APIs allow an application to retrieve information about cluster objects, initiate operations, and update cluster database information. There are seven sets of Cluster APIs:

Cluster Management, which provides access to event notification, cluster objects, and overall cluster state information.
Cluster Database Management, which allows a cluster-aware application or resource DLL to access and update the cluster database. (The cluster database is implemented as a part of the Windows NT registry, and is resident on each cluster node. It contains information about all physical and logical elements in a cluster.) Only the Cluster Service, resource DLLs, and applications controlled by resource DLLs should change the cluster database through these APIs. Properties of a cluster object should be managed using control functions instead.
Group Management, which provides access to each of the groups in a cluster and allows callers to change a group's membership or state and retrieve information. These functions are implemented primarily by the Cluster Service's Resource and Failover Managers.
Network Interface Management, which allows callers to open and close a network interface, perform selected operations, and retrieve information.
Network Management, which provides access to information about networks that are monitored by the Cluster Service.
Node Management, which allows callers to change a node's state, perform operations, and retrieve information. These functions are implemented primarily by the Membership Manager.
Resource Management, which allows callers to perform a variety of operations on one or more resources, including retrieval of dependency information, creation and deletion of resources, and initiation of operations defined by resource control codes.

Cluster APIs are also used by cluster management tools, such as Cluster Administrator (CluAdmin.exe) that ships with MSCS, and by Resource Monitors and resource DLLs.

Figure 3, below, shows how the Cluster APIs are used in a cluster environment.

Figure 3. Cluster API access in a cluster environment

Cluster Control Codes

Cluster control codes are 32-bit values used to describe an operation performed on a cluster object, such as a resource, resource type, group, node, or network or network interface. Cluster control codes are categorized as either internal or external. Internal control codes are used by the Cluster Service only, and applications and resource DLLs cannot use them. Internal control codes are typically sent by the Cluster Service to notify a resource or resource type of an event.

External codes represent operations that can be performed by applications. A small subset of these operations is used to manage the cluster properties. Cluster properties are attributes that describe a cluster object such as a resource, resource type, group, or node. Two types of properties exist: common and private. Common properties are static and apply to all objects of a particular type, such as a resource type. For example, the RestartAction property is common for all resources. Common properties are stored in the cluster database. Private properties can be either static or dynamic data, and are used to describe a particular type of resource. These values are stored in the cluster database or in an alternate location. Cluster properties are further divided into read-only and read-write properties.

For more information about cluster control codes, cluster objects, and cluster properties, refer to your Platform SDK documentation.

The Resource API

The Resource API defines functions, structures, and macros that allow the Cluster Service to communicate with resources. The communication is indirect; the Cluster Service initiates requests with a Resource Monitor, and the Resource Monitor passes them on to the resource. Status and event information is passed back from the resource DLL to the Cluster Service.

Figure 4 shows how control flows from the Cluster Service through the Resource Monitor and a resource DLL to the resources. The diagram shows four resource DLLs, three for Cluster Service resource types and one for a resource type defined by a third-party developer.

Figure 4. Resource control flows


The Resource API consists of:

Entry point functions, which allow a Resource Monitor to manage its resources.
Callback functions, which allow a resource DLL to report status and to log events and allow the Cluster Service to request that a resource perform specific tasks.
Structures and macros, which are used to describe the function table that is returned by the Startup entry point function and describe the status of a resource.

Because a cluster-aware application is supported by writing a custom resource, you must define its resource type. And, as stated previously in this document, if your application is to be managed as a custom resource, you must provide two DLLs—a resource DLL and a Cluster Administrator extension DLL. (Note that the Cluster Administrator extension DLL is only required if the resource requires properties. An example of a resource type that doesn't require properties is the Time Service resource type.)

Next, you'll begin writing your resource type DLL.

Writing Resource DLLs

Getting Started

You can use either the Microsoft Visual C++® development system or other C/C++ development tools to write custom resource DLLs. For the examples used in this White Paper, we used:

Microsoft Visual C++ version 4.2b, which includes the Unicode MFC libraries.
The MIDL compiler, version 3.00.44. (The MIDL compiler is available with the Microsoft Platform SDK.)
The Active Template Library (ATL) version 2.0 (required for the extension sample code and the code that is generated by the Resource Type App Wizard).

When setting up your build environment, please refer to the Platform SDK, particularly the "Preparing a Build Environment" and "Developer Notes" sections.

You will find a reference implementation of a complete resource DLL in the samples\WinBase\Cluster\SMBsmp directory (in the Platform SDK root directory).

Creating a New Resource Type

To create a new resource type, you must write a resource DLL and a Cluster Administrator extension DLL. The easiest way to build a resource DLL is to run the Resource Type App Wizard, which builds a skeletal resource DLL, a skeletal Cluster Administrator extension DLL, or both, with all the entry points defined, declared, and exported.

For complete instructions when creating the resource DLL, see the SDK sections "Creating a Custom Resource Type," "Using the Resource Type App Wizard," and "Customizing a Resource DLL."

This skeletal resource DLL will provide only the most basic failover and failback capability. You will need to customize this code to allow your resource to take full advantage of the cluster environment and to allow the DLL to provide information specific to your resource. Note that the skeletal DLL code generated by the App Wizard is marked with TODO: and ADDPARAM: comments to indicate where you need to add resource-specific information. You will use the Resource API for much of this customization, as described next.

Customizing Your Resource DLL

As described previously, the Resource API consists of several entry point functions that are implemented in a resource DLL. The Resource Monitor uses these entry points to manage resources supported by the DLL. In addition, the Resource Monitor implements a few callback functions that the resource DLL uses to report status to the Cluster Service or to log events for the system administrator.

Most of the entry point functions are required for all resources. Two specific API entry point functions—Arbitrate and Release—are required if, and only if, you are writing a resource DLL for a quorum resource. These two functions are not discussed in this document. The remaining entry point functions are listed below and are discussed in detail in this paper.

Startup
Open
Online
LooksAlive
IsAlive
Offline
Close
Terminate
ResourceControl
ResourceTypeControl

Each resource DLL supported by the cluster software should adhere to the following guidelines:

With one exception, a resource DLL is nonreentrant for a given instance of a resource. The exception is the Terminate entry point function. Terminate can be called at any time, even when other threads within the resource DLL are blocked waiting for an Online or Offline call to complete.
The resource DLL is reentrant with respect to other resource IDs. If a resource DLL handles more than one resource ID, it must synchronize between them for any global data shared within the resource DLL.
In the resource DLL, an entry point function should take no more than 300 milliseconds to complete. If an entry point (specifically Online, Offline, LooksAlive, or IsAlive) will exceed this limit, your DLL should spawn separate threads wherever possible to handle the lengthier operations. (Note that the App Wizard currently generates an Online thread, and future versions will generate an Offline thread as well.)

During initialization of the resource DLL, the DLL entry point is called with the DLL_PROCESS_ATTACH flag. The Resource Monitor then begins to invoke the Resource API entry point functions.

The Startup Routine

Once the resource DLL is loaded, the Resource Monitor invokes the Startup routine. Note that only the Startup entry point function is exported. All other entry points that are implemented in the resource DLL are accessed through the function table that Startup returns.

The Startup routine is defined as:

DWORD WINAPI Startup(
    LPCWSTR ResourceType,
    DWORD MinVersionSupported,
    DWORD MaxVersionSupported,
    PSET_RESOURCE_STATUS_ROUTINE SetResourceStatus,
    PLOG_EVENT_ROUTINE LogEvent,
    PCLRES_FUNCTION_TABLE *FunctionTable
);

The ResourceType parameter identifies the type of resource to be started.

The parameters SetResourceStatus and LogEvent are pointers to callback functions implemented by the Resource Monitor. (These callback functions are described more fully later in this White Paper.) If an Online or Offline request will take longer than 300 milliseconds to complete, your DLL should call SetResourceStatus to inform the cluster of the state of the resource; your DLL should also use the LogEvent routine to report events and errors. (SetResourceStatus should be called only by Online or Offline, and then only if Online or Offline returns ERROR_IO_PENDING. For more information, see the discussions of the Online and Offline entry point functions.)

The FunctionTable structure contains the function addresses of the rest of the entry points in the resource DLL.

Note that the Startup entry point function is the only place for resource DLLs to save the addresses of the callbacks LogEvent and SetResourceStatus.

Startup returns the following values:

If the request was successful, Startup returns ERROR_SUCCESS.
If the resource does not support a version that falls in the range identified by the MinVersionSupported and MaxVersionSupported parameters, Startup returns ERROR_REVISION_MISMATCH.
If the operation is unsuccessful, Startup should return a Microsoft Win32® application programming interface error value.

For optimal operation of the resource, make sure that your implementation of Startup finishes in less than 300 milliseconds.
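
The Startup contract described above (check the version range, save the two callbacks, hand back a function table) can be sketched as follows. This is a minimal sketch under stated assumptions, not the code the App Wizard generates. So that it builds without the Platform SDK, it declares its own stand-in types: the SKETCH_ names, the simplified callback signatures, the function-table layout, the version number, and the resource type name are all hypothetical, and the real definitions live in the SDK headers. Returning ERROR_INVALID_PARAMETER for an unrecognized resource type is likewise just one possible Win32 error choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Stand-ins so the sketch builds without the Platform SDK headers. */
typedef uint32_t DWORD;
typedef const wchar_t *LPCWSTR;
typedef void *RESOURCE_HANDLE;
typedef void *RESID;
#define ERROR_SUCCESS            0u
#define ERROR_INVALID_PARAMETER  87u
#define ERROR_REVISION_MISMATCH  1306u

/* Simplified, hypothetical callback and function-table shapes. */
typedef DWORD (*SKETCH_SET_STATUS_ROUTINE)(RESOURCE_HANDLE handle, void *status);
typedef void  (*SKETCH_LOG_EVENT_ROUTINE)(RESOURCE_HANDLE handle, LPCWSTR text);

typedef struct {
    DWORD Version;
    DWORD (*Online)(RESID id, void **eventHandle);
    DWORD (*Offline)(RESID id);
    int   (*LooksAlive)(RESID id);
    int   (*IsAlive)(RESID id);
} SKETCH_FUNCTION_TABLE;

#define SKETCH_API_VERSION 0x100u             /* hypothetical version number */
#define SKETCH_TYPE_NAME   L"Sketch Resource" /* hypothetical resource type */

/* Placeholder entry points; a real DLL does its work in these. */
static DWORD SketchOnline(RESID id, void **ev) { (void)id; (void)ev; return ERROR_SUCCESS; }
static DWORD SketchOffline(RESID id) { (void)id; return ERROR_SUCCESS; }
static int SketchLooksAlive(RESID id) { (void)id; return 1; }
static int SketchIsAlive(RESID id) { (void)id; return 1; }

static SKETCH_FUNCTION_TABLE g_table = {
    SKETCH_API_VERSION, SketchOnline, SketchOffline, SketchLooksAlive, SketchIsAlive
};

/* Startup is the only chance to capture these callback addresses. */
SKETCH_SET_STATUS_ROUTINE g_SetResourceStatus;
SKETCH_LOG_EVENT_ROUTINE  g_LogEvent;

DWORD Startup(LPCWSTR ResourceType,
              DWORD MinVersionSupported,
              DWORD MaxVersionSupported,
              SKETCH_SET_STATUS_ROUTINE SetResourceStatus,
              SKETCH_LOG_EVENT_ROUTINE LogEvent,
              SKETCH_FUNCTION_TABLE **FunctionTable)
{
    if (ResourceType == NULL || FunctionTable == NULL ||
        wcscmp(ResourceType, SKETCH_TYPE_NAME) != 0)
        return ERROR_INVALID_PARAMETER;  /* some Win32 error value */

    /* Refuse to load if the caller cannot use the version this DLL implements. */
    if (SKETCH_API_VERSION < MinVersionSupported ||
        SKETCH_API_VERSION > MaxVersionSupported)
        return ERROR_REVISION_MISMATCH;

    g_SetResourceStatus = SetResourceStatus;
    g_LogEvent = LogEvent;
    *FunctionTable = &g_table;  /* every other entry point is reached through this */
    return ERROR_SUCCESS;
}
```

Startup does no slow work here: it only validates, stores pointers, and returns, which keeps it well inside the 300-millisecond guideline.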

The Open Routine

Once Startup returns successfully, the Resource Monitor typically calls the Open entry point function for each resource managed by the resource DLL.

The Open routine is defined as:

RESID WINAPI Open(
    LPCWSTR ResourceName,
    HKEY ResourceKey,
    RESOURCE_HANDLE ResourceHandle
);

The ResourceName parameter identifies the specific resource to be opened (a resource DLL can support more than one resource of a given type). The ResourceKey parameter refers to the resource-specific information, private properties, and so forth, in the cluster database. This key is closed upon return from the Open entry point; therefore, if a key is required in other entry points for the resource, your DLL should call either the ClusterRegOpenKey API or the ClusterRegCreateKey API. The ResourceHandle parameter is used in calls to the SetResourceStatus and LogEvent callback functions.

Open uses the Cluster API to open the cluster database and retrieve the resource parameters and private properties. One of the most important things that your resource DLL should check for in the Open entry point is that the resource is currently offline (a resource cannot be simultaneously online on more than one node). If the resource is currently online, your DLL should attempt to take it offline. (Note that in this case, online and offline refer to the status of the application or service, not the status of a given cluster node—the resource must be truly offline and not currently owned by a cluster node.) In addition, your resource DLL should create any resource-specific data structures during the Open routine.

(Note that the Cluster APIs are not available in the Open routine if the resource is the quorum resource.)

Open returns the following values:

If the operation was successful, Open returns a resource identifier (RESID).
If the operation was unsuccessful, Open returns NULL. The SetLastError API should be called to specify the error that occurred.

If Open returns an error (by returning NULL), the resource for which it was called becomes unmanageable. Therefore, Open should return an error only in very rare cases (for example, if it cannot allocate the memory it needs to represent the resource).

For optimal operation of the resource, make sure that your implementation of Open finishes in less than 300 milliseconds.
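
The return convention for Open (a RESID on success, NULL plus a last-error value on failure) can be sketched as follows. This is a minimal sketch under stated assumptions: the typedefs and the SetLastError/GetLastError pair are local stand-ins so the code builds without the Win32 headers, SKETCH_RESOURCE is a hypothetical per-resource structure, and the check that the resource is really offline is left as a comment because it is resource specific.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

/* Stand-ins so the sketch builds without windows.h and the cluster headers. */
typedef uint32_t DWORD;
typedef const wchar_t *LPCWSTR;
typedef void *HKEY;
typedef void *RESOURCE_HANDLE;
typedef void *RESID;
#define ERROR_NOT_ENOUGH_MEMORY 8u
#define ERROR_INVALID_PARAMETER 87u

/* Local stand-ins for the Win32 last-error functions. */
static DWORD g_lastError;
void SetLastError(DWORD code) { g_lastError = code; }
DWORD GetLastError(void) { return g_lastError; }

/* Hypothetical per-resource data created by Open and released by Close. */
typedef struct {
    wchar_t *Name;
    RESOURCE_HANDLE ResourceHandle;  /* needed later for the callbacks */
    int Online;
} SKETCH_RESOURCE;

RESID Open(LPCWSTR ResourceName, HKEY ResourceKey, RESOURCE_HANDLE ResourceHandle)
{
    /* ResourceKey is closed when Open returns, so it is not stored. A real DLL
       would read its private properties from it here. */
    (void)ResourceKey;

    if (ResourceName == NULL) {
        SetLastError(ERROR_INVALID_PARAMETER);
        return NULL;
    }

    SKETCH_RESOURCE *res = calloc(1, sizeof *res);
    size_t len = wcslen(ResourceName) + 1;
    wchar_t *name = malloc(len * sizeof *name);
    if (res == NULL || name == NULL) {
        free(res);
        free(name);
        SetLastError(ERROR_NOT_ENOUGH_MEMORY);  /* one of the rare failure cases */
        return NULL;
    }
    wmemcpy(name, ResourceName, len);

    res->Name = name;
    res->ResourceHandle = ResourceHandle;
    res->Online = 0;  /* a real DLL would verify the resource is offline here */
    return (RESID)res;
}

void Close(RESID ResourceId)
{
    SKETCH_RESOURCE *res = ResourceId;
    if (res != NULL) {
        free(res->Name);
        free(res);
    }
}
```

Because the pointer to the per-resource structure doubles as the RESID, the later entry points can recover their state with a single cast.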

The Online Routine

Once the resource is opened, the Resource Monitor calls the Online entry point function to bring the resource online.

The Online routine is defined as:

DWORD WINAPI Online(
    RESID ResourceId,
    PHANDLE EventHandle
);

The ResourceId parameter is passed to the entry point and uniquely identifies the resource. (This is the same ResourceId that was returned from the Open entry point function.) The EventHandle is a parameter that the resource DLL can pass back to the Resource Monitor so that it can asynchronously notify the Resource Monitor of its status. If the EventHandle parameter is not set to a valid handle that can be signaled, then the Resource Monitor will call the resource DLL's LooksAlive entry point periodically to determine the status of the resource. If you do not want your resource to be interrupted in this manner, your DLL should return a valid handle in the EventHandle parameter. After returning a valid EventHandle, the resource DLL can notify the Resource Monitor of any status change.

Every resource type must have its own implementation of the Online entry point function. These different implementations are necessary because of the varying needs of different types of resources. For example, bringing a disk online is completely different from bringing a generic application online. Bringing a disk online involves mounting the disk, verifying disk signatures, and so forth, whereas bringing an application online could be as simple as calling CreateProcess.

Online returns the following values:

If the operation was successful and the resource is now online, Online returns ERROR_SUCCESS.
If the resource was arbitrated with some other systems, and one of the other systems won the arbitration, Online returns ERROR_RESOURCE_NOT_AVAILABLE.
If the request is pending and a thread has been activated to process the request, Online returns ERROR_IO_PENDING.
If the operation was unsuccessful, Online returns a Win32 error value.

For optimal operation of the resource, make sure that your implementation of the Online routine finishes processing within 300 milliseconds. If that is not possible, the entry point can immediately return ERROR_IO_PENDING to the Resource Monitor after spawning a separate worker thread to bring the resource online. Once this worker thread begins the process of bringing the resource online, the SetResourceStatus callback function (whose address was stored in the Startup entry point function) should be called to periodically indicate the resource status. As soon as the resource is online, the worker thread can be terminated or suspended for future use.

If for any reason the resource fails to come online, your resource DLL should use the LogEvent callback function to log the event, and should call the SetResourceStatus function. SetResourceStatus uses a RESOURCE_STATUS structure to indicate the online or failure state of the resource. A cluster resource can be in one of the following states:

Online—the status code is ClusterResourceOnline
Offline—the status code is ClusterResourceOffline
Failed—the status code is ClusterResourceFailed
Online Pending—the status code is ClusterResourceOnlinePending
Offline Pending—the status code is ClusterResourceOfflinePending

For more information on SetResourceStatus, please refer to the Platform SDK.
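
The pending-online pattern described above can be sketched in portable C. This is only an illustration under stated assumptions: the Sketch* names, the estimated_ms field, and the single-threaded "worker step" are hypothetical stand-ins, and the locally defined DWORD and error constants take the place of the definitions that a real resource DLL gets from the Platform SDK headers. No real thread is created and no real callback is invoked.

```c
#include <assert.h>

/* Stand-ins so the sketch compiles without the Platform SDK. A real
   resource DLL takes these from <windows.h> and <resapi.h>. */
typedef unsigned long DWORD;
#define ERROR_SUCCESS    0UL
#define ERROR_IO_PENDING 997UL

typedef enum {
    SketchOffline,
    SketchOnlinePending,
    SketchOnline,
    SketchFailed
} SketchState;

/* Hypothetical per-resource record: a state plus a progress checkpoint,
   loosely modeled on what a status report carries. */
typedef struct {
    SketchState state;
    DWORD       checkpoint;    /* bumped on every progress report */
    DWORD       estimated_ms;  /* assumed cost of bringing the resource up */
} SketchOnlineRes;

#define SKETCH_ONLINE_BUDGET_MS 300UL

/* Entry-point side: finish inline when cheap, otherwise go pending. */
DWORD SketchOnlineEntry(SketchOnlineRes *res)
{
    assert(res != 0);
    if (res->estimated_ms <= SKETCH_ONLINE_BUDGET_MS) {
        res->state = SketchOnline;
        return ERROR_SUCCESS;
    }
    res->state = SketchOnlinePending;
    res->checkpoint = 0;
    /* A real DLL would start its worker thread here before returning. */
    return ERROR_IO_PENDING;
}

/* Worker side: one unit of work followed by one status report.
   outcome: 0 = still working, 1 = came online, -1 = failed. */
void SketchWorkerReport(SketchOnlineRes *res, int outcome)
{
    assert(res != 0);
    if (res->state != SketchOnlinePending) {
        return;  /* nothing pending, for example after Terminate */
    }
    res->checkpoint++;  /* signals that work is still progressing */
    if (outcome > 0) {
        res->state = SketchOnline;
    } else if (outcome < 0) {
        res->state = SketchFailed;
    }
    /* A real worker would now fill a RESOURCE_STATUS structure and pass
       it to the SetResourceStatus callback saved in Startup. */
}
```

In this sketch a resource with an estimated cost of 50 ms goes online inside the entry point, while one with an estimated cost of 5000 ms returns ERROR_IO_PENDING and reaches the online state only after the worker reports success.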

If, after three minutes, the resource does not come online, the Resource Monitor will call the Terminate entry point to abort the operation. If a resource takes more than three minutes to come online, the Cluster Administrator, Cluster.exe, or some other management tool that uses the Cluster API can use the ClusterResourceControl function to modify the PendingTimeout common property by specifying the common properties control code CLUSCTL_RESOURCE_SET_COMMON_PROPERTIES. (For more information on cluster control codes, common properties, private properties, and so forth, refer to the SDK documentation.)

The LooksAlive and IsAlive Routines

Once the resource is online, the Resource Monitor will poll the resource periodically to determine its status. The Resource Monitor uses the LooksAlive and IsAlive entry point functions to accomplish this. LooksAlive is a cursory check performed by the Resource Monitor, and IsAlive is a more thorough check.

The LooksAlive routine is defined as follows:

BOOL WINAPI LooksAlive(
RESID ResourceId 
);

The IsAlive routine is defined as follows:

BOOL WINAPI IsAlive(
RESID ResourceId 
);

In both functions, the ResourceId parameter uniquely identifies the resource instance that is being polled. Typically, LooksAlive is used to perform simple checks (such as determining if the process is still running, a file share is still present, and so forth), and the Resource Monitor makes frequent calls to LooksAlive. If you do not want your resource DLL to be interrupted by these calls, return a valid EventHandle in the Online routine (as noted previously in this paper), and then use this handle to notify the Resource Monitor of the resource status.

The IsAlive entry point function is used for more thorough resource status evaluation, and is called regularly by the Resource Monitor (this polling cannot be turned off). The resource DLL should do a complete check of the resource and see if it is functioning properly. For example, a database resource should check to see that the database can write to the disk, perform queries and updates to the disk, and so forth.

LooksAlive returns the following values:

If the resource is probably online and available for use, LooksAlive returns TRUE.
If the resource is not functioning properly, LooksAlive returns FALSE.

IsAlive returns the following values:

If the resource is online and functioning properly, IsAlive returns TRUE.
If the resource is not functioning properly, IsAlive returns FALSE.

For optimal operation of the resource, your implementation of the IsAlive entry point should take no more than 300 milliseconds to complete. As described previously, if the entry point function will exceed the 300 millisecond limit, you should create a separate worker thread to complete the check of the resource. The worker thread can then post the status so that IsAlive can retrieve it and return it to the Resource Monitor.

Note that LooksAlive should never exceed 300 milliseconds under any circumstances. In most cases, it should take no more than 150 milliseconds to complete. IsAlive can take longer, but since it is a synchronous call, entry points for other resources that the Resource Monitor manages in the same thread cannot be called until IsAlive returns. Spawning a separate thread from within IsAlive and then waiting for it will not help in this situation. Therefore, IsAlive should also take no more than 300 milliseconds to complete.
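
One way to keep IsAlive short, as suggested above, is to let a background checker post its verdict and have IsAlive return only the posted value. The following is a minimal sketch of that idea in portable C. The SketchHealth record, the function names, and the staleness rule (treating an old verdict as a failure) are assumptions made for illustration and do not come from the Platform SDK. Thread synchronization is deliberately left out.

```c
#include <assert.h>

typedef int BOOL;
#define TRUE  1
#define FALSE 0

/* Hypothetical record shared by the checker thread and IsAlive. A real
   DLL must guard it with a critical section or interlocked operations. */
typedef struct {
    BOOL          healthy;       /* verdict of the last thorough check */
    unsigned long posted_at_ms;  /* when that verdict was posted */
} SketchHealth;

/* Called by the background checker after each full check. */
void SketchPostHealth(SketchHealth *h, BOOL healthy, unsigned long now_ms)
{
    assert(h != 0);
    h->healthy = healthy;
    h->posted_at_ms = now_ms;
}

/* What IsAlive would return: the cached verdict, unless it is older than
   max_age_ms, in which case the checker is assumed to be stuck. */
BOOL SketchIsAlive(const SketchHealth *h, unsigned long now_ms,
                   unsigned long max_age_ms)
{
    assert(h != 0);
    if (now_ms - h->posted_at_ms > max_age_ms) {
        return FALSE;
    }
    return h->healthy;
}
```

Because SketchIsAlive only reads two fields, its cost does not depend on how long the thorough check takes.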

The Offline Routine

The entry point functions discussed thus far provide the core custom functionality of a good working resource DLL. The next entry points provide more of a clean-up and unloading mechanism for resources. The first one of these entry points is the Offline function.

The Offline routine is defined as:

DWORD WINAPI Offline(
RESID ResourceId 
);

The ResourceId parameter uniquely identifies the resource. The Resource Monitor calls this entry point when the resource is to be taken offline. Once offline, the resource is made unavailable to the cluster's clients.

Offline returns the following values:

If the request completed successfully and the resource is offline, Offline returns ERROR_SUCCESS.
If the request is still pending and a thread has been activated to process the offline request, Offline returns ERROR_IO_PENDING.
If the operation was unsuccessful for other reasons, Offline should return a Win32 error value.

The resource DLL should take no more than 300 milliseconds to gracefully shut down the resource and return from the entry point. If your implementation of the Offline routine exceeds this limit, it should return ERROR_IO_PENDING and spawn a separate thread to complete the Offline request. This worker thread should use the SetResourceStatus callback routine to periodically update the Resource Monitor with the resource status until the resource reaches a status of ClusterResourceOffline.

If the resource does not shut down gracefully within the PendingTimeout time or if the Offline function returns a Win32 error value, then the Resource Monitor will call the Terminate entry point function to forcefully terminate the resource.
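
The escalation rule described above can be written as a small decision function. This sketch models the Resource Monitor's side of the exchange purely for illustration. SketchAfterOffline and its parameters are hypothetical, and the real Resource Monitor logic is not documented in this paper.

```c
#include <assert.h>

/* Stand-ins for the Win32 definitions a real DLL gets from <windows.h>. */
typedef unsigned long DWORD;
#define ERROR_SUCCESS    0UL
#define ERROR_IO_PENDING 997UL

typedef enum {
    SketchDone,          /* resource is offline, nothing more to do */
    SketchKeepWaiting,   /* still pending and within the timeout */
    SketchCallTerminate  /* force the resource offline */
} SketchAction;

/* offline_result:     value returned by the Offline entry point
   reported_offline:   nonzero once the worker reported the offline state
   elapsed_ms:         time since Offline was called
   pending_timeout_ms: the resource's PendingTimeout value */
SketchAction SketchAfterOffline(DWORD offline_result, int reported_offline,
                                DWORD elapsed_ms, DWORD pending_timeout_ms)
{
    assert(pending_timeout_ms > 0);
    if (offline_result == ERROR_SUCCESS) {
        return SketchDone;
    }
    if (offline_result != ERROR_IO_PENDING) {
        return SketchCallTerminate;  /* any other Win32 error value */
    }
    if (reported_offline) {
        return SketchDone;
    }
    if (elapsed_ms >= pending_timeout_ms) {
        return SketchCallTerminate;  /* graceful shutdown took too long */
    }
    return SketchKeepWaiting;
}
```

With a three-minute timeout (180000 milliseconds), a pending request that has run for one second keeps waiting, while one that has reached the full three minutes without reporting the offline state leads to Terminate.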

The Close Routine

The Close entry point function closes a resource, and will be called only once for a particular resource. You should use Close to deallocate any of the structures that were allocated by the Open, Offline, ResourceControl, or ResourceTypeControl entry point functions. If the resource to be closed is not yet offline, call Terminate to forcibly take it offline.

The Close routine is defined as:

VOID WINAPI Close(
RESID ResourceId
);

The ResourceId parameter uniquely identifies the resource to be closed.

Close returns no values.

Your resource DLL should take no more than 300 milliseconds to close the resource. However, if this time limit is exceeded, the Cluster Service handles the situation properly.
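
The pairing of Open and Close can be illustrated with a short portable sketch. All names here (SketchOpenRes, SketchOpen, SketchClose, SketchTerminate, SketchLiveCount) are hypothetical, and the pointer returned by SketchOpen merely plays the role that a RESID plays in a real resource DLL. The live-instance counter exists only to make the cleanup observable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-resource structure allocated when the resource is opened. */
typedef struct {
    char *name;    /* private copy of the resource name */
    int   online;  /* nonzero while the resource is online */
} SketchOpenRes;

static int g_sketch_live = 0;  /* number of structures not yet closed */

int SketchLiveCount(void) { return g_sketch_live; }

/* Allocates the structure; returns NULL on failure. */
SketchOpenRes *SketchOpen(const char *name)
{
    SketchOpenRes *res;
    assert(name != NULL);
    res = malloc(sizeof *res);
    if (res == NULL) {
        return NULL;
    }
    res->name = malloc(strlen(name) + 1);
    if (res->name == NULL) {
        free(res);
        return NULL;
    }
    strcpy(res->name, name);
    res->online = 0;
    g_sketch_live++;
    return res;
}

/* Forces the resource offline without any graceful shutdown. */
void SketchTerminate(SketchOpenRes *res)
{
    assert(res != NULL);
    res->online = 0;
}

/* Releases everything SketchOpen allocated; forces the resource offline
   first if it is still online. */
void SketchClose(SketchOpenRes *res)
{
    if (res == NULL) {
        return;
    }
    if (res->online) {
        SketchTerminate(res);
    }
    free(res->name);
    free(res);
    g_sketch_live--;
}
```

Closing a resource that is still marked online first routes through SketchTerminate, mirroring the guidance above to call Terminate when the resource is not yet offline.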

The Terminate Routine

The Terminate entry point function is used to immediately end a process that will not shut down gracefully when the Offline function is called.

The Terminate routine is defined as:

VOID WINAPI Terminate(
RESID ResourceId
);

The ResourceId uniquely identifies the resource to be forced offline. If the resource DLL has worker threads waiting to bring the resource online or to gracefully take the resource offline, that operation is aborted and the resource is taken offline.

Terminate returns no values.

The ResourceControl and ResourceTypeControl Routines

The ResourceControl and ResourceTypeControl entry point functions are optional. However, Microsoft recommends that resource DLLs implement these entry points to support the cluster resource control codes. Management tools, such as Cluster Administrator and Cluster.exe, and cluster-aware applications use the ClusterResourceControl and ClusterResourceTypeControl functions to communicate arbitrary information to the resource. For example, these functions can be used to set properties (common and private), request operations, and so forth. When a management or cluster-aware application calls either of these functions, the Resource Monitor invokes the ResourceControl or ResourceTypeControl function, respectively, and passes it the appropriate control code. Resource DLLs that implement these entry points should perform the control request specified by the control code or retrieve or set the property of the resource. For control codes that it does not handle, the resource DLL should return ERROR_INVALID_FUNCTION to the Resource Monitor, which in turn will perform the default action, if there is one.

The ResourceControl routine is defined as:

DWORD WINAPI ResourceControl(
RESID ResourceId, 
DWORD ControlCode, 
LPVOID InBuffer, 
DWORD InBufferSize, 
LPVOID OutBuffer, 
DWORD OutBufferSize, 
LPDWORD BytesReturned
);

The ResourceId parameter identifies the affected resource. ControlCode represents the control code for the operation to be performed. For a list of the valid values for the ControlCode parameter, see "Control Codes for Resources" in the Platform SDK.

InBuffer is a pointer to a buffer containing data to be used in the operation, and InBufferSize is the size, in bytes, of the buffer pointed to by InBuffer. OutBuffer is a pointer to a buffer containing data resulting from the operation, and OutBufferSize is the size, in bytes, of the available space pointed to by OutBuffer. Note that InBuffer and OutBuffer can be NULL if the operation doesn't require or return data.

BytesReturned is the number of bytes in the buffer pointed to by OutBuffer that actually contain data.

ResourceControl returns the following values:

If the operation associated with ControlCode was completed successfully, ResourceControl typically returns ERROR_SUCCESS (although the actual returned value depends on the control code).
If the resource DLL does not support the operation represented by ControlCode, or if the Resource Monitor must process the request, ResourceControl returns ERROR_INVALID_FUNCTION.
If the operation was unsuccessful, ResourceControl returns a Win32 error value.
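
The dispatch behavior described above can be sketched as a switch on the control code. This is an illustration only. SKETCH_CTL_GET_LABEL is a hypothetical control code rather than a real CLUSCTL value, the ResourceId parameter is omitted, and the DWORD type and error constants are local stand-ins for the Win32 definitions. Returning ERROR_MORE_DATA with the required size when the output buffer is too small is an assumption borrowed from common Win32 practice.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for definitions a real DLL gets from <windows.h>. */
typedef unsigned long DWORD;
#define ERROR_SUCCESS          0UL
#define ERROR_INVALID_FUNCTION 1UL
#define ERROR_MORE_DATA        234UL

/* Hypothetical control code used only by this sketch. */
#define SKETCH_CTL_GET_LABEL 1UL

static const char kSketchLabel[] = "demo";

DWORD SketchResourceControl(DWORD ControlCode,
                            const void *InBuffer, DWORD InBufferSize,
                            void *OutBuffer, DWORD OutBufferSize,
                            DWORD *BytesReturned)
{
    assert(BytesReturned != NULL);
    (void)InBuffer;      /* this sketch has no codes that take input */
    (void)InBufferSize;
    *BytesReturned = 0;

    switch (ControlCode) {
    case SKETCH_CTL_GET_LABEL:
        /* Always report the size needed, including the terminator. */
        *BytesReturned = (DWORD)sizeof kSketchLabel;
        if (OutBuffer == NULL || OutBufferSize < sizeof kSketchLabel) {
            return ERROR_MORE_DATA;
        }
        memcpy(OutBuffer, kSketchLabel, sizeof kSketchLabel);
        return ERROR_SUCCESS;

    default:
        /* Unhandled codes are handed back for default processing. */
        return ERROR_INVALID_FUNCTION;
    }
}
```

The default branch is the important part: every control code that the DLL does not recognize falls through to ERROR_INVALID_FUNCTION so that the Resource Monitor can apply its default handling.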

The ResourceTypeControl routine is defined as:

DWORD WINAPI ResourceTypeControl(
LPCWSTR ResourceTypeName, 
DWORD ControlCode, 
LPVOID InBuffer, 
DWORD InBufferSize, 
LPVOID OutBuffer, 
DWORD OutBufferSize, 
LPDWORD BytesReturned
);

The ResourceTypeName parameter identifies the type of resource to be affected by the operation. ControlCode represents the control code for the operation to be performed. For a list of the valid values for the ControlCode parameter, see "Control Codes for Resources" in the Platform SDK.

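The InBuffer, InBufferSize, OutBuffer, and OutBufferSize parameters have the same meanings as in ResourceControl. BytesReturned is the number of bytes in the buffer pointed to by OutBuffer that actually contain data.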

ResourceTypeControl returns the following values:

If the operation indicated by ControlCode was completed successfully, ResourceTypeControl typically returns ERROR_SUCCESS (although the actual returned value depends on the control code).
If the resource DLL does not support the operation represented by ControlCode, or if the Resource Monitor must process the request, ResourceTypeControl returns ERROR_INVALID_FUNCTION.
If the operation was unsuccessful, ResourceTypeControl should return a Win32 error value.

Debugging a Resource DLL

Because the cluster service runs as a Windows NT service, and the Resource Monitor runs as a separate process, you may find that debugging resource DLLs can be a little more complicated than debugging regular Windows NT-based DLLs or applications.

You can use any standard Microsoft Windows® operating system-based debugger, along with the DebugEx Cluster Administrator extension DLL, to debug a resource DLL. The DebugEx extension DLL allows the resource DLL to be debugged the next time the Resource Monitor is started. (The DebugEx extension is included with the product.)

To debug a resource DLL with WinDbg, Microsoft Developer Studio, or any other Windows-based debugger, the Cluster Service must be running in the same security context as the logged-on user. (Note that being logged on with the same account as the Cluster Service account does not mean that you are running in the same security context as the Cluster Service.) To ensure that you are running in the same security context as the Cluster Service, you must stop the Cluster Service, and then start the Cluster Service locally from the command line using the -debug switch. Be aware that if there are two nodes in the cluster and the other node is online, the debugger may be started on the other node, if that's where the resource or resource type being debugged is loaded.

Before invoking the debugger, be sure to copy the symbol files from the \Symbols directory on the SDK CD-ROM to your %windir%\Symbols directory. Also, be sure to copy any .pdb files for your resource DLL to the same location as your resource DLL.

When you debug a resource DLL, you can debug either the resource type or the resources that belong to the resource type, or both, as explained next.

To debug the resource type:

Set the resource type's DebugControlFunctions property to TRUE.
Set the resource type's DebugPrefix property to the path to your debugger.

When your ResourceTypeControl entry point function is called, the Cluster Service checks the settings for these properties. If DebugControlFunctions is set to TRUE and DebugPrefix contains a valid path, the Cluster Service creates a new Resource Monitor process for ResourceTypeControl and attaches the specified debugger to it.

To debug the resource DLL:

Register the DebugEx Cluster Administrator DLL.
Start Cluster Administrator.
Create a new resource of the type supported by your resource DLL. Specify that the resource should run in a separate Resource Monitor. (A dedicated Resource Monitor is required during debugging to help you isolate problems and to ensure that other resources remain unaffected by the debugging process.)
Select the new resource, and then choose Properties from the File menu.
Click the Debug tab.
In the Debug Command Prefix edit control, type the full path to your debugger. For example, if you are using MSDEV, type

c:\msdev\bin\msdev.exe

Press the OK button.
Stop the Cluster Service; to do this type

net stop clussvc

Start the Cluster Service locally by typing the following command from the cluster directory:

start clussvc -debug

This command causes a command box to be displayed with the output from the Cluster Service. As soon as the Cluster Service starts, it will start the Resource Monitor with your resource in it and attach your debugger to it. At that point, your debugger will be invoked. You can now set breakpoints in the DLL. For example, to debug the ResourceTypeControl entry point function, attach a debugger to the main Resource Monitor process, and set a breakpoint at s_RmResourceTypeControl. You can set additional breakpoints in the DLL once the call to LoadLibrary is stepped over.

With the Platform SDK, you can debug the Startup and Open entry point functions by using the DebugEx extension DLL and specifying the DebugPrefix value for the resource type of interest. While the DebugEx extension is not absolutely required, it allows you to use Cluster Administrator to turn debugging on and off on a resource-by-resource basis or for all resources of a particular type. Once the extension has been registered, the Debug tab will appear on all resource types as well. If you want to debug startup code in a resource DLL while the resource is being created, set the DebugPrefix property on the resource type beforehand. You can do this with Cluster.exe or by bringing up properties on the resource type, clicking the Debug tab, and specifying the debugger. Then, when a resource is created, check the Use Separate Resource Monitor check box, and click the Next button to start the debugger immediately.

When a resource of the specified type is created, the Resource Monitor will wait until the debugger has attached and will also call DebugBreak after calling LoadLibrary. Some debuggers (such as CDB and WinDbg) will also break when they attach, while others (such as MSDEV) will not. In any case, the Resource Monitor will display a message in the debugger's output window when the debugger attaches and after loading the resource DLL.

Installation and Setup Issues

A cluster-aware application installation typically consists of three logical setup programs:

Installing the actual program
Configuring the cluster portion of the program
Installing and registering the client-side administrator extensions

These operations should be integrated into a single setup program, which should also provide the option of performing the above-mentioned operations individually.

Much of the configuration portion of the setup is trivial and depends entirely on the user preferences, which are entered during setup of the administrator extensions. Microsoft recommends that each cluster-aware application that implements a resource DLL should also implement a Cluster Administrator extension DLL.

The following sections discuss each portion of the installation in greater detail.

Installing the Program

This portion of the setup program installs a copy of the application on each node. The only change required to the setup process is that it must check to see if the application is being installed on a cluster node. Your setup application can call the OpenCluster cluster function to determine if setup is running in a clustered environment. Note that OpenCluster only tells the setup program whether the Cluster Service is currently running. Therefore, if OpenCluster returns NULL, your setup program should also check the database of installed services to determine if the Cluster Service is installed, but turned off. If the Cluster Service is installed but not running, the setup program should give the user the option of starting the Cluster Service before proceeding with the installation. Once the application is installed on one node, the setup program could enumerate the other nodes and then use the administration shares and remote registry manipulation to install the application over the network.
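
The two-step check described above (try OpenCluster first, then look for an installed but stopped Cluster Service) reduces to a small decision. The sketch below captures only that decision in portable C. The function and enumeration names are hypothetical, and the two inputs stand for results that a real setup program would obtain from OpenCluster and from the database of installed services.

```c
#include <assert.h>

typedef enum {
    SketchInstallStandalone,    /* Cluster Service is not installed */
    SketchOfferToStartService,  /* installed, but currently stopped */
    SketchInstallClustered      /* Cluster Service is running */
} SketchSetupPath;

/* open_cluster_succeeded: nonzero if OpenCluster returned a handle
   service_installed:      nonzero if the Cluster Service appears in the
                           database of installed services */
SketchSetupPath SketchChooseSetupPath(int open_cluster_succeeded,
                                      int service_installed)
{
    if (open_cluster_succeeded) {
        return SketchInstallClustered;
    }
    if (service_installed) {
        /* OpenCluster failed only because the service is not running. */
        return SketchOfferToStartService;
    }
    return SketchInstallStandalone;
}
```

The middle case is the one that a setup program checking only OpenCluster would miss.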

In some cases, an application may have been previously installed on a nonclustered machine. In this case, the setup program should provide additional setup options if the system is transformed into a cluster node. Your setup program should allow the application's data files (not program files—see below) to be migrated to a disk on the shared SCSI bus, register the new resource type with the cluster, and so forth. However, you can choose to simply require the application to be completely reinstalled.

Note that you should not store program files on drives connected to the cluster's shared SCSI bus. Although these drives may seem to be the ideal location for program files because all nodes in the cluster can see these disks, storing files there will cause problems when you upgrade either Microsoft Cluster Server or the application software, especially if the cluster is in a production environment. Such upgrades would require the cluster to be shut down completely. If you choose instead to install program files on all nodes in the cluster, you can use a "rolling upgrade" approach—upgrading each node individually—without affecting the operation of the cluster.

Configuring the Cluster

During this phase in the installation process, the configuration portion of your setup program should create the resource type by calling the cluster management function CreateClusterResourceType, and install the corresponding resource DLL(s) that support the resource type on each node. To install a DLL, the setup program should copy it to an application installation directory on the destination drive, and register it with its full path.

Note: The DLL should be copied to all nodes in the cluster before the resource type is registered. If this is not done, any instance of Cluster Administrator attached to a cluster node that doesn't have access to the resource DLL will display an error when it is notified that the resource type was added and it attempts to read the properties of the resource type.

Lastly, your program should register the Cluster Administrator extension DLL with the cluster. To register a Cluster Administrator extension with a cluster, implement the DllRegisterCluAdminExtension entry point function (in the Cluster Administrator extension). To register a Cluster Administrator extension in the system registry, implement the DllRegisterServer entry point function.

The network administrator may combine your custom resource type with a Network Name resource and its dependent resources to create a virtual server that clients can access. Your setup program should call the appropriate Cluster API functions to make sure that the network administrator places all of the necessary dependencies in the same group. You should also configure the resource to use those dependencies. For example, if your newly created cluster-aware application relies on data files, determine which drive will be used.

Installing and Registering the Client-side Administrator Extensions

The client-side Cluster Administrator extension is the portion of the administration extension that the Cluster Administrator tool uses. The setup program should copy it onto the machine that will run the Cluster Administrator tools and register it there as an in-process (InProc) server.

Caveats for Writing Good Resource DLLs

Writing good resource DLLs can be very simple if you pay careful attention to a few rules and guidelines, many of which have been discussed in this paper. This section provides a summary of some of the key things to remember when you write your DLLs.

Use Visual C++ version 4.2b or greater (make sure to install the latest updates from www.microsoft.com) and the Platform SDK to write resource DLLs and Cluster Administrator extension DLLs. This development environment is tightly integrated to make writing them very easy. Similarly, use the Visual C++ debugger or WinDbg to debug these DLLs.
Initialize global data used by the resource DLL in Startup. Initialize resource-specific data in the Open entry point.
Within a resource DLL, make use of separate threads to perform lengthy operations, such as opening resources, bringing resources online, taking them offline, and so on. Creating separate threads makes the process of communicating with the Resource Monitor (and in turn, the Cluster Service) more efficient. If, for example, a resource will take more than 300 milliseconds to come online, spawn a separate thread to complete the process, allowing the entry point to return immediately.
If you don't want your resource DLLs to be interrupted by the Cluster Service polling for resource status, return an event handle to inform the service not to call the LooksAlive entry point. Then, use the event handle to signal status information to the Cluster Service.
Make use of callback functions whenever possible. Callback functions allow for asynchronous communication between the resource DLL and the Resource Monitor. Typically, resource DLLs use two callback functions: LogEvent, to record resource events with the Cluster Service and for debugging purposes, and SetResourceStatus, to communicate resource status information to the Resource Monitor.
If your resources need to be notified of cluster-specific events, node-specific events, group state changes, and cluster database changes or updates, create a cluster notification port (use CreateClusterNotifyPort) to handle event notifications. Note that there is a potential for a race condition if multiple threads use notification ports. For example, one thread could be calling CloseClusterNotifyPort to close the notification port while another thread, which has called GetClusterNotify, is waiting to retrieve information from the same port. You can prevent this problem by having the thread that is calling GetClusterNotify check for a CLUSTER_CHANGE_HANDLE_CLOSE event. For more information on event notifications and the events that cause them, please refer to the Platform SDK.
Support the CLUSCTL_RESOURCE_GET_PRIVATE_PROPERTIES control function so that management tools such as Cluster.exe can set properties that haven't been set yet.
Support the CLUSCTL_RESOURCE_VALIDATE_PRIVATE_PROPERTIES and CLUSCTL_RESOURCE_SET_PRIVATE_PROPERTIES control functions so that your resource DLL can validate properties before they are saved.
Support the CLUSCTL_RESOURCE_TYPE_GET_PRIVATE_PROPERTIES, CLUSCTL_RESOURCE_TYPE_VALIDATE_PRIVATE_PROPERTIES, and CLUSCTL_RESOURCE_TYPE_SET_PRIVATE_PROPERTIES control functions if your resource type has private properties.
Support the CLUSCTL_RESOURCE_GET_REQUIRED_DEPENDENCIES and CLUSCTL_RESOURCE_TYPE_GET_REQUIRED_DEPENDENCIES control functions if your resources require a dependency on another resource.
Properties should be read in the Open function if possible. If this isn't possible, they should be read in the Online function.
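
The notification-port guideline in the list above (stop waiting once a CLUSTER_CHANGE_HANDLE_CLOSE event arrives) can be sketched as a loop over queued events. This is a portable simulation rather than real Cluster API code. The array stands in for successive GetClusterNotify results, and the SKETCH_CHANGE_* constants are hypothetical stand-ins whose values are not taken from the SDK headers.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long DWORD;

/* Hypothetical stand-ins for notification filter bits. A real program
   would use the CLUSTER_CHANGE_* definitions from the SDK headers. */
#define SKETCH_CHANGE_RESOURCE_STATE 0x00000100UL
#define SKETCH_CHANGE_HANDLE_CLOSE   0x80000000UL

/* Processes queued notifications in order and stops at the close marker,
   so the waiting thread never touches the port after it has been closed.
   Returns the number of ordinary events handled; *port_closed is set to
   1 if the close marker was seen. */
size_t SketchDrainNotifications(const DWORD *events, size_t count,
                                int *port_closed)
{
    size_t handled = 0;
    size_t i;

    assert(port_closed != NULL);
    *port_closed = 0;

    for (i = 0; i < count; i++) {
        if (events[i] & SKETCH_CHANGE_HANDLE_CLOSE) {
            *port_closed = 1;  /* another thread closed the port */
            break;             /* leave the loop without further waits */
        }
        handled++;             /* a real handler would act on the event */
    }
    return handled;
}
```

Events queued after the close marker are intentionally ignored, because by then the port can no longer be used safely.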

Conclusion

Writing a resource DLL is just one part of making an application or service cluster-aware. A cluster-aware application, such as a database, may implement a custom, application-specific resource DLL to manage the database resource. This same application may use other resource DLLs (for disk resources, IP addresses, and so on) to enable high availability and the other features provided by the Cluster Service. The application may make use of the Cluster API and other cluster-related functions (such as the cluster control codes) to be more cluster aware. Finally, a cluster-aware application should provide administrators with a Cluster Administrator extension DLL so that the application can be easily set up and configured in the cluster environment.

With proper tools, such as Microsoft Visual C++ and the Platform SDK, writing resource DLLs is simple and straightforward. Adherence to the guidelines and suggestions outlined in this document should allow the application, cluster, and resource to interact more efficiently, and should ensure that the cluster provides high availability across all applications and system resources.

For More Information

For the latest information on Windows NT Server, check out our World Wide Web site at https://www.microsoft.com/ntserver or the Windows NT Server Forum on the Microsoft Network (GO WORD: MSNTS).

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This White Paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT.

Other product or company names mentioned herein may be the trademarks of their respective owners.