Appendix (Guide to Creating and Configuring a Server Cluster under Windows Server 2003 White Paper)

Applies To: Windows Server 2003 with SP1

Advanced Testing

Now that you have configured your cluster and verified basic functionality and failover, you may want to conduct a series of failure scenario tests that demonstrate expected results and confirm that the cluster responds correctly when a failure occurs. This level of testing is not required for every implementation, but it can be insightful if you are new to clustering technology and are unfamiliar with how the cluster will respond, or if you are implementing a new hardware platform in your environment. The expected results listed are for a clean configuration of the cluster with default settings; they do not take into consideration any user customization of the failover logic. This is not a complete list of all tests, nor does successfully completing these tests mean that the cluster is certified or ready for production. It is simply a sample of some tests that can be conducted. For additional information, see the following article in the Microsoft Knowledge Base:

197047 Failover/Failback Policies on Microsoft Cluster Server

Test: Start Cluster Administrator, right-click a resource, and then click “Initiate Failure”. The resource should go into a failed state, and then it should be restarted and brought back into an online state on that node.

Expected Result: Resources should come back online on the same node

Test: Conduct the above “Initiate Failure” test three more times on that same resource. On the fourth failure, the resources should all failover to another node in the cluster.

Expected Result: Resources should failover to another node in the cluster
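
The preceding two tests can also be performed from a command prompt by using the Cluster.exe command-line tool. The following commands are only a sketch; the cluster name MYCLUSTER and the resource name "Disk Q:" are placeholders that you must replace with names from your own configuration.

    REM Initiate a failure of a single resource (repeat to exceed the restart threshold)
    cluster MYCLUSTER resource "Disk Q:" /fail

    REM Verify the state and current owner of the resource
    cluster MYCLUSTER resource "Disk Q:" /status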

Test: Move all resources to one node. Start Computer Management, and then click Services under Services and Applications. Stop the Cluster service. Start Cluster Administrator on another node and verify that all resources failover and come online on another node correctly.

Expected Result: Resources should failover to another node in the cluster
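
A command-line equivalent of this test might look like the following sketch, assuming a group named "Cluster Group" and nodes named NODE1 and NODE2 (placeholder names that you must replace with your own):

    REM Move each group to NODE1 (repeat this command for every group in the cluster)
    cluster group "Cluster Group" /moveto:NODE1

    REM On NODE1, stop the Cluster service to force the resources to fail over
    net stop clussvc

    REM On NODE2, confirm that all groups are online on the surviving node
    cluster group /status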

Test: Move all resources to one node. On that node, click Start, and then click Shutdown. This will turn off that node. Start Cluster Administrator on another node, and then verify that all resources failover and come online on another node correctly.

Expected Result: Resources should failover to another node in the cluster
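
If you prefer to script this test, the Shutdown.exe tool that is included with Windows Server 2003 can shut down the node from a command prompt; the timeout value shown here is only an example.

    REM Shut down the local node after a 30-second warning
    shutdown -s -t 30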

Test: Move all resources to one node, and then press the power button on the front of that server to turn it off. If you have an ACPI-compliant server, it will perform an “Emergency Shutdown” and power off. Start Cluster Administrator on another node and verify that all resources failover and come online on another node correctly. For additional information about an Emergency Shutdown, see the following articles in the Microsoft Knowledge Base:

325343 HOW TO: Perform an Emergency Shutdown in Windows Server 2003

297150 Power Button on ACPI Computer May Force an Emergency Shutdown

Expected Result: Resources should failover to another node in the cluster

Important

Performing the Emergency Shutdown test may cause data corruption and data loss. Do not conduct this test on a production server.

Test: Move all resources to one node, and then pull the power cables from that server to simulate a hard failure. Start Cluster Administrator on another node, and then verify that all resources failover and come online on another node correctly.

Expected Result: Resources should failover to another node in the cluster

Important

Performing the hard failure test may cause data corruption and data loss. This is an extreme test. Make sure you have a backup of all critical data, and then conduct the test at your own risk. Do not conduct this test on a production server.

Test: Move all resources to one node, and then remove the public network cable from that node. The IP Address resources should fail, and the groups will all failover to another node in the cluster. For additional information, see the following article in the Microsoft Knowledge Base:

286342 Network Failure Detection and Recovery in Windows Server 2003 Clusters

Expected Result: Resources should failover to another node in the cluster

Test: Remove the network cable for the private heartbeat network. The heartbeat traffic will failover to the public network, and no failover of resources should occur. If failover occurs, see the “Configuring the Private Network Adapter” section earlier in this document.

Expected Result: There should be no failures or resource failovers
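
If you need to confirm how the cluster networks are configured before or after this test, Cluster.exe can display the state and role of each network. The network names "Private" and "Public" in this sketch are assumptions; substitute the names used in your cluster.

    REM Display the state of all cluster networks
    cluster network /status

    REM Display the properties of a specific network, including its Role
    REM (1 = internal cluster communications only, 2 = client access only, 3 = all communications)
    cluster network "Private" /prop
    cluster network "Public" /prop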

SCSI Drive Installations

This appendix is provided as a generic set of instructions for SCSI drive installations. If the SCSI hard disk vendor’s instructions conflict with the instructions here, always follow the instructions supplied by the vendor.

The SCSI bus listed in the hardware requirements must be configured prior to cluster service installation. Configuration applies to:

  • The SCSI devices.

  • The SCSI controllers and the hard disks so that they work properly on a shared SCSI bus.

  • The bus termination. The shared SCSI bus must have a terminator at each end of the bus. It is possible to have multiple shared SCSI buses between the nodes of a cluster.

In addition to the information on the following pages, refer to documentation from the manufacturer of your SCSI device or to the SCSI specifications, which can be ordered from the American National Standards Institute (ANSI). The ANSI Web site includes a catalog that can be searched for the SCSI specifications.

Configuring the SCSI Devices

Each device on the shared SCSI bus must have a unique SCSI identification number. Because most SCSI controllers default to SCSI ID 7, configuring the shared SCSI bus includes changing the SCSI ID number on one controller to a different number, such as SCSI ID 6. If there is more than one disk that will be on the shared SCSI bus, each disk must have a unique SCSI ID number.

Terminating the Shared SCSI Bus

There are several methods for terminating the shared SCSI bus. They include:

  • SCSI controllers

    SCSI controllers have internal soft termination that can be used to terminate the bus; however, this method is not recommended for server clusters. If a node is turned off with this configuration, the SCSI bus will be terminated improperly and will not operate correctly.

  • Storage enclosures

    Storage enclosures also have internal termination, which can be used to terminate the SCSI bus if the enclosure is at the end of the SCSI bus. If the enclosure is not at the end of the bus, its internal termination should be turned off.

  • Y cables

    Y cables can be connected to devices if the device is at the end of the SCSI bus. An external active terminator can then be attached to one branch of the Y cable in order to terminate the SCSI bus. This method of termination requires either disabling or removing any internal terminators that the device may have.

    Figure 27 outlines how a SCSI cluster should be physically connected.

    Figure 27. A diagram of a SCSI cluster hardware configuration.

Note

Any devices that are not at the end of the shared bus must have their internal termination disabled. Y cables and active terminator connectors are the recommended termination methods because they will provide termination even when a node is not online.

Storage Area Network Considerations

There are two supported methods of Fibre Channel-based storage in a Windows Server 2003 server cluster: arbitrated loops and switched fabric.

Important

When evaluating both types of Fibre Channel implementation, read the vendor’s documentation and be sure you understand the specific features and restrictions of each.

Although the term Fibre Channel implies the use of fiber-optic technology, copper coaxial cable is also allowed for interconnects.

Arbitrated Loops (FC-AL)

A Fibre Channel arbitrated loop (FC-AL) is a set of nodes and devices connected into a single loop. FC-AL provides a cost-effective way to connect up to 126 devices into a single network. As with SCSI, a maximum of two nodes is supported in an FC-AL server cluster configured with a hub. An FC-AL is illustrated in Figure 28.

Figure 28   FC-AL Connection

FC-ALs provide a solution for two nodes and a small number of devices in relatively static configurations. All devices on the loop share the media, and any packet traveling from one device to another must pass through all intermediate devices.

If your high-availability needs can be met with a two-node server cluster, an FC-AL deployment has several advantages:

  • The cost is relatively low.

  • Loops can be expanded to add storage (although nodes cannot be added).

  • Loops are easy for Fibre Channel vendors to develop.

The disadvantage is that loops can be difficult to deploy in an organization. Because every device on the loop shares the media, overall bandwidth in the cluster is lowered. Some organizations might also be unduly restricted by the 126-device limit.

Switched Fabric (FC-SW)

For any cluster larger than two nodes, a Fibre Channel switched fabric (FC-SW) is the only supported storage technology. In an FC-SW, devices are connected in a many-to-many topology using Fibre Channel switches (illustrated in Figure 29).

Figure 29   FC-SW Connection

When a node or device communicates with another node or device in an FC-SW, the source and target set up a point-to-point connection (similar to a virtual circuit) and communicate directly with each other. The fabric itself routes data from the source to the target. In an FC-SW, the media is not shared. Any device can communicate with any other device, and communication occurs at full bus speed. This is a fully scalable enterprise solution and, as such, is highly recommended for deployment with server clusters.

FC-SW is the primary technology employed in SANs. Other advantages of FC-SW include ease of deployment, the ability to support millions of devices, and switches that provide fault isolation and rerouting. Also, there is no shared media as there is in FC-AL, allowing for faster communication. However, be aware that FC-SWs can be difficult for vendors to develop, and the switches can be expensive. Vendors also have to account for interoperability issues between components from different vendors or manufacturers.

Using SANs with Server Clusters

For any large-scale cluster deployment, it is recommended that you use a SAN for data storage. Smaller SCSI and stand-alone Fibre Channel storage devices work with server clusters, but SANs provide superior fault tolerance.

A SAN is a set of interconnected devices (such as disks and tapes) and servers that are connected to a common communication and data transfer infrastructure (FC-SW, in the case of Windows Server 2003 clusters). A SAN allows multiple servers to access a pool of storage in which any server can potentially access any storage unit.

The information in this section provides an overview of using SAN technology with your Windows Server 2003 clusters. For additional information about deploying server clusters on SANs, see the Windows Clustering: Storage Area Networks link on the Web Resources page at https://www.microsoft.com/windows/reskits/webresources.

Note

Vendors that provide SAN fabric components and software management tools have a wide range of tools for setting up, configuring, monitoring, and managing the SAN fabric. Contact your SAN vendor for details about your particular SAN solution.

SCSI Resets

Earlier versions of Windows server clusters treated all communication with the shared disk as if it were on an isolated SCSI bus. This behavior can be disruptive, and it does not take advantage of the more advanced features of Fibre Channel that both improve arbitration performance and reduce disruption.

One key enhancement in Windows Server 2003 is that the Cluster service issues a command to break a RESERVATION, and the StorPort driver can do a targeted or device reset for disks that are on a Fibre Channel topology. In Windows 2000 server clusters, an entire bus-wide SCSI RESET is issued. This causes all devices on the bus to be disconnected. When a SCSI RESET is issued, a lot of time is spent resetting devices that may not need to be reset, such as disks that the CHALLENGER node may already own.

Resets in Windows Server 2003 occur in the following order:

  1. Targeted logical unit number (LUN)

  2. Targeted SCSI ID

  3. Entire bus-wide SCSI RESET

Note

Targeted resets require functionality in the host bus adapter (HBA) drivers. The driver must be written for StorPort and not SCSIPort. Drivers that use SCSIPort use the same challenge-and-defense behavior as in Windows 2000. Contact the manufacturer of the HBA to determine whether it supports StorPort.

SCSI Commands

The Cluster service uses the following SCSI commands:

  • SCSI reserve: This command is issued by a host bus adapter or controller to maintain ownership of a SCSI device. A device that is reserved refuses all commands from all other host bus adapters except the one that initially reserved it (the initiator). If a bus-wide SCSI reset command is issued, the reservation is lost.

  • SCSI release: This command is issued by the owning host bus adapter; it frees a SCSI device for another host bus adapter to reserve.

  • SCSI reset: This command breaks the reservation on a target device. This command is sometimes referred to globally as a "bus reset."

The same control codes are used for Fibre Channel as well. These commands and parameters are described in the following articles in the Microsoft Knowledge Base:

309186 How the Cluster Service Takes Ownership of a Disk on the Shared Bus

317162 Supported Fibre Channel Configurations

The following sections provide an overview of SAN concepts that directly affect a server cluster deployment.

HBAs

Host bus adapters (HBAs) are the interface cards that connect a cluster node to a SAN, similar to the way that a network adapter connects a server to a typical Ethernet network. HBAs, however, are more difficult to configure than network adapters (unless the HBAs are preconfigured by the SAN vendor). All HBAs in all nodes should be identical and at the same driver and firmware revision.
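
One way to take a quick inventory of the drivers loaded on each node is the Driverquery.exe tool that is included with Windows Server 2003; exact driver file versions and firmware revisions generally must be checked in Device Manager or with the HBA vendor’s utilities. The node name in this sketch is a placeholder.

    REM List the installed drivers on the local node (the /v switch adds status and file path details)
    driverquery /v

    REM Query another node for comparison (replace NODE2 with an actual node name)
    driverquery /s NODE2 /v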

Zoning and LUN Masking

Zoning and LUN masking are fundamental to SAN deployments, particularly as they relate to a Windows Server 2003 cluster deployment.

Zoning

Many devices and nodes can be attached to a SAN. With data stored in a single cloud, or storage entity, it is important to control which hosts have access to specific devices. Zoning allows administrators to partition devices in logical volumes and thereby reserve the devices in a volume for a server cluster. That means that all interactions between cluster nodes and devices in the logical storage volumes are isolated within the boundaries of the zone; other noncluster members of the SAN are not affected by cluster activity.

Figure 30 is a logical depiction of two SAN zones (Zone A and Zone B), each containing a storage controller (S1 and S2, respectively).

Figure 30   Zoning

In this implementation, Node A and Node B can access data from the storage controller S1, but Node C cannot. Node C can access data from storage controller S2.

Zoning needs to be implemented at the hardware level (with the controller or switch) and not through software. The primary reason is that zoning is also a security mechanism for a SAN-based cluster, because unauthorized servers cannot access devices inside the zone (access control is implemented by the switches in the fabric, so a host adapter cannot gain access to a device for which it has not been configured). With software zoning, the cluster would be left unsecured if the software component failed.

In addition to providing cluster security, zoning also limits the traffic flow within a given SAN environment. Traffic between ports is routed only to segments of the fabric that are in the same zone.

LUN Masking

A LUN is a logical disk defined within a SAN. Server clusters see LUNs and think they are physical disks. LUN masking, performed at the controller level, allows you to define relationships between LUNs and cluster nodes. Storage controllers usually provide the means for creating LUN-level access controls that allow access to a given LUN to one or more hosts. By providing this access control at the storage controller, the controller itself can enforce access policies to the devices.

LUN masking provides more specific security than zoning, because LUNs provide a means for zoning at the port level. For example, many SAN switches allow overlapping zones, which enable a storage controller to reside in multiple zones. Multiple clusters in multiple zones can share the data on those controllers. Figure 31 illustrates such a scenario.

Figure 31   Storage Controller in Multiple Zones

LUNs used by Cluster A can be masked, or hidden, from Cluster B so that only authorized users can access data on a shared storage controller.

Requirements for Deploying SANs with Windows Server 2003 Clusters

The following list highlights the deployment requirements you need to follow when using a SAN storage solution with your server cluster. For a white paper that provides more complete information about using SANs with server clusters, see the Windows Clustering: Storage Area Networks link on the Web Resources page at https://www.microsoft.com/windows/reskits/webresources.

Each cluster on a SAN must be deployed in its own zone. The mechanism the cluster uses to protect access to the disks can have an adverse effect on other clusters that are in the same zone. By using zoning to separate the cluster traffic from other cluster or noncluster traffic, there is no chance of interference.

All HBAs in a single cluster must be the same type and have the same firmware version. Many storage and switch vendors require that all HBAs on the same zone—and, in some cases, the same fabric—share these characteristics.

All storage device drivers and HBA device drivers in a cluster must have the same software version.

Never allow multiple nodes access to the same storage devices unless they are in the same cluster.

Never put tape devices into the same zone as cluster disk storage devices. A tape device could misinterpret a bus reset and rewind at inappropriate times, such as during a large backup.

Guidelines for Deploying SANs with Windows Server 2003 Server Clusters

In addition to the SAN requirements discussed in the previous section, the following practices are highly recommended for server cluster deployment:

In a highly available storage fabric, you need to deploy clustered servers with multiple HBAs. In these cases, always load the multipath driver software. If the I/O subsystem sees two HBAs, it assumes they are different buses and enumerates all the devices as though they were different devices on each bus. The host, meanwhile, is seeing multiple paths to the same disks. Failure to load the multipath driver will disable the second device because the operating system sees what it thinks are two independent disks with the same signature.

Do not expose a hardware snapshot of a clustered disk back to a node in the same cluster. Hardware snapshots must go to a server outside the server cluster. Many controllers provide snapshots at the controller level that can be exposed to the cluster as a completely separate LUN. Cluster performance is degraded when multiple devices have the same signature. If the snapshot is exposed back to the node with the original disk online, the I/O subsystem attempts to rewrite the signature. However, if the snapshot is exposed to another node in the cluster, the Cluster service does not recognize it as a different disk and the result could be data corruption. Although this is not specifically a SAN issue, the controllers that provide this functionality are typically deployed in a SAN environment.

For additional information, see the following articles in the Microsoft Knowledge Base:

301647 Cluster Service Improvements for Storage Area Networks

304415 Support for Multiple Clusters Attached to the Same SAN Device

280743 Windows Clustering and Geographically Separate Sites