Understanding Cluster Validation Tests: Storage

Applies To: Windows Server 2008

Storage tests analyze the storage to determine whether it will work correctly for a failover cluster running Windows Server 2008.

Correcting issues uncovered by storage tests

If a storage test indicates that your storage or your storage configuration will not support a failover cluster, review the following suggestions:

  • Contact your storage vendor and use the utilities provided with your cluster storage to gather information about the configuration. (In unusual cases, your storage vendor might indicate that your cluster solution is supported even though this is not reflected in the storage tests. For example, your cluster solution might have been specifically designed to work without shared storage.)

  • Review results from multiple tests in the Validate a Configuration Wizard, such as the List Host Bus Adapters test (see the topic Understanding Cluster Validation Tests: Inventory) and two tests that are described in this topic, List All Disks and List Potential Cluster Disks.

  • Look for a storage validation test that is related to the one that uncovered the issue. For example, if Validate Multiple Arbitration uncovered an issue, the related test, Validate Disk Arbitration, might provide useful information.

  • Review the storage requirements in Understanding Requirements for Failover Clusters.

    For information about hardware compatibility for Windows Server 2008, see https://go.microsoft.com/fwlink/?LinkID=59821.

  • Review the documentation for your storage, or contact the manufacturer.

Storage tests in the Validate a Configuration Wizard

You can run the following storage tests by using the Validate a Configuration Wizard:

  • List All Disks

  • List Potential Cluster Disks

  • Validate Disk Access Latency

  • Validate Disk Arbitration

  • Validate Disk Failover

  • Validate File System

  • Validate Microsoft MPIO-Based Disks

  • Validate Multiple Arbitration

  • Validate SCSI Device Vital Product Data (VPD)

  • Validate SCSI-3 Persistent Reservation

  • Validate Simultaneous Failover

List All Disks

This test lists all disks that are visible to one or more tested servers. The test lists:

  • Disks that can support clustering and can be accessed by all the servers.

  • Disks on an individual server.

The following information is listed for each disk:

  • Disk number

  • Unique identifier

  • Bus type

  • Stack type

  • Disk address (where applicable), including the port, path, target identifier (TID), and Logical Unit Number (LUN)

  • Adapter description

  • Disk characteristics such as the partition style and partition type

You can use this test to help diagnose issues uncovered by other storage tests described in this topic.

List Potential Cluster Disks

This test lists disks that can support clustering and are visible to all tested servers. To support clustering, the disk must be connected through Serial Attached SCSI (SAS), iSCSI, or Fibre Channel. In addition, the test validates that multipath I/O is working correctly, meaning that a disk reachable over more than one path is presented to the server as a single disk, not as two.

Types of disks not listed by the test

This test lists only disks that can be used for clustering. A disk that fails any one of the following criteria is not listed. The disks that the test lists must:

  • Be connected through Serial Attached SCSI (SAS), iSCSI, or Fibre Channel.

  • Be visible to all servers in the cluster.

  • Be accessed through a host bus adapter that supports clustering.

  • Not be a boot volume or system volume.

  • Not be used for paging files, hibernation, or dump files. (Dump files record the contents of memory when the system stops unexpectedly.)
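The criteria above can be read as a simple filter over a disk inventory. The following is a minimal sketch of that filter, not the wizard's implementation. The DiskRecord type, its field names, and the sample data are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

# Bus types that the text names as able to support clustering.
CLUSTER_BUS_TYPES = {"SAS", "iSCSI", "Fibre Channel"}


@dataclass(frozen=True)
class DiskRecord:
    """Hypothetical inventory record; the field names are illustrative only."""
    unique_id: str
    bus_type: str
    visible_to: frozenset            # names of the servers that can see the disk
    hba_supports_clustering: bool
    is_boot_or_system: bool
    holds_page_hibernation_or_dump: bool


def potential_cluster_disks(disks, servers):
    """Return the disks that satisfy every criterion in the list above."""
    all_servers = frozenset(servers)
    return [
        d for d in disks
        if d.bus_type in CLUSTER_BUS_TYPES
        and all_servers <= d.visible_to          # visible to every server
        and d.hba_supports_clustering
        and not d.is_boot_or_system
        and not d.holds_page_hibernation_or_dump
    ]


servers = ["node1", "node2"]
disks = [
    DiskRecord("id-a", "Fibre Channel", frozenset(servers), True, False, False),
    DiskRecord("id-b", "SATA", frozenset(["node1"]), False, True, False),
    DiskRecord("id-c", "iSCSI", frozenset(["node1"]), True, False, False),
]
eligible = potential_cluster_disks(disks, servers)
print([d.unique_id for d in eligible])
```

In this sample, only the first disk passes: the second uses an unsupported bus type and is a boot or system disk, and the third is visible to only one of the two servers.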

Validate Disk Access Latency

This test validates that the latency for disk read and write operations is within an acceptable limit for a failover cluster. If disk read and write operations take too long, one possible result is that cluster time-outs might be triggered. Another possible result is that the application attempting to access the disk might appear to have failed, and the cluster might initiate a needless failover.
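To get a rough feel for what a latency check measures, you can time small synchronous reads and writes yourself. The sketch below is an assumption-laden illustration, not the method the wizard uses: the function name, the block size, the number of rounds, and the 0.5-second limit are all hypothetical, and the scratch directory would need to be on the disk you want to examine.

```python
import os
import tempfile
import time

# Hypothetical limit for illustration; not the threshold the wizard applies.
ASSUMED_LIMIT_SECONDS = 0.5


def measure_io_latency(directory, block_size=4096, rounds=20):
    """Return (worst_write, worst_read) in seconds for small synchronous I/O."""
    payload = os.urandom(block_size)
    worst_write = 0.0
    worst_read = 0.0
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(rounds):
            os.lseek(fd, 0, os.SEEK_SET)
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # ask the OS to push the write through to the device
            worst_write = max(worst_write, time.perf_counter() - start)

            os.lseek(fd, 0, os.SEEK_SET)
            start = time.perf_counter()
            os.read(fd, block_size)  # may be served from the OS cache
            worst_read = max(worst_read, time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return worst_write, worst_read


# A temporary directory stands in for a folder on the disk under test.
with tempfile.TemporaryDirectory() as scratch:
    write_s, read_s = measure_io_latency(scratch)
within_limit = max(write_s, read_s) <= ASSUMED_LIMIT_SECONDS
print(f"worst write {write_s:.4f}s, worst read {read_s:.4f}s, ok={within_limit}")
```

The sketch tracks the worst case rather than the average because a single slow operation is enough to trigger a time-out. Read timings taken this way are likely to be optimistic, because the operating system can answer them from its cache.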

Validate Disk Arbitration

This test validates that:

  • Each of the clustered servers can use the arbitration process to become the owner of each of the cluster disks.

  • When a particular server owns a disk, if one or more other servers arbitrate for that disk, the original owner retains ownership.

If a clustered server cannot become the owner of a disk, or it cannot retain ownership when other clustered servers arbitrate for the disk, various issues might result:

  • The disk could have no owner and therefore be unavailable.

  • The disk could have two owners at the same time and therefore become corrupted.

    Failover cluster servers are designed to operate in circumstances where only one server owns a disk at a time. If multiple servers could own a disk at the same time, they might perform write operations in an uncoordinated way, possibly corrupting the disk.

  • The disk could change owners every time arbitration occurs, which would interfere with disk availability.
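The two conditions that this test validates, and the last failure mode in the list above, can be illustrated with a toy model. This is a greatly simplified sketch: ToyDisk, FlappingDisk, and arbitration_ok are hypothetical names, and real arbitration takes place through storage commands rather than in-memory objects.

```python
class ToyDisk:
    """Greatly simplified stand-in for a cluster disk."""

    def __init__(self):
        self.owner = None

    def arbitrate(self, server):
        """Grant ownership only when the disk is currently unowned."""
        if self.owner is None:
            self.owner = server
        return self.owner == server

    def release(self, server):
        if self.owner == server:
            self.owner = None


class FlappingDisk(ToyDisk):
    """Faulty variant: the most recent arbitrator always wins."""

    def arbitrate(self, server):
        self.owner = server
        return True


def arbitration_ok(disk, servers):
    """Check that each server can own the disk and keeps it when challenged."""
    for candidate in servers:
        if not disk.arbitrate(candidate):
            return False                  # candidate could not become owner
        for challenger in servers:
            if challenger != candidate:
                disk.arbitrate(challenger)
                if disk.owner != candidate:
                    return False          # owner lost the disk to a challenger
        disk.release(candidate)
    return True


nodes = ["node1", "node2", "node3"]
print(arbitration_ok(ToyDisk(), nodes))       # True
print(arbitration_ok(FlappingDisk(), nodes))  # False
```

The faulty variant fails the check because ownership moves to whichever server arbitrated last, which corresponds to a disk that changes owners every time arbitration occurs.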

Validate Disk Failover

This test validates that disk failover works correctly in the cluster. Specifically, the test validates that when a disk owned by one clustered server is failed over, the server that takes ownership of the disk can read it. The test also validates that information written to the disk before the failover is still the same after the failover.

If disk failover occurs but the server that takes ownership of a disk cannot read it, the cluster cannot maintain availability of the disk. If information written to the disk is changed during the process of failover, it could cause issues for users or software that require this information. In either case, if the affected disk is a witness disk (a disk that stores cluster configuration data and participates in quorum), such issues could cause the cluster to lose quorum and shut down.

If this test reveals that disk failover does not work correctly, the results of the following tests might help you identify the cause of the issue:

  • Validate SCSI-3 Persistent Reservation

  • Validate Disk Arbitration

  • Validate Multiple Arbitration
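The two checks that Validate Disk Failover performs (the new owner can read the disk, and the data is unchanged) follow a write, fail over, read, and compare pattern. The sketch below shows that pattern under loud assumptions: an ordinary file stands in for the disk, the failover itself is not modeled, and the function name is hypothetical.

```python
import hashlib
import os
import tempfile


def data_survives_failover(disk_path, payload):
    """Write as the old owner, then read and compare as the new owner."""
    # Old owner: write the data and record its checksum before the failover.
    with open(disk_path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())
    expected = hashlib.sha256(payload).hexdigest()

    # In a real cluster the disk would move to another server here.
    # That step is not modeled in this sketch.

    # New owner: the disk must be readable and the contents unchanged.
    try:
        with open(disk_path, "rb") as f:
            observed = f.read()
    except OSError:
        return False  # the new owner cannot read the disk
    return hashlib.sha256(observed).hexdigest() == expected


with tempfile.TemporaryDirectory() as scratch:
    stand_in_disk = os.path.join(scratch, "disk.bin")
    result = data_survives_failover(stand_in_disk, b"written before failover")
print(result)
```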

Validate File System

This test validates that the file system on disks in shared storage is supported by failover clusters.

Validate Microsoft MPIO-Based Disks

This test validates that multipath disks (Microsoft MPIO-based disks) have been configured correctly for a failover cluster.

Validate Multiple Arbitration

This test validates that when multiple clustered servers arbitrate for a cluster disk, only one server obtains ownership.

If multiple clustered servers could obtain ownership of a cluster disk through disk arbitration, the disk might become corrupted. Failover clusters are designed to operate in circumstances where only one clustered server owns a disk at a time. If multiple servers could own a disk at the same time, they might perform write operations in an uncoordinated way, possibly corrupting the disk.

If this test reveals that multiple clustered servers can obtain ownership of a cluster disk through disk arbitration, the results of the following test might help you identify the cause of the issue:

  • Validate SCSI-3 Persistent Reservation
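The property that Validate Multiple Arbitration checks (many servers arbitrate at once, exactly one wins) can be illustrated with threads racing for a lock. This is only an analogy under stated assumptions: the lock stands in for the disk reservation, the threads stand in for clustered servers, and race_for_disk is a hypothetical name.

```python
import threading


def race_for_disk(servers):
    """Let every server arbitrate at the same moment; return the winners."""
    reservation = threading.Lock()       # stands in for the disk reservation
    winners = []
    winners_guard = threading.Lock()     # protects the winners list
    start_line = threading.Barrier(len(servers))

    def arbitrate(server):
        start_line.wait()                # release all servers together
        # A non-blocking acquire succeeds for one caller only, because
        # the reservation is never released during the race.
        if reservation.acquire(blocking=False):
            with winners_guard:
                winners.append(server)

    threads = [threading.Thread(target=arbitrate, args=(s,)) for s in servers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners


winners = race_for_disk(["node1", "node2", "node3", "node4"])
print(len(winners))  # 1
```

Which server wins can differ from run to run. What matters for the property is that the list of winners always has exactly one entry.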

Validate SCSI Device Vital Product Data (VPD)

This test validates that the storage supports the necessary SCSI inquiry data (VPD descriptors) and that the descriptors reported for each disk are unique.
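The two conditions in this test (the identifying data is present, and it is unique) amount to a presence check and a duplicate check. The following minimal sketch assumes the identifiers have already been collected into a mapping. The function name, the disk labels, and the identifier strings are hypothetical, and the sketch does not issue any SCSI inquiry commands.

```python
from collections import Counter


def vpd_problems(identifiers):
    """Return (disks with no identifier, identifiers reported more than once).

    identifiers maps a disk label to the identifier it reported, or None.
    """
    missing = sorted(label for label, ident in identifiers.items() if not ident)
    counts = Counter(ident for ident in identifiers.values() if ident)
    duplicated = sorted(ident for ident, n in counts.items() if n > 1)
    return missing, duplicated


reported = {
    "disk1": "id-0001",
    "disk2": "id-0002",
    "disk3": "id-0002",   # same identifier as disk2
    "disk4": None,        # no identifier reported
}
missing, duplicated = vpd_problems(reported)
print(missing, duplicated)  # ['disk4'] ['id-0002']
```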

Validate SCSI-3 Persistent Reservation

This test validates that the cluster storage uses the Persistent Reserve commands of the more recent SCSI-3 standard, which differ from the reserve/release commands of the older SCSI-2 standard. The Persistent Reserve commands avoid SCSI bus resets, which makes them much less disruptive than the older reserve/release commands. As a result, a failover cluster can be more responsive in a variety of situations than a cluster running an earlier version of the operating system. In addition, disks are never left in an unprotected state, which lowers the risk of volume corruption.
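The idea that a disk is never left unprotected can be pictured with a toy model of registration keys and a single reservation holder. This is a greatly simplified sketch, not the SCSI-3 command set: the class, its method names, and the integer keys are hypothetical, and many details of the real standard are omitted.

```python
class ToyPersistentReservation:
    """Greatly simplified model: registered keys plus one reservation holder."""

    def __init__(self):
        self.registered = set()
        self.holder = None

    def register(self, key):
        self.registered.add(key)

    def reserve(self, key):
        """Take the reservation if it is free; only registered keys may ask."""
        if key not in self.registered:
            raise PermissionError("key is not registered")
        if self.holder is None:
            self.holder = key
        return self.holder == key

    def preempt(self, key, victim_key):
        """Remove the victim's registration and take over its reservation.

        Ownership moves in one step, so the model has no moment in which
        the disk is unreserved.
        """
        if key not in self.registered:
            raise PermissionError("key is not registered")
        self.registered.discard(victim_key)
        if self.holder == victim_key:
            self.holder = key
        return self.holder == key


pr = ToyPersistentReservation()
pr.register(1)
pr.register(2)
print(pr.reserve(1))     # True: key 1 now holds the reservation
print(pr.reserve(2))     # False: key 1 still holds it
print(pr.preempt(2, 1))  # True: key 2 takes over without a release step
```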

Validate Simultaneous Failover

This test validates that simultaneous disk failovers work correctly in the cluster. Specifically, the test validates that even when multiple disk failovers occur at the same time, any clustered server that takes ownership of a disk can read it. The test also validates that information written to each disk before a failover is still the same after the failover.

If disk failover occurs but the server that takes ownership of a disk cannot read it, the cluster cannot maintain availability of the disk. If information written to the disk is changed during the process of failover, it could cause issues for users or software that require this information. In either case, if the affected disk is a witness disk (a disk that stores cluster configuration data and participates in quorum), such issues could cause the cluster to lose quorum and shut down.

If this test reveals that simultaneous disk failover does not work correctly, the results of the following tests might help you identify the cause of the issue:

  • Validate Disk Failover

  • Validate Disk Arbitration

  • Validate Multiple Arbitration
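Validate Simultaneous Failover applies the write, read, and compare check to several disks at the same time. The sketch below runs that check concurrently under the same kind of assumptions as any file-based stand-in: ordinary files represent the disks, the failover itself is not modeled, and the function names are hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_then_verify(path):
    """Write random data, then read it back and compare."""
    payload = os.urandom(1024)
    with open(path, "wb") as f:
        f.write(payload)
    # In a real cluster the disk would fail over to another server here.
    with open(path, "rb") as f:
        return f.read() == payload


def simultaneous_check(paths):
    """Run the check for every stand-in disk at the same time."""
    with ThreadPoolExecutor(max_workers=len(paths)) as pool:
        return dict(zip(paths, pool.map(write_then_verify, paths)))


with tempfile.TemporaryDirectory() as scratch:
    stand_ins = [os.path.join(scratch, f"disk{i}.bin") for i in range(4)]
    results = simultaneous_check(stand_ins)
print(all(results.values()))
```

Reporting one result per disk, rather than a single pass or fail value, makes it easier to see which disk had trouble when several fail over together.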

Additional references