Troubleshooting Quorum Resource Problems

Applies To: Windows Server 2003, Windows Server 2003 R2, Windows Server 2003 with SP1, Windows Server 2003 with SP2

This section provides troubleshooting information about a server cluster for which all of the following are true:

  • The cluster uses a single quorum device (storage that is attached to all nodes of the cluster). In other words, the cluster is not a single node cluster and is not a majority node set cluster.

  • The Cluster service will not start on any node.

    Note that in some situations, the Cluster service might start despite a problem with the quorum resource, and you can use this guide for troubleshooting in those situations. For example, a cluster in which the latest service pack has been applied might have repeated instances of Event ID 1147 related to the quorum, yet the Cluster service might be able to start on the nodes.

  • The problem seems as if it might be related to the quorum resource. For more information, see Distinguishing Quorum Problems from Other Problems.

If you already have some understanding of the problem with the quorum resource, go directly to the troubleshooting topic that matches it. If not, use the flowchart and other information in this topic to determine which troubleshooting topics apply to your situation.

Sections in This Topic

This topic provides the following sections to help you begin troubleshooting:

  • Distinguishing Quorum Problems from Other Problems

  • Flowchart for Troubleshooting the Quorum Resource

  • Understanding the Fixquorum Option for Starting the Cluster Service

Distinguishing Quorum Problems from Other Problems

It can be challenging to analyze a problem with starting the Cluster service and determine whether it is related in some way to the quorum resource. To try to distinguish different types of problems, review the following table:

Area Explanation

Cluster service account: On the nodes on which you are trying to start the Cluster service, are there problems with the Cluster service account?

Also, if you have Windows Server 2003 with the latest service pack, does Event ID 7041 appear in the system event log? This event was added in Windows Server 2003 Service Pack 1 and indicates that the Cluster service account does not have all the necessary user rights, such as the Log on as a service user right.

If no domain controller is available to authenticate the Cluster service account, the Cluster service cannot start. Other problems with the account can also prevent the Cluster service from starting, for example, if the password was allowed to expire or the account does not have the necessary rights, possibly as a result of a Group Policy setting.

To check for this type of problem, try to log on to the computer with the Cluster service account, or check the system event log for messages that indicate that the Cluster service account cannot log on. Also, review the Cluster service account to make sure that it has the permissions and rights described in Change the Account Under Which the Cluster Service Runs at the Microsoft Web site.

Cluster log: Is the cluster log read-only? Alternatively, is a policy preventing the cluster log from being modified?

If the cluster log cannot be modified, the Cluster service cannot start.

By default, the cluster log is called Cluster.log and is located in %SystemRoot%\Cluster. You can change the default name by setting the system environment variable called ClusterLog.

Events in event log: Do event log messages appear to be consistent with a quorum resource problem?

Specific events can indicate that a problem with the quorum is preventing cluster startup. However, an event can indicate a symptom rather than a root cause, so all events should be interpreted in context.

See the next section for information about some events that can indicate problems with the quorum resource.

For more information about troubleshooting problems with starting the Cluster service, see article 266274, "How to Troubleshoot Cluster Service Startup Issues" in the Microsoft Knowledge Base.

If your startup issue appears to be related in some way to the quorum resource, see Flowchart for Troubleshooting the Quorum Resource.

Event Messages That Are Consistent with Quorum Resource Problems

The following list describes some common event log messages that are consistent with quorum resource problems.

Important

If your event log contains any of the messages in the following list, be sure to check that the cables for the storage are not damaged or disconnected. Also, follow the other recommendations in Verifying Permissions, Hardware, and Software Before Troubleshooting the Quorum Resource.

  • 1034: The disk associated with cluster disk resource DriveLetter could not be found. The expected signature of the disk was DiskSignature.

    This error can result when the disk signature of the quorum disk has been inadvertently changed. This can happen when you make changes through a disk or storage utility, or when you recreate the LUN containing the quorum resource. Note that the cluster identifies the quorum resource and other disk resources by disk signature, not just drive letter.

    Follow the flowchart in this topic, and see if The Quorum Resource is on an Inaccessible or Nonfunctioning Disk applies to your situation. Consider whether Clusterrecovery.exe, which is described in Configuring a Computer for Troubleshooting the Quorum Resource in a Server Cluster, would be useful for your situation.

  • 1035: Cluster disk resource DriveLetter could not be mounted.

    Follow the flowchart in this topic, and see if The Quorum Resource is on an Inaccessible or Nonfunctioning Disk applies to your situation.

  • 1066: Cluster disk resource DriveLetter is corrupt. Run ChkDsk /F to repair problems.

    This error might indicate that the volume used for the quorum resource has a file system corruption problem, possibly a transient problem. Other events in the logs might help with diagnosis.

    Follow the flowchart in this topic, and see if The Quorum Resource is on an Inaccessible or Nonfunctioning Disk applies to your situation.

  • 1069: Cluster disk resource DriveLetter failed.

    This error might indicate that the disk used for the quorum resource has a problem. Other events in the logs might help with diagnosis.

    Follow the flowchart in this topic, and see if The Quorum Resource is on an Inaccessible or Nonfunctioning Disk applies to your situation.

  • 1147 or 1148:

    The Microsoft Clustering Service encountered a fatal error. The vital quorum log file 'q:\MSCS\quolog.log' could not be found. (Additional text is provided in this event.)

    If you have Windows Server 2003 with the latest service pack, you might see this event and yet find that the Cluster service started. When you apply the latest service pack, if the quorum log and cluster configuration files are missing from the quorum resource, the operating system can replace these files with information from a node.

    Follow the flowchart in this topic, and see if Files on the Cluster Quorum Might be Missing, Inaccessible, or Corrupt applies to your situation.
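
The event list above can be kept as a small lookup table when you review an exported event log. The following Python sketch is only an illustration: the function name topics_for_events is hypothetical, and the mapping contains nothing beyond the event IDs and topic titles listed in this topic. Because an event can indicate a symptom rather than a root cause, treat the result as a pointer into the flowchart, not as a diagnosis.

```python
# Lookup table built only from the event list in this topic.
INACCESSIBLE_DISK = "The Quorum Resource is on an Inaccessible or Nonfunctioning Disk"
MISSING_FILES = "Files on the Cluster Quorum Might be Missing, Inaccessible, or Corrupt"

QUORUM_EVENT_TOPICS = {
    1034: INACCESSIBLE_DISK,
    1035: INACCESSIBLE_DISK,
    1066: INACCESSIBLE_DISK,
    1069: INACCESSIBLE_DISK,
    1147: MISSING_FILES,
    1148: MISSING_FILES,
}


def topics_for_events(event_ids):
    """Return the topics suggested by the given event IDs, in first-seen order.

    Hypothetical helper: event IDs that are not in the list above are ignored.
    """
    topics = []
    for event_id in event_ids:
        topic = QUORUM_EVENT_TOPICS.get(event_id)
        if topic is not None and topic not in topics:
            topics.append(topic)
    return topics
```

For example, calling topics_for_events([1069, 1147]) returns both topic titles, which suggests checking the disk first and then the quorum files.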

If you want to find out more details about a specific event that relates to the cluster, one source of information is the cluster log. The cluster log may be easiest to use when you already have a general sense of a problem and are looking for details. By default, the cluster log is called Cluster.log and is located in %SystemRoot%\Cluster. The log might have a different name, because the default name can be changed by setting the system environment variable called ClusterLog.
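
If you are not sure where the cluster log is on a node, a short script can resolve the path and check whether the file is flagged read-only. The following Python sketch makes several assumptions: the helper names are hypothetical, the ClusterLog variable is assumed to hold the full path to the log, and C:\WINDOWS is only an assumed fallback when SystemRoot is not set. The writability check looks at the file itself, so it does not detect a policy that prevents the log from being modified.

```python
import ntpath
import os


def cluster_log_path(env=None):
    """Return the expected cluster log path (hypothetical helper).

    Assumes that ClusterLog, when set, contains the full path to the log.
    Otherwise falls back to the default location under SystemRoot.
    """
    env = os.environ if env is None else env
    override = env.get("ClusterLog")
    if override:
        return override
    # C:\WINDOWS is an assumed fallback, not a value taken from this topic.
    system_root = env.get("SystemRoot", r"C:\WINDOWS")
    return ntpath.join(system_root, "Cluster", "Cluster.log")


def log_is_writable(path):
    """Return True if the file exists and the current process may write to it."""
    return os.path.isfile(path) and os.access(path, os.W_OK)
```

Run the check on the node itself, for example log_is_writable(cluster_log_path()), because the environment variables are read from the computer on which the script runs.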

Flowchart for Troubleshooting the Quorum Resource

Use the following flowchart and the topics to which it refers to troubleshoot quorum resource problems. For information about the /fixquorum option mentioned in the flowchart, see Understanding the Fixquorum Option for Starting the Cluster Service, later in this topic.

[Figure: Flowchart for troubleshooting the quorum resource]

Understanding the Fixquorum Option for Starting the Cluster Service

The troubleshooting techniques shown in the flowchart in this topic use the /fixquorum option for starting the Cluster service. It is useful to understand the following details about the /fixquorum option:

  • When the /fixquorum option is used on a particular node, the Cluster service starts on that node, but the resources in the group that contains the quorum disk (usually the Cluster Group) remain offline. This allows you to attempt bringing the quorum resource online manually so that you can more easily diagnose problems with the quorum. Note that if this attempt to bring the quorum resource online fails, the Cluster service will stop again, even though it was started with the /fixquorum option.

  • Only one node at a time can be started with the /fixquorum option. You cannot join any other nodes to the node that was started with this option. When you have corrected the problem that you were having with the quorum resource, stop the Cluster service and restart it without options.

  • You can start the Cluster service with the /fixquorum option by opening a command prompt and typing:

    net start clussvc /fixquorum

    Or by typing:

    net start clussvc /fq

    You can also start the Cluster service with the /fixquorum option when using Services in Computer Management. To open Services, click Start, click Control Panel, double-click Administrative Tools, and then double-click Services. On a cluster node, the Cluster service will appear in the list of services. Right-click the service, click Properties, type /fixquorum in Start parameters, and then click Start.
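
If you script these steps, it can help to keep the exact command lines in one place. The following Python sketch is a minimal illustration, not a supported tool: the function names are hypothetical, and the use of net stop clussvc to stop the service before the normal restart is an assumption (this topic only says to stop the Cluster service and restart it without options). Nothing runs until you call run on a cluster node.

```python
import subprocess

SERVICE = "clussvc"  # Service name used in the commands shown in this topic.


def fixquorum_start_command(short=False):
    """Build the command that starts the Cluster service with /fixquorum (or /fq)."""
    return ["net", "start", SERVICE, "/fq" if short else "/fixquorum"]


def normal_restart_commands():
    """Build the commands to stop the service and start it again without options.

    Assumption: the service is stopped with "net stop".
    """
    return [["net", "stop", SERVICE], ["net", "start", SERVICE]]


def run(command):
    """Run one command on the local node; raises CalledProcessError on failure."""
    subprocess.run(command, check=True)
```

A typical sequence is to call run(fixquorum_start_command()) on one node only, correct the quorum problem, and then call run on each command returned by normal_restart_commands().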