Managing Compute Nodes

Applies To: Windows Compute Cluster Server 2003

Compute nodes are managed from the Node Management page of Compute Cluster Administrator.

You monitor the operational status of each compute node from the Node Management page. From this page, you can also perform cluster-specific tasks, called actions, on one or more nodes, such as pausing and resuming job execution on a node or approving nodes for inclusion in the cluster.

Compute Cluster Server 2003 prevents users from adding computers to a cluster on their own by requiring that a cluster administrator approve each node for inclusion in the cluster. All new nodes are added to the cluster in the PendingApproval state and remain in that state until the cluster administrator approves them. An approved node moves to the Paused state and must be resumed before it can receive jobs.

Node status

A node can have one of five statuses: Configuring, PendingApproval, Paused, Ready, or Unreachable. Each status is defined in the following table.

Node Status Description

Configuring

This state indicates that Compute Cluster Pack setup is being run on the node.

PendingApproval

This is the default state after Setup is run successfully on a node. The cluster administrator must approve or reject the node for inclusion in the cluster. When approved, the node moves to the Paused state.

Paused

The Paused state allows a cluster administrator to run scripts, install software, and perform other tasks on the node. This is the default state of a node after a cluster administrator has approved the node for inclusion in the cluster. If a node is placed in a paused state while running jobs, those jobs will run to completion, but no new jobs will be scheduled for this node. Resuming a node places it in the Ready state.

Ready

The Ready state is the normal operating node state. A node with this status runs jobs and receives new jobs from the Job Scheduler.

Unreachable

The head node sends out one heartbeat per minute to check on the cluster nodes. If the head node has been unable to contact the node for three minutes, it places the node in the Unreachable state.
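The Unreachable rule described above can be sketched as a simple timeout check. This is only an illustration of the stated timing (one heartbeat per minute, Unreachable after three minutes without contact). The NodeRecord type and the refresh_status function are hypothetical names and are not part of Compute Cluster Pack.

```python
from dataclasses import dataclass

HEARTBEAT_INTERVAL_S = 60   # one heartbeat per minute (from the text)
UNREACHABLE_AFTER_S = 180   # three minutes without contact (from the text)


@dataclass
class NodeRecord:
    """Hypothetical bookkeeping record; not a Compute Cluster Pack type."""
    name: str
    status: str
    last_contact_s: float  # time of the last successful contact, in seconds


def refresh_status(node: NodeRecord, now_s: float) -> str:
    """Mark the node Unreachable once it has been silent for three minutes."""
    if now_s - node.last_contact_s >= UNREACHABLE_AFTER_S:
        node.status = "Unreachable"
    return node.status


node = NodeRecord(name="NODE01", status="Ready", last_contact_s=0.0)
print(refresh_status(node, now_s=2 * HEARTBEAT_INTERVAL_S))  # Ready
print(refresh_status(node, now_s=3 * HEARTBEAT_INTERVAL_S))  # Unreachable
```

The sketch does not model what happens when contact is restored, because the text does not describe it.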

Node actions

After nodes are added to your cluster, you manage them from the Node Management page of Compute Cluster Administrator. To perform an action on one or more nodes, select the nodes and then choose an action from the Action pane, or select the nodes, right-click, and choose an action. The actions that you can perform are:

  • Start a Remote Desktop Session

  • Pause one or more running nodes

  • Resume a paused node

  • Reboot selected nodes

  • Approve a node with PendingApproval status

  • Add or remove a node from the cluster

  • Identify nodes by causing the CD drive on each node to eject

  • Execute a command on a selected node

  • Run System Monitor on a selected node

  • Open Compute Cluster Job Manager on a selected node

The following table lists the actions that a cluster administrator can perform on a node, the node status required for the action to succeed, and when the action can be performed.

Action Node Status When the action can be performed

Approve

PendingApproval

Node must be PendingApproval for the Approve action to succeed.

Nodes in PendingApproval state do not receive jobs.

Nodes that are pending approval do not receive updated user and administrator group information.

If the node is not pending approval, then the Approve action will not be accessible.

Pause

Ready

A running node can be paused at any time. All running jobs continue to run after the node has been paused. However, no new jobs are scheduled on a paused node. If the node is not Ready, the Pause action is dimmed.

Resume

Paused

A paused node can be resumed at any time. After the Resume action is run, the node status becomes Ready.

After a node becomes Ready, Job Scheduler can begin scheduling new jobs.

If the node is not Paused, then the Resume action will be inaccessible.

Remove

Ready, Paused, Unreachable, Configuring, PendingApproval

This action runs the Remove Node wizard.

You can remove nodes as long as they are not running jobs. When a node is removed, entries for this node are removed from both Job Scheduler and the compute cluster management infrastructure.

Note that removing a node does not delete the node computer account from Active Directory, nor does it remove the Compute Cluster Pack software from the hard drive of the node.

Open Remote Desktop Connection

Ready, Paused, Configuring, Unreachable, PendingApproval

This action can be run at any time, even when the node is Unreachable. This enables the cluster administrator to create a remote desktop connection to a node at any time to perform tasks such as checking for stopped services.

Identify node

Ready, Paused, Configuring, PendingApproval

This action cannot be run against Unreachable nodes.

Restart

Paused, Configuring, PendingApproval

This action cannot be run against Unreachable nodes.

This action cannot be run against Ready nodes.

Run a Command

Paused, Configuring, PendingApproval, Ready

This action cannot be run against Unreachable nodes.

Open System Monitor

Ready, Paused, PendingApproval

This action cannot be run against Unreachable nodes.

Open Event Viewer

Ready, Paused, Configuring, PendingApproval

This action cannot be run against Unreachable nodes.
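The action table above can be read as a lookup from each action to the node statuses in which it is available. The following sketch transcribes that table into a Python dictionary and adds the three status changes the text states explicitly (Approve and Pause lead to Paused, Resume leads to Ready). The names ALLOWED_STATUS, RESULTING_STATUS, available_actions, and apply_action are illustrative only and do not correspond to any Compute Cluster Pack API.

```python
ALL_STATUSES = {"Configuring", "PendingApproval", "Paused", "Ready", "Unreachable"}

# Required node status per action, transcribed from the table above.
ALLOWED_STATUS = {
    "Approve": {"PendingApproval"},
    "Pause": {"Ready"},
    "Resume": {"Paused"},
    # Remove additionally requires that the node is not running jobs.
    "Remove": set(ALL_STATUSES),
    "Open Remote Desktop Connection": set(ALL_STATUSES),
    "Identify node": {"Ready", "Paused", "Configuring", "PendingApproval"},
    "Restart": {"Paused", "Configuring", "PendingApproval"},
    "Run a Command": {"Paused", "Configuring", "PendingApproval", "Ready"},
    "Open System Monitor": {"Ready", "Paused", "PendingApproval"},
    "Open Event Viewer": {"Ready", "Paused", "Configuring", "PendingApproval"},
}

# Status changes that the text states explicitly; other actions are
# treated here as leaving the status unchanged (an assumption).
RESULTING_STATUS = {"Approve": "Paused", "Pause": "Paused", "Resume": "Ready"}


def available_actions(status: str) -> list[str]:
    """Return the actions the table allows for a node in the given status."""
    return sorted(a for a, allowed in ALLOWED_STATUS.items() if status in allowed)


def apply_action(action: str, status: str) -> str:
    """Return the node status after the action, or raise if it is not allowed."""
    if status not in ALLOWED_STATUS[action]:
        raise ValueError(f"{action} is not available for a {status} node")
    return RESULTING_STATUS.get(action, status)


# A new node is approved (Paused) and then resumed (Ready).
status = apply_action("Approve", "PendingApproval")
status = apply_action("Resume", status)
print(status)                            # Ready
print(available_actions("Unreachable"))  # ['Open Remote Desktop Connection', 'Remove']
```

Node removal itself is not modeled, because a removed node no longer has a status in the cluster.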

Limitations for nodes pending approval

A node that has a status of PendingApproval does not receive updated cluster user or administrator information.

Because the version of Remote Installation Services (RIS) included in Windows Server 2003, Compute Cluster Edition disables the local Administrator account on a compute node, you cannot log on to a node with this status as the local administrator. A domain user can log on to the node but cannot install software or patches, change settings, or perform any other activities that require administrator rights. A cluster administrator can perform these tasks only after the node has been approved for inclusion in the cluster.

Limitations on node actions when using Compute Cluster Administrator remotely

In two of the five network topologies supported by Windows Compute Cluster Server 2003, the compute nodes do not have public network interfaces. In those topologies (scenarios one and three), the compute nodes can be accessed only through the head node. As a result, there are limitations on the actions that cluster administrators can perform when running Compute Cluster Administrator from a remote workstation.

When you run Compute Cluster Administrator remotely, the following actions are unavailable; they can be performed only from the cluster head node:

  • Install or remove RIS from the cluster head node.

  • Add and remove RIS installation images from the cluster head node.

  • Use the Automated method of adding compute nodes.

  • Run System Monitor sessions on compute nodes to obtain performance information.

  • Create Remote Desktop Sessions to cluster compute nodes.

  • Reboot cluster compute nodes.

  • Identify cluster nodes.

All other administrative actions, such as approving or pausing a compute node, can be done remotely.
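As a quick reference, the restrictions listed above can be expressed as a membership check. This is a minimal sketch: the task strings are shortened labels for the list items, and HEAD_NODE_ONLY and can_perform are hypothetical names chosen for illustration. It assumes that every task not in the list is available remotely, as the text states, and it does not model the network topology even though the text introduces these limits in the context of scenarios one and three.

```python
# Tasks that the list above restricts to the cluster head node.
HEAD_NODE_ONLY = {
    "install or remove RIS",
    "add or remove RIS installation images",
    "add compute nodes (Automated method)",
    "run System Monitor on compute nodes",
    "create Remote Desktop sessions to compute nodes",
    "reboot compute nodes",
    "identify nodes",
}


def can_perform(task: str, on_head_node: bool) -> bool:
    """Return True if the task is available from the current console."""
    return on_head_node or task not in HEAD_NODE_ONLY


print(can_perform("reboot compute nodes", on_head_node=False))  # False
print(can_perform("approve a node", on_head_node=False))        # True
```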

See Also

Concepts

Compute Cluster Administrator Operations