Step 4: Add Compute Nodes to the Cluster

Applies To: Windows HPC Server 2008

Windows HPC Server 2008 simplifies compute node deployment by providing automatic node imaging, automatic node naming, and other capabilities that streamline deployment tasks. It also provides tools that you can use to monitor the progress of your deployment.

Important

Unlike the previous version of the product (Windows Compute Cluster Server 2003), the default in Windows HPC Server 2008 is to respond only to Pre-Boot Execution Environment (PXE) requests that come from existing compute nodes. The Add Node Wizard changes this default setting automatically when you use it to add nodes from bare metal. You can also change this setting manually in the Options menu, under Deployment Settings.
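
The two PXE response settings can be pictured as a simple decision rule. The following Python sketch only illustrates that rule and is not part of HPC Pack. The names `PxePolicy` and `should_respond` are hypothetical, and identifying a requester by its MAC address is an assumption, because this guide does not state how the head node recognizes existing compute nodes.

```python
from enum import Enum


class PxePolicy(Enum):
    """The two settings offered under Deployment Settings (names are hypothetical)."""
    EXISTING_NODES_ONLY = "existing"  # the default described above
    ALL_REQUESTS = "all"              # needed when adding nodes from bare metal


def should_respond(policy, requester_id, known_node_ids):
    """Decide whether a PXE request would be answered under the given policy.

    requester_id is assumed to be a MAC address string; known_node_ids is the
    set of identifiers of compute nodes that already belong to the cluster.
    """
    if policy is PxePolicy.ALL_REQUESTS:
        return True
    return requester_id in known_node_ids


# Example: a new computer is ignored under the default policy.
known = {"00-11-22-33-44-55"}
print(should_respond(PxePolicy.EXISTING_NODES_ONLY, "AA-BB-CC-DD-EE-FF", known))
print(should_respond(PxePolicy.ALL_REQUESTS, "AA-BB-CC-DD-EE-FF", known))
```

Under this model, a bare-metal computer can be added only while the policy is set to respond to all requests, which is why the wizard changes the setting for you.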

After you create a node template, you can use the Add Node Wizard to add compute nodes to your HPC cluster. You can add compute nodes to your cluster in three ways:

  • Deploy compute nodes from bare metal

  • Add compute nodes by importing a node XML file

  • Add preconfigured compute nodes

For more information about each of these three node deployment options, see “1.2. Decide how to add compute nodes to your cluster” in Step 1: Prepare for Your Deployment.

Important

Ensure that nothing can restart or shut down the head node during the deployment process of compute nodes, or the deployment might fail. For example, temporarily disable automatic updates on the head node.

In this section:

  • 4.1. Deploy compute nodes from bare metal

  • 4.2. Add compute nodes by importing a node XML file

  • 4.3. Add preconfigured compute nodes

  • 4.4. Monitor deployment progress

  • 4.5. Cancel the deployment of a node

4.1. Deploy compute nodes from bare metal

The following procedure describes how to add compute nodes to your HPC cluster from bare metal, by using a node template that includes a step to deploy an operating system image.

Important

To complete this procedure, you must have a node template that includes a step to deploy an operating system image. If you do not have such a template, create one by following the steps in “3.4. Create a node template”, in Step 3: Configure the Head Node.

Important

Before you turn on a compute node for this procedure, check the BIOS configuration of that computer. Verify that the compute node will boot from the network adapter that is connected to the private network, rather than from the local hard drive or another device, and that Pre-boot Execution Environment (PXE) boot is enabled for that network adapter.

To deploy compute nodes from bare metal

  1. If HPC Cluster Manager is not already open on the head node, open it. Click Start, point to All Programs, click Microsoft HPC Pack, and then click HPC Cluster Manager.

  2. In Node Management, in the Actions pane, click Add Node. The Add Node Wizard appears.

  3. On the Select Deployment Method page, click Deploy compute nodes from bare metal using an operating system image, and then click Next.

  4. On the Select New Nodes page, in the Node template list, click the name of a node template that includes a step to deploy an operating system image.

  5. Turn on the computers that you want to add as compute nodes to your cluster. Computers are listed in the Add Node Wizard as they contact the head node during PXE boot. They are named according to the naming series that you specified when you configured the head node. For more information, see “3.3. Configure the naming of new nodes” in Step 3: Configure the Head Node.

  6. When all computers that you have turned on are listed, click Select all, and then click Deploy. If the list includes a node that you do not want to deploy at this time, clear its selection before you click Deploy.

  7. On the Completing the Add Node Wizard page, if you will be deploying more nodes, click Continue responding to all PXE requests. If you will not be deploying more nodes, click Respond only to PXE requests that come from existing compute nodes.

  8. To monitor deployment progress, select the Go to Node Management to track progress check box, and then click Finish. For more information, see 4.4. Monitor deployment progress.

4.2. Add compute nodes by importing a node XML file

The following procedure describes how to add compute nodes by importing a node XML file.

Important

To complete this procedure, you must have a valid node XML file. For more information, see Appendix 2: Creating a Node XML File.
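
Before you import a file, it can help to confirm that it is at least well-formed XML. The following Python sketch is not part of HPC Pack and uses only the Python standard library. The function name `check_well_formed` is hypothetical, and the sample markup is placeholder content, not the node XML schema. The check covers well-formedness only; it does not validate a file against the schema described in Appendix 2.

```python
import xml.etree.ElementTree as ET


def check_well_formed(data):
    """Return (True, root tag) if data parses as XML, else (False, error text).

    data may be a str or bytes object, for example the contents of a
    node XML file read from disk.
    """
    try:
        root = ET.fromstring(data)
    except ET.ParseError as err:
        return False, str(err)
    return True, root.tag


# Placeholder markup only; these element names are NOT the node XML schema.
good = "<example><item/></example>"
bad = "<example><item></example>"  # <item> is never closed

print(check_well_formed(good))
print(check_well_formed(bad)[0])
```

To check a real file, read its contents in binary mode and pass them to the function. A file that fails this check cannot be a valid node XML file, but a file that passes might still be rejected by the wizard.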

To add compute nodes by importing a node XML file

  1. If HPC Cluster Manager is not already open on the head node, open it. Click Start, point to All Programs, click Microsoft HPC Pack, and then click HPC Cluster Manager.

  2. In Node Management, in the Actions pane, click Add Node. The Add Node Wizard appears.

  3. On the Select Deployment Method page, click Import compute nodes from a node XML file, and then click Next.

  4. On the Select Node XML File page, type or browse to the location of the node XML file, and then click Import.

  5. To monitor deployment progress, on the Completing the Add Node Wizard page, select the Go to Node Management to track progress check box, and then click Finish. For more information, see 4.4. Monitor deployment progress.

4.3. Add preconfigured compute nodes

A preconfigured compute node is a computer that has HPC Pack 2008 already installed and that is connected to the HPC cluster network according to the network topology that you have chosen for your cluster. After HPC Pack 2008 is installed on all the compute nodes that you want to add to your cluster, you can use the Add Node Wizard on the head node to add the preconfigured nodes to your cluster.

The following procedures describe how to add preconfigured compute nodes to your HPC cluster. The first procedure describes how to install HPC Pack 2008 on the computers that will act as compute nodes, and the second procedure describes how to add the preconfigured compute nodes to the cluster.

Important

The computers that you will add to your cluster as preconfigured compute nodes must already be running Windows Server® 2008 HPC Edition, or another 64-bit edition of the Windows Server 2008 operating system.

Important

We strongly recommend that you perform a clean installation of Windows Server 2008 before installing HPC Pack 2008. If you want to install HPC Pack 2008 on an existing installation of Windows Server 2008, remove all server roles first and then follow the procedures in this guide.

Important

To complete this procedure, you must have a node template that does not include a step to deploy an operating system image. If you do not have such a template, create one by following the steps in “3.4. Create a node template”, in Step 3: Configure the Head Node.

To install HPC Pack 2008 on a compute node computer

  1. To start the HPC Pack 2008 installation wizard on the computer that will act as a compute node, run setup.exe from the HPC Pack 2008 installation media or from a network location.

  2. On the Getting Started page, click Next.

  3. On the Microsoft Software License Terms page, read or print the software license terms in the license agreement, and accept or reject the terms of that agreement. If you accept the terms, click Next.

  4. On the Select Installation Type page, click Join an existing HPC cluster by creating a new compute node, and then click Next.

  5. On the Join Cluster page, type the computer name of the head node on your cluster, and then click Next.

  6. On the Select Installation Location page, click Next.

  7. On the Install Required Components page, click Install.

  8. On the Installation Complete page, click Close.

After HPC Pack 2008 is installed on all the compute nodes that you want to add to your cluster, follow the steps in the Add Node Wizard on the head node to add the preconfigured nodes to your cluster.

To add preconfigured compute nodes to your cluster

  1. If HPC Cluster Manager is not already open on the head node, open it. Click Start, point to All Programs, click Microsoft HPC Pack, and then click HPC Cluster Manager.

  2. In Node Management, in the Actions pane, click Add Node. The Add Node Wizard appears.

  3. On the Select Deployment Method page, click Add compute nodes that have already been configured, and then click Next.

  4. Turn on all the preconfigured nodes that you want to add to your cluster.

  5. After all the preconfigured nodes are turned on, on the Before Deploying page, click Next.

  6. On the Select New Nodes page, in the Node template list, click the name of a node template that does not include a step to deploy an operating system image.

  7. Select the preconfigured compute nodes that you want to add to your cluster. To select all the preconfigured compute nodes, click Select all.

  8. To add the selected compute nodes to your cluster, click Add.

  9. On the Completing the Add Node Wizard page, if you will be deploying more nodes, click Continue responding to all PXE requests. If you will not be deploying more nodes, click Respond only to PXE requests that come from existing compute nodes.

  10. To monitor deployment progress, select the Go to Node Management to track progress check box, and then click Finish. For more information, see 4.4. Monitor deployment progress.

4.4. Monitor deployment progress

During the deployment process of a compute node, its state is set to Provisioning. After the deployment process is complete, the state changes to Offline. You must bring compute nodes online before they can process jobs.

You can monitor the progress of the deployment process of compute nodes in Node Management, and bring online nodes that have finished deploying. You can also see detailed information for each deployment operation, and any errors that may have occurred.
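
The node states that this section and section 4.5 describe form a small lifecycle. The following Python sketch models it for illustration only. The `Node` class and its methods are hypothetical and are not an HPC Pack API. The state name Online is inferred from the Bring Online action, and the health value of a successfully deployed node is left unset because this guide does not name it.

```python
from dataclasses import dataclass
from typing import Optional

# State and health names as used in this guide ("Online" is inferred).
PROVISIONING = "Provisioning"
OFFLINE = "Offline"
ONLINE = "Online"
UNKNOWN = "Unknown"
PROVISIONING_FAILED = "Provisioning Failed"


@dataclass
class Node:
    name: str
    state: str = PROVISIONING     # a node starts out being deployed
    health: Optional[str] = None  # not specified here for healthy nodes

    def _require(self, expected):
        if self.state != expected:
            raise ValueError(f"{self.name} is {self.state}, expected {expected}")

    def finish_deployment(self):
        """Deployment completed: Provisioning -> Offline."""
        self._require(PROVISIONING)
        self.state = OFFLINE

    def fail_or_cancel(self):
        """Deployment failed or was canceled: Provisioning -> Unknown."""
        self._require(PROVISIONING)
        self.state = UNKNOWN
        self.health = PROVISIONING_FAILED

    def bring_online(self):
        """Only an Offline node can be brought online."""
        self._require(OFFLINE)
        self.state = ONLINE

    def can_run_jobs(self):
        return self.state == ONLINE


node = Node("NODE001")  # hypothetical node name
node.finish_deployment()
print(node.state, node.can_run_jobs())
node.bring_online()
print(node.state, node.can_run_jobs())
```

The sketch makes one point from the text explicit: a node that has finished deploying still cannot process jobs until you bring it online.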

To monitor deployment progress

  1. If HPC Cluster Manager is not already open on the head node, open it. Click Start, point to All Programs, click Microsoft HPC Pack, and then click HPC Cluster Manager.

  2. To view information about the deployment operations:

    1. In Node Management, in the Navigation Pane, click Operations.

    2. To view more information about a specific operation, click that operation. The Detail Pane will list the log entries for that operation.

  3. To view the list of compute nodes that are currently being deployed:

    1. In Node Management, in the Navigation Pane, under Nodes, under By State, click Provisioning.

    2. To view the list of operations related to the deployment of a specific node, double-click that node, and then click the Operations tab.

  4. To bring online the nodes that have finished deploying:

    1. In Node Management, in the Navigation Pane, under Nodes, under By State, click Offline.

    2. Select all the nodes that you want to bring online. To select all nodes that are currently offline, on the list of offline nodes, click any node and then press CTRL+A.

    3. In the Actions pane, click Bring Online.

  5. If the deployment process of a compute node fails, the state of that node is set to Unknown and the health is set to Provisioning Failed. To determine the reason for the failure, review the provisioning log for that node and the list of operations that were performed:

    1. In Node Management, in the Navigation Pane, under Nodes, under By Health, click Provisioning Failed.

    2. To review the provisioning log for a node, in the views pane, click the node, and then in the Detail Pane, click the Provisioning Log tab.

    3. To view the list of operations related to the deployment failure, on the Properties tab, click View operations. The pivoted view in Node Management will list all the operations related to that node.

    4. To view more information about a specific operation, click that operation. The Detail Pane will list the log entries for that operation.

4.5. Cancel the deployment of a node

You can stop the deployment process of a compute node from HPC Cluster Manager by canceling the provisioning operations.

To cancel the deployment of a node

  1. To view only compute nodes that are currently being deployed, in Node Management, in the Navigation Pane, under Nodes, under By State, click Provisioning.

  2. In the views pane, click the node that you want to stop deploying.

  3. To cancel the provisioning operations, in the Detail Pane, on the Properties tab, click Cancel Operations. The deployment process will stop, the node will be moved to the Unknown state, and the health for that node will be changed to Provisioning Failed.