Best Practices for Applying Software Updates in Windows HPC Server 2008

Applies To: Windows HPC Server 2008

The compute nodes in your cluster can be regularly updated (patched) with Microsoft software updates that are available from the Microsoft Update Web site or from a Windows Server Update Services (WSUS) server in your enterprise. The following sections describe the main scenarios for regularly applying Microsoft software updates to the compute nodes, along with advantages, limitations, links to procedures, and best practices:

  • Scenario 1: Update compute nodes using an enterprise WSUS server

  • Scenario 2: Update compute nodes using node templates

  • Considerations for updating the head node

  • Additional scenarios to apply updates

Note

You can update compute nodes that are already imaged without reimaging them.

Note

Scenarios for updating broker nodes are generally similar to the scenarios for updating compute nodes.

Scenario 1: Update compute nodes using an enterprise WSUS server

If your environment already manages updates by using WSUS, and the compute nodes have network connections to the enterprise network or to another WSUS server, you can use WSUS to fully manage the distribution of Microsoft software updates to the compute nodes. WSUS settings and the Group Policy settings configured in the Active Directory domain determine both which Microsoft software updates are applied to the compute nodes and when they are applied. This scenario uses the existing WSUS infrastructure but decouples update management from the automatic maintenance tasks that are performed on compute nodes.

Note

You can also use a WSUS server in the enterprise to discover, filter, or store the updates that the compute nodes require, and use a node template to control when the updates are applied on the compute nodes. This approach uses certain features of WSUS but lets the cluster administrator ensure that updates do not interrupt running jobs. For more information, see Scenario 2: Update compute nodes using node templates.

When is this scenario appropriate?

You may want to use an existing WSUS server to fully manage updates to the compute nodes in one or more of the following cases:

  • You have an existing enterprise WSUS server, Group Policy objects, and organizational policies that enforce automatic updates to computers in the Active Directory domain. This may be more common in larger enterprises.

  • The network administrator who manages the WSUS server and Active Directory infrastructure can configure appropriate update settings for the compute nodes.

  • You can schedule maintenance windows on the compute nodes to apply updates without interfering with scheduled jobs.

  • The compute nodes connect to the enterprise network (topology 2, 4, or 5).

Prerequisites

  • WSUS must be deployed on a server in the network environment that connects to the compute nodes.

  • Group Policy objects must be configured with Windows Update settings and linked to the domain or to the appropriate organizational unit (OU) or other Active Directory container for the compute nodes. For more information, see Configure Clients Using Group Policy (https://go.microsoft.com/fwlink/?LinkId=191726).

Advantages

  • Updates to the compute nodes are fully managed using the existing WSUS and Active Directory infrastructure.

  • Any update that is managed in WSUS, including an update with software license terms (also known as an End User License Agreement or EULA), can be approved and deployed to the compute nodes.

  • WSUS provides robust tools for managing, monitoring, and reporting updates.

  • In HPC Cluster Manager, you can run the Pending Software Updates diagnostic and the Installed Software Updates Report for additional reporting on cluster updates.

  • You can flexibly configure WSUS to deploy updates to the compute nodes.

Limitations

  • Unless you use a node template to control when updates are applied on the compute nodes, the updates are not automatically coordinated with jobs that are running. If a job is running on a compute node and the node must restart to apply an update, the job is interrupted. For more information about using node templates to control when updates are applied, see Scenario 2: Update compute nodes using node templates.

  • The cluster updates are not administered by the cluster administrator, and the update settings are not visible in HPC Cluster Manager.

  • If the compute nodes do not connect directly to the enterprise network, but need to download updates from a WSUS server on the enterprise network, the head node may experience decreased performance when multiple compute nodes are updated. For example, if NAT is enabled on the head node, all network traffic to download updates is routed through the head node.

Steps

For steps to deploy, configure, and operate a WSUS server, see the WSUS documentation.

Note

In Windows Server 2008 SP2 and Windows Server 2008 R2, you can install the Windows Server Update Services Role Service by using Server Manager.

Best practices

  • The WSUS server role should not be deployed on the head node. Deploying the WSUS server role on the head node can adversely impact the performance of the head node, especially in larger clusters.

  • If you use WSUS to fully manage the distribution of updates, you do not need to include an Apply Updates maintenance task in the node template for the compute nodes.

  • Consult the WSUS documentation for additional best practices. For more information, see Best Practices with Windows Server Update Services 3.0 SP2 (https://go.microsoft.com/fwlink/?LinkId=191727).

Scenario 2: Update compute nodes using node templates

Windows HPC Server 2008 integrates with Microsoft Update services so that you can manage updates to the cluster from HPC Cluster Manager without interrupting your jobs. To do this, you configure node templates that apply updates to the compute nodes on your cluster.

The Maintenance phase of a node template can include an Apply Updates task, whose settings specify which updates to apply. To run maintenance tasks on compute nodes, you take the nodes offline and run a Maintain action. If the node template for the nodes includes the Apply Updates task, this task downloads updates to the compute nodes from the Microsoft Update Web site or the WSUS server in your enterprise, and then installs the updates. After you bring the nodes back online, they can again run jobs and accept new jobs.

Note

You should use only the Apply Updates task in the node template to update the nodes. Do not use a Post Install Command maintenance task to run a command to install an update.

When is this scenario appropriate?

Using a node template to apply updates to the compute nodes is appropriate in one or more of the following cases:

  • It is not practical to manage updates to the compute nodes by using an existing enterprise WSUS server, or WSUS is not used in the enterprise.

  • A WSUS server is used, but it is necessary to further control either the updates that are applied to the compute nodes or the timing of the updates.

  • The compute nodes are isolated on a private network.

Prerequisites

  • If a node template will apply updates from the Microsoft Update site over the Internet, automatic updating (a Windows Update setting) must be turned off on all of the nodes. By default, automatic updating is turned off when you deploy compute nodes.

  • If a node template will apply updates from a WSUS server, the WSUS server must be deployed and configured with appropriate Group Policy settings to download and install updates to the computers in the domain or to appropriately defined groups of computers. In the Group Policy settings for Windows Update on the compute nodes, enable the Configure Automatic Updates setting, and select the Notify for download and notify for install option.
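
The WSUS prerequisite above can be checked with a short script. The following Python sketch assumes the policy values have already been collected from a node into a dictionary (for example, by reading the registry key HKLM\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate and its AU subkey). The function name, the dictionary layout, and the sample server URL are illustrative assumptions, not part of Windows HPC Server 2008.

```python
# Sketch: validate Windows Update policy values for the WSUS-backed node template scenario.
# AUOptions, NoAutoUpdate, UseWUServer, and WUServer are the registry values written by the
# Windows Update Group Policy settings. How the dictionary is collected is left to the caller.

NOTIFY_DOWNLOAD_AND_INSTALL = 2  # "Notify for download and notify for install"


def check_update_policy(policy):
    """Return a list of problems; an empty list means the policy matches the prerequisite."""
    problems = []
    if policy.get("NoAutoUpdate", 0) != 0:
        problems.append("Configure Automatic Updates is not enabled (NoAutoUpdate != 0)")
    if policy.get("AUOptions") != NOTIFY_DOWNLOAD_AND_INSTALL:
        problems.append("AUOptions should be 2 (notify for download and notify for install)")
    if policy.get("UseWUServer") != 1:
        problems.append("UseWUServer should be 1 so that the node uses the WSUS server")
    if not policy.get("WUServer"):
        problems.append("WUServer (the WSUS server URL) is not set")
    return problems


if __name__ == "__main__":
    # Hypothetical values for one compute node.
    sample = {
        "NoAutoUpdate": 0,
        "AUOptions": 2,
        "UseWUServer": 1,
        "WUServer": "http://wsus.example.local",
    }
    print(check_update_policy(sample) or "policy matches the prerequisite")
```

A check like this could be run against each compute node before the first Maintain action, so that a node with a different Automatic Updates option is found before it installs updates on its own schedule.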

Advantages

  • Applying updates does not interrupt your cluster jobs. When you take nodes offline to run maintenance tasks, the nodes can first complete currently running jobs.

  • Updates to the compute nodes can be managed, monitored, and verified entirely through HPC Cluster Manager, by the cluster administrator.

  • The following diagnostics are available to report on the status of Microsoft updates on the nodes: Pending Software Updates, Software Updates Required, and Installed Software Updates Report.

  • A compute node can be automatically updated as a step when it is provisioned (if the node template already contains the Apply Updates task).

Limitations

  • The Apply Updates task in the node template provides only coarse filtering options for the updates that are applied to the nodes. However, if a WSUS server is used, WSUS can be configured to filter the updates that are applied more precisely.

  • The Apply Updates task in the node template cannot install updates that include software license terms (also known as an End User License Agreement or EULA). This type of update requires the administrator of each node to accept the software license terms before the update can be installed. In this scenario, you have to install updates that include software license terms manually.

  • Depending on the HPC cluster size and network topology, the head node may experience decreased performance when multiple compute nodes attempt to download updates from Microsoft Update or from a WSUS server. For example, if NAT is enabled on the head node, all network traffic to download updates is routed through the head node.
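
To get a rough sense of the head node load described in the last limitation, you can estimate how much update traffic passes through a NAT-enabled head node. The following Python sketch is back-of-the-envelope arithmetic only. The node count, update size, and link speed are hypothetical inputs, and the estimate ignores protocol overhead, BITS throttling, and peer caching.

```python
def nat_transfer_estimate(node_count, update_size_mb, link_mbps):
    """Estimate total traffic (MB) and minimum transfer time (s) through a NAT head node."""
    # Every compute node downloads its own copy, so traffic scales with the node count.
    total_mb = node_count * update_size_mb
    # A link of link_mbps megabits per second carries link_mbps / 8 megabytes per second.
    seconds = total_mb / (link_mbps / 8)
    return total_mb, seconds


if __name__ == "__main__":
    # Hypothetical cluster: 64 compute nodes, 200 MB of updates each, 1 Gbps head node link.
    total_mb, seconds = nat_transfer_estimate(node_count=64, update_size_mb=200, link_mbps=1000)
    print(f"{total_mb} MB through the head node, at least {seconds:.0f} s at line rate")
```

Because the total grows linearly with the number of nodes, updating the nodes in smaller groups or storing the updates on a WSUS server in the private network reduces the traffic that the head node has to route at one time.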

Steps

See Steps for Updating Compute Nodes Using Node Templates in Windows HPC Server 2008 in this guide.

Best practices

  • The cluster administrator should regularly schedule maintenance on the compute nodes to ensure that critical and security updates are applied in a timely way.

  • The head node should be updated before you update the compute nodes. For more information, see Considerations for updating the head node.

  • You can improve the way the compute nodes use network bandwidth to download updates. For example, you can configure bandwidth limitations for the Background Intelligent Transfer Service (BITS), which is the service on each computer that downloads Microsoft updates. You can also enable peer caching to minimize downloads of updates and to maximize the sharing of downloads among the compute nodes in the cluster subnet. For more information, see Configuring BITS 2.0 and 3.0 for Download Performance (https://go.microsoft.com/fwlink/?LinkId=191728).

  • It is not required to use a WSUS server in this scenario. However, in clusters with more than 30 compute nodes, the WSUS capabilities to discover, filter, and store the updates can help streamline the cluster updates.

  • If you use a WSUS server in this scenario, note the following:

    • You should manage update classifications only in WSUS, and you should not filter the updates in the settings of the Apply Updates task. To turn off filtering in the task, set Categories to All in the properties of the Apply Updates task.

    • The WSUS server role should not be deployed on the head node. Deploying the WSUS server role on the head node can adversely impact the performance of the head node, especially in larger clusters.

    • If the head node is configured to perform network address translation (NAT) for the compute nodes in the private network, the WSUS server role should be deployed on an infrastructure server in the private network. The WSUS server must connect to the Internet or to an upstream enterprise WSUS server. For more information, see Choose a Type of WSUS Deployment (https://go.microsoft.com/fwlink/?LinkId=191729).

Considerations for updating the head node

You should plan and apply updates to the head node by following your organization's practices for updating other infrastructure servers, such as a server running Microsoft Exchange Server or Microsoft SQL Server®.

As a best practice, you should update the head node before you update the compute nodes, and you should plan to update the head node during a time when you schedule other maintenance tasks for the cluster. Do not update the head node during a critical time or while a long-running job is still running.

Note

If the head node has a compute node role or a broker node role, you should plan to update the rest of the compute nodes and broker nodes in the cluster as soon as possible after you update the head node.

Before you update the head node, the following steps are recommended:

  1. Close all instances of HPC Cluster Manager.

  2. Stop the following services that are running on the head node: HPC Job Scheduler Service, HPC Management Service, HPC MPI Service, HPC Node Manager Service, and HPC SDM Store Service.
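
Step 2 can be scripted. The following Python sketch builds one net stop command per service, using the display names exactly as listed in step 2 (net stop accepts a display name). The helper names, the stop order, and the dry-run default are assumptions made for illustration. Confirm the service names and any dependency order on your own head node before you run the commands.

```python
import subprocess
import sys

# Display names as listed in step 2. The order is the order of the list above;
# the source does not specify a required stop order.
HEAD_NODE_SERVICES = [
    "HPC Job Scheduler Service",
    "HPC Management Service",
    "HPC MPI Service",
    "HPC Node Manager Service",
    "HPC SDM Store Service",
]


def build_stop_commands(services=HEAD_NODE_SERVICES):
    """Return one 'net stop' argument list per service, in the order given."""
    # /y answers the confirmation prompt if dependent services must also be stopped.
    return [["net", "stop", name, "/y"] for name in services]


def stop_services(dry_run=True):
    """Print the commands, or run them when dry_run is False on a Windows head node."""
    for cmd in build_stop_commands():
        if dry_run or sys.platform != "win32":
            print("would run:", subprocess.list2cmdline(cmd))
        else:
            result = subprocess.run(cmd)
            print(subprocess.list2cmdline(cmd), "-> exit code", result.returncode)


if __name__ == "__main__":
    stop_services(dry_run=True)
```

The dry-run default prints the commands without stopping anything, which makes it possible to review the list before a maintenance window. A nonzero exit code from net stop can simply mean that the service was already stopped.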

If you have set up a head node in a failover cluster, you should apply updates to both nodes of the failover cluster.

Additional scenarios to apply updates

A number of additional capabilities exist in Windows HPC Server 2008 to apply either Microsoft or non-Microsoft software updates to the compute nodes in your cluster network. Depending on the type of updates, the size of your cluster, administrator resources, and your network topology, one or more of the following scenarios may be considered:

  • Manually update individual compute nodes

    This may be appropriate in a small, isolated cluster where the cluster administrator can connect to each compute node to download and install the updates.

  • Use the clusrun command to run an executable file on the compute nodes that installs an update

    For more information, see Running clusrun Commands (https://go.microsoft.com/fwlink/?LinkId=191755).

    Warning

    You cannot use clusrun to install a service pack for Windows HPC Server 2008 because such an update stops the services for clusrun itself.

  • Update an offline Windows image using the Deployment Image Servicing and Management (DISM) command line tool

    You can use DISM commands to install, remove, or update Windows packages in an existing operating system image that is then used for deployment. This scenario may be useful when you choose to reimage your compute nodes but want to make sure a defined set of updates is installed. For more information about using DISM, see Deployment Image Servicing and Management Technical Reference (https://go.microsoft.com/fwlink/?LinkId=143441).

  • Capture a “master” compute node image that is then deployed to the rest of the compute nodes using a customized node template

    This scenario may be useful when you need to install multiple applications, drivers, and updates on the compute nodes. One of the compute nodes is updated to become a master node. An image of the node is captured by using sysprep.exe and imagex.exe. The image can be deployed to the compute nodes by using a node template that includes the captured image. For more information, see How to Capture a “Master” Compute Node Image Using Node Templates (https://go.microsoft.com/fwlink/?LinkId=191730).
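
Two of the scenarios above, running an installer with clusrun and servicing an offline image with DISM, come down to building command lines. The following Python sketch builds those command lines as argument lists so that they can be reviewed before use. All file paths, node names, and function names are hypothetical. The sketch assumes that the update is a .msu or .cab package, that wusa.exe is used to install a .msu package on each node, and that the package path is reachable from every node.

```python
import subprocess


def dism_offline_update_commands(wim_file, index, mount_dir, package_paths):
    """Return DISM argument lists: mount the image, add each package, commit and unmount."""
    cmds = [[
        "dism", "/Mount-Wim",
        f"/WimFile:{wim_file}", f"/Index:{index}", f"/MountDir:{mount_dir}",
    ]]
    for pkg in package_paths:
        cmds.append(["dism", f"/Image:{mount_dir}", "/Add-Package", f"/PackagePath:{pkg}"])
    # /Commit saves the changes into the image; without it the changes are lost.
    cmds.append(["dism", "/Unmount-Wim", f"/MountDir:{mount_dir}", "/Commit"])
    return cmds


def clusrun_install_command(node_names, msu_path):
    """Return a clusrun argument list that runs wusa.exe quietly on the named nodes."""
    nodes = ",".join(node_names)
    return ["clusrun", f"/nodes:{nodes}", "wusa.exe", msu_path, "/quiet", "/norestart"]


if __name__ == "__main__":
    # Hypothetical paths and node names, printed for review rather than executed.
    for cmd in dism_offline_update_commands(
        r"C:\images\compute.wim", 1, r"C:\mount", [r"C:\updates\example.msu"]
    ):
        print(subprocess.list2cmdline(cmd))
    print(subprocess.list2cmdline(
        clusrun_install_command(["NODE01", "NODE02"], r"\\headnode\updates\example.msu")
    ))
```

Keeping the commands as lists avoids quoting mistakes when a path contains spaces. As the warning above states, the clusrun approach does not apply to a service pack for Windows HPC Server 2008.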