What's New in Windows HPC Server 2008 R2 Service Pack 2

Updated: August 2011

Applies To: Windows HPC Server 2008 R2

This document lists the new features and changes that are available in Service Pack 2 (SP2) for Microsoft® HPC Pack 2008 R2. For information about downloading and installing SP2, see the Release Notes for Microsoft HPC Pack 2008 R2 Service Pack 2.

In this topic:

  • Windows Azure integration

  • Job scheduling

  • Cluster management

  • Runtime and development

  • Additional resources

Windows Azure integration

The following features are new for integration with Windows Azure:

Note
Azure Virtual Machine roles and Azure Connect are pre-release features of Windows Azure. To use these features with Windows HPC, you must join the Azure Beta program. To apply for participation in the beta, log on to the Windows Azure Platform Portal, click Home, and then click Beta Programs. Acceptance in the program might take several days.
  • Add Azure VM Roles to the cluster. While SP1 introduced the ability to add Azure Worker nodes to the cluster, SP2 introduces the ability to add Azure Virtual Machine nodes. Azure VM nodes support a wider range of applications and runtimes than Azure Worker nodes do. For example, applications that require a long-running or complicated installation, are large, have many dependencies, or require manual interaction during installation might not be suitable for worker nodes. With Azure VM nodes, you can build a VHD that includes an operating system and installed applications, save the VHD to the cloud, and then use the VHD to deploy Azure VM nodes to the cluster.

  • Run MPI jobs on Azure Nodes. SP2 includes support for running MPI jobs on Azure Nodes. This gives you the ability to provision computing resources on demand for MPI jobs. The MPI features are installed on both worker and virtual machine Azure Nodes. A brief job submission sketch appears after this list.

  • Run Excel workbook offloading jobs on Azure Nodes. While SP1 introduced the ability to run UDF offloading jobs on Azure Nodes, SP2 introduces the ability to run Excel workbook offloading jobs on Azure Nodes. This enables you to provision computing resources on demand for Excel jobs. The HPC Services for Excel features for UDF and workbook offloading are included in Azure Nodes that you deploy as virtual machine nodes. Workbook offloading is not supported on Azure Nodes that you deploy as worker nodes.

  • Automatically run configuration scripts on new Azure Nodes. In SP2, you can create a script that includes configuration commands that you want to run on new Azure Node instances. For example, you can include commands to create firewall exceptions for applications, set environment variables, create shared folders, or run installers. You upload the script to Azure Storage, and specify the name of the script in the Azure Node template. The script runs automatically as part of the provisioning process, both when you deploy a set of Azure Nodes and when a node is reprovisioned automatically by the Windows Azure system. If you want to configure a subset of the nodes in a deployment, you can create a custom node group to define the subset, and then use the %HPC_NODE_GROUPS% environment variable in your script to check for inclusion in the group before running the command. For more information, see Appendix 2: Configure a Startup Script for Azure Nodes. A brief script sketch appears after this list.

  • Connect to Azure Nodes with Remote Desktop. In SP2, you can use Remote Desktop to help monitor and manage Azure Nodes that are added to the HPC cluster. As with on-premises nodes, you can select one or more nodes in HPC Cluster Manager and then click Remote Desktop in the Actions pane to initiate a connection with the nodes. This action is available by default with Azure VM roles, and can be enabled for Azure Worker Roles if Remote Access Credentials are supplied in the node template.

  • Enable Azure Connect on Azure Nodes. In SP2, you can enable Azure Connect on your Azure Nodes. Azure Connect provides connectivity between Azure Nodes and on-premises endpoints that have the Azure Connect agent installed. This can help provide access from Azure Nodes to on-premises UNC file shares and license servers.

  • New diagnostic tests for Azure Nodes. SP2 includes three new diagnostic tests in the Windows Azure test suite. The Windows Azure Firewall Ports Test verifies connectivity between the head node and Windows Azure. You can run this test before deploying Azure Nodes to ensure that any existing firewall is configured to allow deployment, scheduler, and broker communication between the head node and Windows Azure. The Windows Azure Services Connection Test verifies that the services running on the head node can connect to Windows Azure by using the subscription information and certificates that are specified in an Azure Node template. The Template test parameter lets you specify which node template to test. The Windows Azure MPI Communication Test runs a simple MPI ping-pong test between pairs of Azure Nodes to verify that MPI communication is working.
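
The following is a minimal sketch of a startup script for Azure Nodes, shown here as PowerShell and assuming that your uploaded script invokes powershell.exe. It also assumes that %HPC_NODE_GROUPS% contains a comma-separated list of group names; the node group (LicensedAppNodes), application, and folder names are placeholders.

    # Run a command only on nodes that belong to a custom node group.
    # Assumption: HPC_NODE_GROUPS holds a comma-separated list of group names.
    $groups = ($env:HPC_NODE_GROUPS -split ',') | ForEach-Object { $_.Trim() }
    if ($groups -contains 'LicensedAppNodes') {
        # Create a firewall exception for a placeholder application.
        netsh advfirewall firewall add rule name="MyApp" dir=in action=allow program="C:\apps\MyApp.exe"
    }

    # Commands outside the group check run on every node instance in the deployment.
    New-Item -ItemType Directory -Path 'D:\scratch' -Force | Out-Null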
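
To illustrate the MPI support on Azure Nodes that is described earlier in this list, the following HPC PowerShell sketch submits a simple MPI job to the AzureNodes node group. The application name is a placeholder, and the application is assumed to already be available on the nodes (for example, through a startup script or the VHD for Azure VM nodes).

    # Minimal sketch: submit an MPI job that runs only on Azure Nodes.
    $job = New-HpcJob -Name "MPI on Azure" -NodeGroups "AzureNodes"
    Add-HpcTask -Job $job -CommandLine "mpiexec MyMpiApp.exe"
    Submit-HpcJob -Job $job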

Job scheduling

The following features are new in job scheduling:

  • Guarantee availability of computing resources for different user groups. In SP2, you can configure the HPC Job Scheduler Service to allocate resources based on Resource Pools. Resource Pools help you define the proportion of cluster cores that is guaranteed for specific user groups (or job types). If a user group is not using all of its guaranteed cores, those cores can be used by other groups. You must use job templates to associate a user group with a Resource Pool. Jobs that use the job template will collectively be guaranteed the proportion of cluster cores that is defined for the Resource Pool, and will be scheduled within the pool according to job priority, submit time, and scheduling mode (Queued or Balanced). Resource Pool scheduling works best on clusters with homogeneous resources. You can compare actual and guaranteed allocations for each resource pool with the Pool Usage report in Charts and Diagnostics.

  • Enable or require users to log on using soft card authentication when submitting jobs to the cluster. In SP2, you can enable soft card authentication on the cluster, which allows smart card users to run jobs. To set this up, you must work with your certificate authority (CA) or PKI administrator to choose or create a certificate template that must be used when generating a soft card for the cluster. The certificate template must allow the private key to be exported, and can also have an associated access control list that defines who can use the certificate template. You can then specify the name of the template in the HpcSoftCardTemplate cluster property (set cluster properties by using cluscfg setparams or Set-HpcClusterProperty). When users want to access the cluster, they can generate a soft card credential that is based on this template by running hpccred createcert or New-HpcSoftCard. The HpcSoftCard cluster property is set to Disabled by default. If you want users to always use soft card authentication, set the property to Required. If you want users to choose between password and soft card logon, set the property to Allowed. A brief configuration sketch appears after this list.

  • Submit jobs to the cluster from a web portal. In SP2, a cluster administrator can install the HPC Web Services Suite to set up a web portal that enables cluster users to submit and monitor jobs without installing the HPC Pack client utilities. A cluster administrator can create and customize Job Submission Pages in the portal that are based on existing job templates. Additionally, administrators can provide default values for application-specific command lines and parameters. Application command information can be defined and saved as an Application Profile and can then be associated with one or more job submission pages. When you launch the portal, it automatically includes one submission page that is based on the Default job template.

  • Use an HTTP web service to submit jobs across platforms or across domains. SP2 provides access to the HPC Job Scheduler Service through an HTTP web service that is based on the representational state transfer (REST) model. With a suitable client, users can define, submit, modify, list, view, requeue, and cancel jobs from other programming languages and operating systems. The full range of job description options is available through this service, including defining task dependencies. The service is included in the HPC Pack web features and can be installed by using HpcWebFeatures.msi. An example client is included in the SDK code samples for SP2.

  • Specify different submission or activation filters for different types of jobs. In SP2, you can add multiple custom filters to your cluster and use job templates to define which filters should run for a particular type of job. For example, you can ensure that an activation filter that checks for license availability only runs on jobs that require a license. This type of job-specific filter must be defined as a DLL (and will run in the same process as the HPC Job Scheduler Service), rather than as an executable like the cluster-wide filters (which run in a separate process). When a job is submitted or ready for activation, any job-specific filters will run before the cluster-wide filter.

  • Over-subscribe or under-subscribe core or socket counts on cluster nodes. In SP2, cluster administrators can fine-tune cluster performance by controlling how many HPC tasks should run on a particular node. Over-subscription provides the ability to schedule more processes on a node than there are physical cores or sockets. Normally, if a node has eight cores, then eight processes could potentially run on that node. With over-subscription, you can set the SubscribedCores node property to a higher number, for example 16, and the HPC Job Scheduler Service could potentially start 16 processes on that node. This can be useful, for example, if part of the cluster workload consists of coordinator tasks that use very few compute cycles. Conversely, under-subscription provides the ability to schedule fewer tasks on a node than there are physical cores or sockets. This can be useful if you only want to use a subset of the cores or sockets on a particular node for cluster jobs. A brief example appears after this list.

  • Give more resources to higher priority jobs by pre-empting lower priority jobs. SP2 includes a new job scheduler configuration option to enable a “Grow by pre-emption” policy. When this policy is enabled, the HPC Job Scheduler uses pre-emption to increase the allocated resources (“grow”) of a higher priority job towards its maximum. By default, pre-emption only occurs to start a job with its minimum requested resources (“graceful pre-emption” option enabled), and the job increases towards its maximum resources as other jobs complete (“increase resources automatically (grow)” option enabled). Enabling “Grow by pre-emption” helps ensure that high priority work can complete more quickly.

  • Configure the type of signal that is used to cancel a running task. SP2 includes a new environment variable, CCP_TASK_NOTIFY, that job owners can set at the job or task level to configure the type of signal that should be used to cancel their running tasks. By default, the HPC Node Manager Service stops a task by sending a CTRL_BREAK event to the application. A job owner can set the CCP_TASK_NOTIFY variable for a job or task if the application must be stopped with a CTRL_C event instead. If the application processes the CTRL_BREAK or CTRL_C event, then it can make use of the task cancel grace period to exit gracefully. If the application does not process the event, the task exits immediately.
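
The following HPC PowerShell sketch outlines the soft card configuration described in this list. The certificate template name is a placeholder, and the parameter forms are an assumption based on the property and cmdlet names above; cluscfg setparams can be used instead of Set-HpcClusterProperty.

    # Administrator: specify the certificate template, then allow soft card logon.
    Set-HpcClusterProperty -HpcSoftCardTemplate "HpcSoftCardCertTemplate"
    Set-HpcClusterProperty -HpcSoftCard Allowed    # or Required to enforce soft card logon

    # Cluster user: generate a soft card credential that is based on that template.
    New-HpcSoftCard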
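
As a sketch of the over-subscription scenario described in this list, the following HPC PowerShell commands set the SubscribedCores property on a node that has eight physical cores so that up to 16 processes can be scheduled on it. The node name is a placeholder, and the parameter form is an assumption based on the property name.

    # Take the node offline before changing subscription settings (a cautious choice),
    # raise the subscribed core count, and then bring the node back online.
    Set-HpcNodeState -Name "COMPUTENODE01" -State Offline
    Set-HpcNode -Name "COMPUTENODE01" -SubscribedCores 16
    Set-HpcNodeState -Name "COMPUTENODE01" -State Online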

Cluster management

The following features are new in cluster management:

  • Add workstation nodes that are in a separate domain. SP2 supports adding workstation nodes to your cluster that belong to a different domain than the head node. To join nodes from a different domain, you must specify the fully qualified domain name (FQDN) of the head node when installing HPC Pack on the workstations.

  • Automatically stop jobs on workstation nodes if the CPU becomes busy with non-HPC work. Administrators can configure workstation nodes to become available based on user activity detection. Workstations can automatically become available for jobs (come Online) if a specified time period has elapsed without keyboard or mouse input and if the CPU usage drops below a specified threshold. In SP1, HPC jobs are automatically stopped when keyboard or mouse input is detected. In SP2, HPC jobs are also stopped when the CPU usage for non-HPC work rises above the specified threshold. This helps ensure that if workstation users initiate or schedule work on their computer before leaving for the night, the HPC jobs will not interfere.

  • Validate environment configurations before creating a new cluster. SP2 provides a stand-alone tool, the Microsoft HPC Pack 2008 R2 Installation Preparation Wizard, that helps you check for operating system and environment configurations that can cause issues when creating a new cluster. You can run the wizard on the server that will act as the head node (before you install HPC Pack), or on another computer that is connected to the enterprise network. In the tool, you answer questions about your intended configurations. The tool performs checks based on your answers, and then generates a report that lists results, installation warnings, best practices, and checklists. The preinstallation wizard is available on the HPC tool pack download page.

  • Export and import cluster configurations as part of a disaster recovery plan. SP2 includes utilities that help export and import cluster configurations such as HPC user and administrator groups, node groups, node templates, job templates, job scheduler configuration settings, SOA service configuration files, and custom diagnostic tests. Export-HpcConfiguration and Import-HpcConfiguration are implemented as .ps1 scripts (located in the %CCP_HOME%bin folder). You can import the saved settings onto a new cluster that is running the same version of HPC Pack. To continue submitting jobs on the new cluster, users only need to change the name of the cluster in their applications or in the HPC client utilities. To export cluster configurations to a folder named C:\HpcConfig, run HPC PowerShell as an administrator and type Export-HpcConfiguration -Path C:\HpcConfig. A brief export and import sketch appears after this list.

  • Start the HPC Job Scheduler Service in Restore Mode by using an HPC PowerShell cmdlet. SP2 includes a new cluster property called RestoreMode that you can set when you need to start the HPC Job Scheduler Service in restore mode. Previously, you could enable restore mode only by setting a registry key; now you can run HPC PowerShell as an administrator and use Set-HpcClusterProperty -RestoreMode:$true. When restore operations are complete, the property is automatically set back to False. The HPC Job Scheduler Service restore mode helps bring the cluster to a consistent state when you are performing a full-system or database restore. For more information, see Steps to Perform Before and After Restoring the HPC Databases from a Backup.
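
As a sketch of the export and import workflow described above: the folder paths are placeholders, both commands are run from HPC PowerShell as an administrator, and the Import-HpcConfiguration parameter is assumed to mirror Export-HpcConfiguration.

    # On the existing head node: export the cluster configurations to a folder.
    Export-HpcConfiguration -Path C:\HpcConfig

    # On the new head node, running the same version of HPC Pack:
    # import the saved configurations from a copy of that folder.
    Import-HpcConfiguration -Path \\FileServer\Backups\HpcConfig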

Runtime and development

The following features are new for runtime and development:

  • Common-data APIs for SOA workloads. SP2 includes new APIs that support staging and accessing common data that is required by all calculation requests within one or more sessions. You can create a new type of client called a DataClient. The data client includes methods to upload data to the cluster (to the runtime user data shared folder) and to read and write data. If you want the data to be available to other cluster users, you can specify the list of users when you call DataClient.Create(). Optionally, you can associate the data with the session lifecycle, so that when the session ends, the data is automatically deleted from the share. Code samples are available in the SDK code sample download. Common data features are not supported on Azure Nodes.

  • Runtime user data share created automatically to support SOA common data jobs. When you install SP2, the installation wizard includes a step to configure a shared folder for runtime user data. This share is used by the SOA common data runtime. For a production cluster, you can create a shared folder for the runtime data on a separate file server and then specify the path to that share in the SP2 installation wizard. If you are evaluating the common data features in a test cluster, or if you are setting up a small cluster, you can accept the default runtime data configuration during setup. The default configuration creates a hidden share on the head node to provide out-of-the-box functionality for the common data workloads.

  • In-process broker APIs available to help reduce communication overhead for SOA sessions. The SP2 APIs include an option to enable an in-process broker. The in-process broker runs in the client process, and thereby eliminates the need for a broker node, reduces session creation time, and reduces the number of hops for each message. For example, one usage pattern for the in-process broker is as follows: Instead of running the client application on a client computer, you submit the client application to the cluster as a single-task job. The client application creates a session on the cluster, and instead of passing messages through a broker node, the client sends requests and receives responses directly from the service hosts (compute nodes). Code samples are available in the SDK code sample download. The in-process broker supports interactive sessions only, and is not supported on Azure Nodes.

Additional resources