Deploying Applications to Windows Azure Nodes in a Windows HPC Cluster

Applies To: Microsoft HPC Pack 2008 R2, Microsoft HPC Pack 2012, Microsoft HPC Pack 2012 R2, Windows HPC Server 2008 R2

Starting with HPC Pack 2008 R2 with Service Pack 1 (SP1), HPC Pack includes built-in utilities and mechanisms that help cluster administrators deploy applications (such as executable files, SOA services, XLL files, and cluster-enabled Microsoft Excel workbooks) to Windows Azure nodes that are joined to an on-premises cluster in a Windows Azure “burst” scenario. By default, Windows Azure nodes cannot access on-premises resources and shared folders directly, so the methods that you use to deploy applications on Windows Azure nodes differ from the methods that you use to deploy applications on-premises. Additionally, Windows Azure nodes added to a Windows HPC cluster are deployed and reprovisioned dynamically, so the recommended methods for application deployment help ensure that applications are automatically available on new Windows Azure node instances.

In this topic:

  • Considerations for HPC application workloads for Windows Azure burst

  • Selecting a method to deploy applications and data to Windows Azure nodes

  • Using hpcpack and hpcsync

    • How to stage application files to Windows Azure storage

    • How to deploy staged packages to Windows Azure nodes

  • Accessing files from Windows Azure nodes

  • Additional references

Considerations for HPC application workloads for Windows Azure burst

Before deploying applications to Windows Azure nodes, evaluate whether your existing or planned HPC workloads will run or scale efficiently in Windows Azure. The detailed migration, development, and design considerations for HPC applications that run in a Windows Azure burst scenario are beyond the scope of this topic. In addition, the capabilities of the Windows Azure platform are continuously evolving. However, workloads that currently burst successfully to Windows Azure with HPC Pack, particularly in large-scale deployments of Windows Azure nodes, share the following characteristics:

  • Highly distributed, single-node computations   These include many parametric sweep jobs and certain service-oriented architecture (SOA) jobs. Other job types, including Message Passing Interface (MPI) jobs, can run in a Windows Azure burst configuration. However, the current Windows Azure infrastructure is not designed for jobs that require high-bandwidth, low-latency communication between Windows Azure nodes, or that use task dependencies between nodes. For general information about HPC job types, see Understanding Parallel Computing Jobs. For detailed considerations for running MPI jobs, see Guidelines for Running MPI Applications in Windows Azure.

  • Computation time exceeds data movement time   Certain HPC workloads require uploading large amounts of data to Windows Azure for computation, or return large amounts of processed data. Ensure that data movement is not a bottleneck in your HPC workflow.

  • File-based data access   Existing HPC applications that access data files on-premises can be readily migrated to run in Windows Azure by accessing data files that are uploaded to Windows Azure blob storage. New HPC applications can be developed to access the variety of storage types in Windows Azure. However, depending on the sensitivity of the data, legal requirements, cost considerations, and other factors, it may not be possible to store application data in Windows Azure.

  • “Bursty” workload pattern   The Windows Azure burst scenario is ideal for resource-intensive workloads that are not easily completed by using the fixed resources of an on-premises cluster. The workloads can include irregular computation spikes, regularly scheduled jobs, or one-off jobs.
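As a rough illustration of the "computation time exceeds data movement time" characteristic, the following Python sketch compares an idealized transfer time with the compute time of a job. The function names and all of the numbers are hypothetical, and the estimate ignores protocol overhead, throttling, and parallel transfers, so treat it as a back-of-the-envelope check rather than a sizing tool.

```python
def transfer_seconds(size_bytes, link_bits_per_second):
    """Idealized time to move size_bytes over a link, ignoring all overhead."""
    return size_bytes * 8 / link_bits_per_second


def movement_to_compute_ratio(upload_bytes, download_bytes,
                              link_bits_per_second, compute_seconds):
    """Ratio of total data movement time to compute time.

    A value near or above 1 suggests that data movement, not computation,
    dominates the workflow.
    """
    moved = transfer_seconds(upload_bytes + download_bytes, link_bits_per_second)
    return moved / compute_seconds


# Hypothetical job: 100 GB uploaded, 10 GB returned,
# a 100 Mb/s link, and 10 hours of compute time.
ratio = movement_to_compute_ratio(
    upload_bytes=100 * 10**9,
    download_bytes=10 * 10**9,
    link_bits_per_second=100 * 10**6,
    compute_seconds=10 * 3600,
)
print(f"data movement / compute = {ratio:.2f}")
```

With these assumed values, data movement takes roughly a quarter of the compute time, which would usually be acceptable; a ratio close to 1 or higher is a signal to reduce or pre-stage the data.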

For more information about running applications on Windows Azure nodes, see Guidelines for Running HPC Applications on Windows Azure Nodes.

Selecting a method to deploy applications and data to Windows Azure nodes

The method that you use to deploy applications (and, in some cases, data) to Windows Azure nodes depends on what is required to install your application, as outlined in the following table.

Installation requirements: Installation can be accomplished by copying files and any dependencies such as DLLs or configuration files to the node.

Method:

  • Administrator packages and stages files to Windows Azure storage by using the hpcpack commands.

    See Using hpcpack and hpcsync.

  • Administrator deploys Windows Azure nodes. Packages are automatically deployed (copied and extracted) to new Windows Azure node instances, or administrator can manually deploy if nodes are already running.

    See Burst to Windows Azure with Microsoft HPC Pack.

Availability: HPC Pack 2008 R2 with SP1

Installation requirements: Installation requires silently running an installer or requires additional configuration steps such as setting environment variables or creating firewall exceptions.

Method:

  • Administrator packages and stages files or installers to Windows Azure storage by using the hpcpack commands.

    See Using hpcpack and hpcsync.

  • Administrator specifies a startup script in the Windows Azure node template to configure environment or run install commands.

    See Use a Startup Script for Windows Azure Nodes.

  • Administrator deploys Windows Azure nodes. Packages are automatically deployed to new Windows Azure node instances and then the startup script runs as part of the provisioning process.

    See Burst to Windows Azure with Microsoft HPC Pack.

Availability: HPC Pack 2008 R2 with SP2

Installation requirements: Installation and data distribution to Windows Azure nodes can occur during a preparation task at the time the job is run.

Method:

  • Administrator uses a Windows Azure storage tool such as AzCopy to upload installers and data files to a container in Windows Azure blob storage.

    See AzCopy – Uploading/Downloading files for Windows Azure Blobs.

  • Administrator packages and stages additional files or installers to Windows Azure storage by using the hpcpack commands.

    See Using hpcpack and hpcsync.

  • Administrator specifies a startup script in the Windows Azure node template to configure environment or run install commands.

    See Use a Startup Script for Windows Azure Nodes.

  • Administrator deploys Windows Azure nodes. Packages are automatically deployed to new Windows Azure node instances and then the startup script runs as part of the provisioning process.

    See Burst to Windows Azure with Microsoft HPC Pack.

  • Administrator creates a job with a node preparation task to perform additional installation steps or to download data from a Windows Azure blob storage container.

    See Define a Node Preparation Task.

Availability: HPC Pack 2008 R2 with SP1

Installation requirements: Installation requires steps that are not easily scripted, and the application and application data can be accessed from a durable drive in Windows Azure.

Method:

  • Administrator uses Windows tools to create a virtual hard disk (VHD) and installs the application and any necessary application data on the VHD.

  • Administrator detaches the VHD and uses the hpcpack upload command to upload the VHD as a page blob to Windows Azure storage.

    See Using hpcpack and hpcsync.

  • Administrator specifies the application VHD that is in Windows Azure storage when configuring the Windows Azure node template.

    See Mount an Application VHD on Windows Azure Worker Nodes.

  • Optionally, administrator specifies a startup script for additional configurations or to set up shared folders on one or more nodes in the deployment.

    See Use a Startup Script for Windows Azure Nodes.

  • Administrator deploys Windows Azure nodes. Windows Azure node instances are created that mount the application VHD, any packages on storage are automatically deployed to the nodes, and then the startup script runs as part of the provisioning process.

    See Burst to Windows Azure with Microsoft HPC Pack.

Availability: HPC Pack 2012

Using hpcpack and hpcsync

Each Windows Azure node deployment is associated with a Windows Azure storage account that is specified in the node template. A cluster administrator can stage files (such as applications, SOA services, XLLs, cluster-enabled Excel workbooks, and utilities) to the storage account by using the hpcpack commands. You can use hpcpack create to package files or folders in a compressed format (.zip) that can be uploaded to Windows Azure storage. Each application, SOA service, or XLL must be packaged separately, and the package must include any required dependencies such as DLLs or configuration files. You can then use hpcpack upload to upload the package to the storage account. You can run the hpcpack commands from the head node, or on a computer that has the HPC client utilities installed.

All packages in the storage account are automatically deployed to new Windows Azure node instances during the provisioning process. This happens when you deploy a set of Windows Azure nodes by using the HPC management utilities, and when your node instances are reprovisioned automatically by the Windows Azure system. The hpcsync command runs on each Windows Azure node, copies all packages from storage to the node, and then extracts the files. If you upload packages to storage after the Windows Azure nodes are started, you can deploy the packages by running the hpcsync command manually on each Windows Azure node.

Note

If you create multiple Windows Azure node templates that reference the same storage account, the same staged files will be deployed to all Windows Azure node sets. To deploy different files to different node sets, create a separate Windows Azure storage account for each Windows Azure node template.

The following diagram illustrates the basic workflow and mechanisms for copying applications to Windows Azure nodes:

[Diagram: Copy applications to Azure Nodes in Windows HPC]

By default, hpcsync extracts files to a location that is specified by the CCP_PACKAGE_ROOT environment variable. This variable is set on Windows Azure nodes during the provisioning process. The extracted files are placed in a folder that is determined as follows:

    %CCP_PACKAGE_ROOT%\<packageName>\<uploadTimeStamp>

This is the expected location for SOA services, XLLs, and Excel workbooks. However, this location is not convenient for applications that cluster users call in their command lines. To simplify the folder structure for executable files, you can set the relative path property for the package when you upload it to storage. hpcsync applies the relative path when extracting the files, so that the path is determined as follows:

    %CCP_PACKAGE_ROOT%\<relativePath>

Users can then specify the path to their application, as in the following example of a job submit command:

    job submit %CCP_PACKAGE_ROOT%\myRelativePath\myapp.exe
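The two extraction path rules described above can be summarized in a short Python sketch. This is an illustration only, not part of HPC Pack: the function name is hypothetical, the upload time stamp is treated as an opaque string because its format is not specified in this topic, and the package folder name is used exactly as it is passed in.

```python
import ntpath  # Windows path semantics, usable on any platform


def extracted_folder(package_root, package_name, upload_timestamp,
                     relative_path=None):
    """Return the folder where a package's files are expected after hpcsync.

    With a relative path:    <package_root>\\<relative_path>
    Without a relative path: <package_root>\\<package_name>\\<upload_timestamp>
    """
    if relative_path:
        return ntpath.join(package_root, relative_path)
    return ntpath.join(package_root, package_name, upload_timestamp)


# Path that a user would pass to "job submit" for an executable
# uploaded with the relative path "myRelativePath".
app_folder = extracted_folder("%CCP_PACKAGE_ROOT%", "myApp",
                              "<uploadTimeStamp>", "myRelativePath")
print(ntpath.join(app_folder, "myapp.exe"))
```

The sketch makes the practical difference visible: without a relative path, the location changes with every upload time stamp, which is why a relative path is recommended for executables that users name in their command lines.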

The following are important considerations about hpcsync and CCP_PACKAGE_ROOT:

  • On Windows Azure worker nodes, the %CCP_PACKAGE_ROOT% folder is created on a 10 GB disk partition. This means that the combined size of all application files on a node instance cannot exceed 10 GB. If an application has considerable input and output files, you can use a startup script to grant user permissions on the C:\ drive so that users can write to all available scratch space on the node.

  • When you run hpcsync manually, you can override the default location (%CCP_PACKAGE_ROOT%). For example, you can create a folder on each Windows Azure node and then specify that location when you run hpcsync. All packages will be extracted to that folder. However, any new node instances that are deployed (or automatically reprovisioned) will not include that folder, and packages will be automatically deployed to the default location. Additionally, cluster users only have write permissions to folders in %CCP_PACKAGE_ROOT%. Unless you modify folder permissions on the Windows Azure nodes, only administrators can run applications outside of %CCP_PACKAGE_ROOT%.

  • When hpcsync deploys a package, none of the extracted files can have a full path length longer than 256 characters. The root directories where the extracted files are temporarily and then finally placed can take up to 136 characters, leaving 120 characters for the file name, subdirectories (if any), and the relativePath (if specified). If the path for the extracted files exceeds 256 characters, the package deployment fails.
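The path length limit in the last consideration can be checked before a package is created. The following Python sketch is a hypothetical pre-flight check, not an HPC Pack utility. It uses the figures stated above (256 characters in total, up to 136 for the root directories, 120 remaining), and it assumes that the remaining budget is counted over the relative path, a separator, and the file's path inside the package.

```python
import ntpath  # Windows path semantics, usable on any platform

MAX_FULL_PATH = 256   # limit on the full path of an extracted file
ROOT_RESERVE = 136    # worst case for the root directories
BUDGET = MAX_FULL_PATH - ROOT_RESERVE  # 120 characters remain


def over_budget(inner_paths, relative_path=""):
    """Return the combined paths that exceed the remaining character budget.

    inner_paths: file paths relative to the top of the package.
    relative_path: the relative path property that will be set at upload, if any.
    """
    too_long = []
    for inner in inner_paths:
        combined = ntpath.join(relative_path, inner) if relative_path else inner
        if len(combined) > BUDGET:
            too_long.append(combined)
    return too_long


# Hypothetical package content: one short path and one deeply nested path.
files = ["bin\\myapp.exe", "data\\" + "nested\\" * 20 + "input.dat"]
for path in over_budget(files, relative_path="myApp"):
    print(f"too long ({len(path)} characters): {path}")
```

Running a check like this on the folder that you plan to package can reveal a failed deployment before any nodes are provisioned.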

The hpcsync mechanism is sufficient for deploying SOA services, XLL files, and applications that can be installed by simply copying files to a node. If you need to run an installer to install an application, or if the application requires additional configuration steps such as setting environment variables, adding firewall exceptions, modifying folder permissions, or creating folders, you can include a startup script in the node template. This script will run during the provisioning process after hpcsync runs, and can be used to configure the nodes and perform the required application installation steps.

How to stage application files to Windows Azure storage

This section provides information about how to package applications and stage them to Windows Azure storage by using hpcpack. Staged packages are automatically deployed to new Windows Azure node instances that you provision (or that are automatically reprovisioned by the Windows Azure system).

Note

You must be a cluster administrator or at least have the Windows Azure subscription ID and storage account key to stage files to Windows Azure storage.

Requirements

If you are packaging a SOA service:

  • The name of the package must be the name of the SOA service (that is, the service name that the SOA client specifies in the SessionStartInfo constructor). For example, serviceName.zip or serviceName_serviceVersion.zip.

  • You must include the service DLL, any dependent DLLs, and the service configuration files in the package.

  • The service configuration file must also be deployed to the head node. All settings are determined by the on-premises copy of the configuration file.

  • Do not specify a relative path when you upload the package. SOA services must be unpacked to the default location.

If you are packaging an XLL file:

  • The name of the package must be the name of the XLL file. For example, XLLName.zip.

  • If the XLL has dependencies, place the XLL and supporting files in a folder and package the folder. The XLL must be in the top level of the folder (not in a subfolder).

  • Do not specify a relative path when you upload the package. XLLs must be unpacked to the default location.

If you are packaging an Excel workbook:

  • The name of the package must be the name of the workbook. For example, workbookName.zip.

  • If the workbook has dependencies, place the workbook and supporting files in a folder and package the folder. The workbook must be in the top level of the folder (not in a subfolder).

  • Do not specify a relative path when you upload the package. Workbooks must be unpacked to the default location.

If you are packaging an executable file (such as an MPI application), application installer, or utility that you will call from a startup script:

  • You must include any dependent DLLs or files in the package.

  • When you upload the package, specify the relative path property.

If you are packaging a startup script:

  • The name of the package must be the name of the startup script. For example, startup.bat.zip.

  • Do not specify a relative path when you upload the package. The startup script must be unpacked to the default location.

  • If your startup script calls installers or utilities, ensure that you package and stage the required files separately.
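The naming and relative path requirements above can be condensed into a small lookup, sketched here in Python. The function and the kind labels are hypothetical helpers for illustration. The sketch assumes that XLL and workbook packages are named after the file name without its extension (as in the myXLL.xll to myXLL.zip example later in this topic), and that startup script packages keep the script's extension (as in startup.bat.zip).

```python
import ntpath  # Windows path semantics, usable on any platform


def expected_package(kind, name, version=None):
    """Return (package_file_name, specify_relative_path) for an item to stage.

    package_file_name is None when the requirements do not constrain the name.
    """
    if kind == "soa":
        # serviceName.zip or serviceName_serviceVersion.zip
        base = f"{name}_{version}" if version else name
        return base + ".zip", False
    if kind in ("xll", "workbook"):
        # Assumption: the package is named after the file without its extension.
        stem = ntpath.splitext(name)[0]
        return stem + ".zip", False
    if kind == "startup_script":
        # The full script name is kept, for example startup.bat.zip.
        return name + ".zip", False
    if kind == "executable":
        # Any package name; a relative path should be specified at upload.
        return None, True
    raise ValueError(f"unknown kind: {kind}")


for kind, name in [("soa", "myServiceName"), ("xll", "myXLL.xll"),
                   ("startup_script", "startup.bat"), ("executable", "myapp.exe")]:
    print(kind, expected_package(kind, name))
```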

Steps

As examples, the following procedures illustrate how to stage various types of application files to Windows Azure storage.

Note

You do not need an elevated command prompt (run as Administrator) to run hpcpack create. However, hpcpack upload requires elevation. To perform the following procedures, run the commands in an elevated command prompt window.

To package and stage a SOA service to Windows Azure storage

  1. If the SOA service is not already registered and deployed to the on-premises cluster, register the SOA service by placing a copy of the service configuration file in the service registration folder on the head node (typically this is %CCP_HOME%\ServiceRegistration). For detailed information, see Deploy and Edit the Service Configuration File.

  2. Copy the service configuration file, the service assembly, and any dependent DLLs to an empty folder. For example, copy the files to a folder named C:\myFiles\myServiceFiles.

  3. At an elevated command prompt, run hpcpack create and specify a name for your package and the folder that contains your service files.

    Important

    The name of the package must be the name of the SOA service (that is, the service name that the SOA client specifies in the SessionStartInfo constructor).

    For example, to package the content of C:\myFiles\myServiceFiles as myServiceName.zip (and save the package to a folder called AzurePackages):

    hpcpack create C:\AzurePackages\myServiceName.zip C:\myFiles\myServiceFiles

  4. Run hpcpack upload to upload the package to Windows Azure storage. In the following example, myHeadNode is the name of your head node, and myAzureTemplate is the name of the template that you used to deploy the Windows Azure nodes:

    hpcpack upload C:\AzurePackages\myServiceName.zip /nodetemplate:myAzureTemplate /scheduler:myHeadNode

To package and stage an XLL file or Excel workbook to Windows Azure storage

  1. If the XLL or workbook has dependencies on DLLs or other files, copy the XLL or workbook and its dependencies to a folder, such as c:\myFiles\myExcelFiles.

  2. At an elevated command prompt, run hpcpack create to package your XLL or workbook. Specify a name for the package and specify the XLL or workbook. The name of the package must be the name of the XLL file or of the Excel workbook.

    For example, if your XLL or workbook has dependencies, package the entire folder (and save the package to a folder called AzurePackages):

    hpcpack create C:\AzurePackages\myXLL.zip C:\myFiles\myExcelFiles

    If your XLL or workbook has no dependencies, you can package it directly. For example, to package C:\myFiles\myXLL.xll as myXLL.zip:

    hpcpack create C:\AzurePackages\myXLL.zip C:\myFiles\myXLL.xll

  3. Run hpcpack upload to upload the package to Windows Azure storage. In the following example, myHeadNode is the name of your head node, and myAzureTemplate is the name of the template that you used to deploy the Windows Azure nodes:

    hpcpack upload C:\AzurePackages\myXLL.zip /nodetemplate:myAzureTemplate /scheduler:myHeadNode

To package and stage an executable file to Windows Azure storage

  1. Copy the executable and any dependencies or DLLs to a folder, such as C:\myFiles\myAppFiles.

  2. At an elevated command prompt, run hpcpack create to package your application files. Specify a name for your package and specify the folder that contains your application files.

    For example, to package the content of c:\myFiles\myAppFiles as myApp.zip (and save the package to a folder called AzurePackages):

    hpcpack create c:\AzurePackages\myApp.zip c:\myFiles\myAppFiles

  3. Upload the package to Windows Azure storage by using the following command, where myHeadNode is the name of your head node, and myAzureTemplate is the name of the template that you used to deploy the Windows Azure nodes. Specify a relative path for the application files. For example:

    hpcpack upload c:\AzurePackages\myApp.zip /scheduler:myHeadNode /nodetemplate:myAzureTemplate /relativepath:myApp
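When several packages are staged regularly, the create and upload steps can be generated from a script. The following Python sketch only builds the command lines and does not run them. The function name is hypothetical, and the sketch uses only the hpcpack arguments that appear in the procedures above (/scheduler, /nodetemplate, and /relativepath).

```python
def build_stage_commands(package_zip, source, head_node, node_template,
                         relative_path=None):
    """Return the argument lists for hpcpack create and hpcpack upload."""
    create = ["hpcpack", "create", package_zip, source]
    upload = [
        "hpcpack", "upload", package_zip,
        f"/scheduler:{head_node}",
        f"/nodetemplate:{node_template}",
    ]
    if relative_path:
        # Only for executables, installers, and utilities. SOA services, XLLs,
        # workbooks, and startup scripts must use the default location.
        upload.append(f"/relativepath:{relative_path}")
    return create, upload


create_cmd, upload_cmd = build_stage_commands(
    package_zip=r"c:\AzurePackages\myApp.zip",
    source=r"c:\myFiles\myAppFiles",
    head_node="myHeadNode",
    node_template="myAzureTemplate",
    relative_path="myApp",
)
print(" ".join(create_cmd))
print(" ".join(upload_cmd))
```

To run the commands instead of printing them, each list could be passed to subprocess.run(cmd, check=True) from an elevated prompt on the head node or on a computer that has the HPC client utilities installed.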

How to deploy staged packages to Windows Azure nodes

Packages that are staged to Windows Azure storage are automatically deployed to new node instances. You can also deploy packages manually, for example, to verify that a package contains all the necessary dependencies before you automate the deployment to all new nodes, or to deploy packages to nodes that are already running. To do this, use clusrun and hpcsync to deploy the files from the Windows Azure storage account to the Windows Azure nodes.

For example:

clusrun /nodegroup:AzureWorkerNodes hpcsync

To see a list of folders or files that have been deployed to the Windows Azure nodes, you can run the following command:

clusrun /nodegroup:AzureWorkerNodes dir %CCP_PACKAGE_ROOT% /s

Accessing files from Windows Azure nodes

If your HPC application requires file access, the following are options for accessing files from the applications that are deployed to Windows Azure nodes.

Option: Windows Azure drive

Prerequisites: Administrator configures and mounts an application VHD on Windows Azure nodes.

See Mount an Application VHD on Windows Azure Worker Nodes.

Notes:

  • Drive is copied and cached locally on each node.

  • Drive can be written by only one node at a time.

Option: File server on Windows Azure Virtual Machine

Prerequisites: Administrator configures a Windows Azure virtual machine instance, attaches a data disk to the virtual machine, enables the File Server role, and creates a file share folder.

See How to Configure a Windows Azure Virtual Machine File Server and Use it from within Windows HPC Server Compute Jobs.

Notes:

  • Access by jobs to file server resources requires authenticated user access.

  • Provides SMB compatibility for existing applications.

  • Provides a maximum of 16 TB of data per server.

  • Limits bandwidth to 800 Mb/s.

Option: Mirror local files to Windows Azure blob storage

Prerequisites: Administrator uses a Windows Azure storage tool such as AzCopy to mirror on-premises files to a container in Windows Azure blob storage.

See AzCopy – Uploading/Downloading files for Windows Azure Blobs.

Notes:

  • Mirroring data from on-premises environment to Windows Azure adds overhead.

Option: Access Windows Azure blob storage directly

Prerequisites: Application is architected to perform data access operations directly on Windows Azure blobs.

Notes:

  • Provides highest scalability of available options.

Additional references