Install and License Parallel Applications

Applies To: Windows Compute Cluster Server 2003

Cluster computing is characterized by single applications running on multiple computers. This can sometimes require multiple installations of the program (one per compute node) and typically requires multiple licenses.

An average-sized program can usually be installed in one central location and run exactly as if it were a serial program. However, for larger programs and for some interactive programs, it can be necessary to install the program on all compute nodes to be used by a job.

License requirements usually scale with the number of compute nodes or processes. As a result, a floating or site license is usually the best choice. Because a job fails if too few licenses are available when it is activated, a license-aware activation filter is a highly useful custom addition when using floating licenses.

This topic describes how jobs are run on Compute Cluster Server 2003 and how to select the best installation and licensing options for your application. It also provides general steps for implementing the most common installation scenarios.

Note

Installing and licensing user applications for parallel computing is usually a coordinated effort between the cluster administrator and the user, requiring the administrator's permissions and oversight over shared resources and the user's specific knowledge of the application and its user base.

This section contains the following topics:

  • How Compute Cluster Server 2003 runs programs

  • Variables to consider in installing and licensing applications

  • Installing parallel programs cluster-wide

  • Steps for installing and running programs

How Compute Cluster Server 2003 runs programs

As explained in Task Execution, user applications are invoked by Job Scheduler when the job is activated. Specifically, Job Scheduler invokes the task command line on the compute node that it designates for the task. Unless the program is installed on that designated node, the operating system copies it there. If a license token is required, the task process will then obtain it from where it is stored, locally or on a license server. When the task is complete, the copy is erased.

Variables to consider in installing and licensing applications

The best way to install and license applications in Windows Compute Cluster Server 2003 depends on four variables:

  • Program size

  • The host or hosts

  • License requirements

  • Special cases

Program size

If an application is not installed on a compute node, the compute node needs to load a copy of it before it can execute it. Because downloading files takes time, even in parallel, file size is usually the primary determinant of where the application should be installed. A small application can be downloaded very quickly from almost anywhere. Large applications, on the other hand, can be slow to load, consuming much of the time that is saved by parallel processing. This is particularly true in the case of a remote application server when the compute nodes have no direct access to the public network.

Therefore, as a rule, large applications should be installed locally on the cluster and very large applications should be installed on all the compute nodes that will use them. Exceptions are programs that are changed often or those that are so long-running that download time is not an important factor.
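As a rough illustration of this trade-off, the following Python sketch compares the time needed to load an application with the time saved by running it in parallel. The function names and all of the numbers (application sizes, throughput, run time, node count) are hypothetical values chosen for illustration, not measurements from Compute Cluster Server 2003, and the model assumes ideal speedup.

```python
def load_time_seconds(size_mb, throughput_mb_per_s):
    """Time to copy an application of size_mb at the given throughput."""
    return size_mb / throughput_mb_per_s


def load_overhead_fraction(size_mb, throughput_mb_per_s, serial_run_s, nodes):
    """Fraction of the time saved by parallel execution that is spent loading.

    Assumptions (illustrative only): ideal speedup, so the parallel run
    takes serial_run_s / nodes, and every node loads its copy concurrently
    at the same throughput.
    """
    parallel_run_s = serial_run_s / nodes
    time_saved_s = serial_run_s - parallel_run_s
    return load_time_seconds(size_mb, throughput_mb_per_s) / time_saved_s


# Hypothetical job: 10 minutes when run serially, spread over 8 nodes,
# with 10 MB/s effective throughput from the application server.
small = load_overhead_fraction(5, 10, 600, 8)      # 5 MB application
large = load_overhead_fraction(2000, 10, 600, 8)   # 2 GB application
print(f"small: {small:.1%}, large: {large:.1%}")
```

Under these assumed numbers the small application costs a negligible share of the time saved, while the large one consumes more than a third of it, which is the situation in which a local or per-node installation pays off.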

The host or hosts

An application can be installed on a remote application server or within the cluster, and within the cluster it can be installed on a local application server or on each compute node that will run it. As described with reference to program size, each of these options is more efficient than the last in shortening or eliminating application download time. The choice, however, can be limited by the amount of storage space on the compute nodes, particularly for systems with many users. If an application server is the only option, the following recommendations apply:

  • A local application server (one that is part of the private cluster network) is more efficient than a remote application server. A local application server can be dedicated or it can be a node that is part of the cluster (a head node or a compute node).

  • If a remote application server is used, the compute nodes must be on the public network or the cluster must be configured with network address translation (NAT) on the head node.

Installing an application on all compute nodes not only eliminates download time but also allows the application to be called by its executable name alone. This is because whichever compute node Job Scheduler chooses as the designated node, the application will be local to it. If the application is installed in one location only, it must be called by a UNC path to that location.

License requirements

License requirements depend on the software provider. A licensed program generally requires a separate license token for each node or each process that runs concurrently, whether or not the program is installed on the nodes. Typically, a site license or floating license scheme provides the best solution for parallel computing. For a floating license, some means of license scheduling is recommended.

License scheduling is a scheduling policy, like first come, first served (FCFS) and backfilling. Unlike those policies, however, license scheduling for parallel applications is not implemented by Job Scheduler. Instead, it must be implemented by the user through the choice of run time, or by adding an optional activation filter. For more information and sample source code for a license scheduling activation filter, see Adding Job Submission and Activation Filters.

Using an activation filter, you can treat available licenses as a resource, requesting the number of licenses that corresponds to the number of processors requested, and postponing job activation until the minimum number of licenses becomes available. Without the filter, the job is activated with or without the required number of licenses; if there are too few licenses, the job fails.
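The decision that a license-aware filter makes can be sketched in a few lines of Python. This is only an illustration of the logic described above, not the sample filter from Adding Job Submission and Activation Filters: the JobRequest type, the should_activate function, and the one-license-per-processor default are all hypothetical, and the way a real filter receives job data from Job Scheduler and reports its decision is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class JobRequest:
    """Hypothetical description of a queued job (not a Job Scheduler type)."""
    name: str
    min_processors: int
    max_processors: int


def should_activate(job, available_licenses, licenses_per_processor=1):
    """Return True to activate the job now, False to postpone it.

    Licenses are treated as a resource: the job is held until at least
    the number of licenses needed for its minimum processor count is free.
    The one-license-per-processor default is an assumption; the actual
    ratio depends on the software provider.
    """
    minimum_needed = job.min_processors * licenses_per_processor
    return available_licenses >= minimum_needed


job = JobRequest(name="render", min_processors=4, max_processors=8)
print(should_activate(job, available_licenses=3))  # too few licenses: postpone
print(should_activate(job, available_licenses=4))  # minimum is met: activate
```

Holding the job in the queue in the first case is what prevents the failure that would otherwise occur when the application cannot obtain its license tokens.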

Special cases

Multiple, per-node installations of the program are required in three special cases:

  • When the software provider requires it.

  • When running .NET programs. Because of enhanced security, .NET programs cannot always be downloaded and run locally and should be installed on all compute nodes.

  • When the application adds registry entries during setup.

Installing parallel programs cluster-wide

Most programs can be installed simultaneously on all compute nodes using the clusrun command. For example:

clusrun \\installshare\myapp\setup.exe /unattend /install_option=1

Steps for installing and running programs

Parallel programs may reside on the head node only, on a file server only, or on the head node and each compute node. The following procedures show how best to install and set up programs for each scenario and how to invoke them in a task command line to run as a task.

To set up and run a parallel program that is installed on an application server

  1. Install the program on the application server.

  2. If the folder containing the program is not already shared, set the folder properties to allow sharing, as required by the operating system on the server. If this is a Windows operating system, right-click the folder, click Properties, click the Sharing tab, and then click Share this folder.

  3. When creating a job, enter the program file name as the task command line, using the UNC path for the remote program. For example:

    \\Applicationserver\Apps\myapp.exe

    Note

    In the CLI, directory names containing spaces must be placed in double quotes.

To set up and run a parallel program that is installed on all cluster nodes

  1. Install the program on all compute nodes.

  2. If the program install utility does not automatically set a search path, set a cluster-wide environment variable Path as follows:

    cluscfg setenvs Path=<search_path>

    where search_path is the full path to the folder that contains the executable.

  3. When creating a job, at the task command line, enter the executable name alone. This will cause each compute node to locate and run the local copy of the executable. For example:

    myapp.exe
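The two procedures above differ only in how the task command line is formed. As a minimal sketch, the following Python helper builds the command line for either case and adds double quotes when the path contains spaces. The function name task_command_line and the share name "Shared Apps" are hypothetical and used only for illustration.

```python
def task_command_line(executable, share=None):
    """Build a task command line for a program.

    share is the UNC path of the shared folder when the program is
    installed in one location only; leave it as None when the program is
    installed on every compute node and can be found by name alone.
    """
    if share is None:
        command = executable
    else:
        # Join the share and the executable with a single backslash.
        command = share.rstrip("\\") + "\\" + executable
    # Paths that contain spaces must be placed in double quotes.
    if " " in command:
        command = f'"{command}"'
    return command


print(task_command_line("myapp.exe"))
print(task_command_line("myapp.exe", r"\\Applicationserver\Apps"))
print(task_command_line("myapp.exe", r"\\Applicationserver\Shared Apps"))
```

The first call yields the bare executable name for a per-node installation, the second yields the UNC path used in the application server procedure, and the third shows the same path quoted because the hypothetical share name contains a space.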

See Also

Concepts

Adding Job Submission and Activation Filters
Integrating with Interactive Programs