Using Windows Compute Cluster Server 2003 Job Scheduler

Applies To: Windows Compute Cluster Server 2003

Windows Compute Cluster Server 2003 operations overview

Microsoft® Windows® Compute Cluster Server 2003 brings high-performance computing (HPC) to industry standard, low cost servers. Jobs—discrete activities scheduled to perform on the compute cluster—are the key to Windows Compute Cluster Server 2003 operation. Cluster jobs can be as simple as a single task or can include multiple tasks. In some situations, tasks are serial, running as single processes; in others, they are parallel, running in multiple, intercommunicating processes. In either case, the ordering of the tasks in a job may be arbitrary or it may be determined by dependencies among the tasks. In addition, jobs and tasks can be targeted to specific nodes within the cluster. Nodes can be reserved exclusively for jobs or tasks or can be shared among jobs and tasks.

Note

For general information about Windows Compute Cluster Server 2003 features and capabilities, see the white paper Overview of Microsoft Windows Compute Cluster Server 2003 (https://go.microsoft.com/fwlink/?LinkId=56090). For deployment information, see the white paper Deploying and Managing Microsoft Windows Compute Cluster Server 2003 (https://go.microsoft.com/fwlink/?LinkId=55927).

The basic principle of job operation in Windows Compute Cluster Server 2003 relies on three key concepts:

  • Admission, or job submission

  • Allocation, or the reserving of resources

  • Activation, or job start

These three concepts form the underlying structure of the job life cycle in high-performance computing and are the basis on which Microsoft engineered Windows Compute Cluster Server 2003. Figure 1 illustrates the core relationship between each aspect of job operation. Each time a user prepares a job to run in the compute cluster, the job runs through the three stages.

Job life cycle

Figure 1. The HPC job life cycle

To understand how a job operates within a compute cluster, users must understand the components that make up a cluster. Figure 2 illustrates these components and their relationship to one another. In this figure, the dashed line denotes the cluster itself. Several external elements are required in support of the cluster, including Microsoft® Active Directory® directory service. (All nodes in a cluster must belong to the same Active Directory domain; in this case, the production Active Directory domain.) Other supporting elements may include a licensing server for applications and external data sources as well as user consoles. In this example, cluster nodes are connected by several network links: a public link to access data and the licensing server; a private link for intra-cluster communications; and a high-speed, low-latency link for parallel computation.

Elements of the cluster

Figure 2. Elements comprising a compute cluster

The cluster itself consists of the head node and compute nodes. The head node runs the job scheduler and is where compute nodes are added or removed and where job and node status is viewed. In other words, the head node manages cluster operations. Compute nodes execute the tasks that make up each job.

When a user submits a job to the cluster, the job is recorded in the head node database along with its properties, entered into the execution queue, and then run when the resources it requires become available. Because jobs are submitted in the context of the user and the user's domain, jobs execute using that user’s permissions. As a result, the complexity of using and synchronizing different credentials is eliminated, and the user does not have to use different methods of sharing data or compensate for permission differences among different operating systems. This means that Windows Compute Cluster Server 2003 offers transparent execution, access to data, and integrated security.

Windows Compute Cluster Server 2003 nomenclature

Windows Compute Cluster Server 2003 uses specific terminology. Users need to be familiar with the following terms to use the product effectively.

Cluster

A cluster is the top-level organizational unit of Windows Compute Cluster Server 2003. A cluster consists of the following elements:

  • Node. A single computer with one or more processors.

  • Queue. An organizational unit that provides queuing and job scheduling. Each cluster contains only one queue, and that queue contains pending, running, and completed jobs. Completed jobs are purged periodically from the queue.

  • Job. A collection of tasks that a user initiates. Jobs are used to reserve resources for subsequent use by one or more tasks.

Tasks

A task represents the execution of a program on given compute nodes. A task can be a serial program (single process) or a parallel program with multiple concurrent processes. Figure 3 illustrates common job and task types.

Common job and task types

Figure 3. Common job and task types

Compute Cluster Job Scheduler

The Job Scheduler queues jobs and their tasks. It allocates resources to these jobs, initiates the tasks on the compute nodes of the cluster, and monitors the status of jobs, tasks, and compute nodes. Job scheduling is performed through a set of rules called scheduling policies. Figure 4 illustrates the job scheduler stack and its interaction with the job life cycle. The stack consists of three layers, each corresponding to one aspect of the job life cycle:

  • The interface layer provides job and task submission, manipulation, and monitoring services accessible through various entry points.

  • The scheduling layer provides a decision-making mechanism that balances supply and demand by applying scheduling policies. The workload is distributed across the available nodes in the cluster, implementing the concepts of job priorities, equitable sharing, and allocation of resources.

  • The execution layer provides the workspace used by tasks. This layer creates and monitors the job execution environment and releases the resources assigned to the task upon task completion. The execution environment supplies the workspace customization for the task, including environment variables, scratch disk settings, security context, and execution integrity, as well as application-specific starting mechanisms and recovery from system interruptions.

Software stack

Figure 4. The job scheduler stack and its interaction with the job life cycle

Compute Cluster Administrator

The Compute Cluster Administrator is a Microsoft Management Console (MMC) snap-in that allows easy compute cluster administration and deployment. The Compute Cluster Administrator is not typically used when running jobs and tasks in the cluster.

Compute Cluster Job Manager

The Compute Cluster Job Manager is a Win32 graphical user interface (GUI) that provides access to the Job Scheduler for the creation, submission, and monitoring of jobs in the cluster.

Scheduling policies

Windows Compute Cluster Server 2003 supports four scheduling policies:

  • Priority-based first come, first served (FCFS)

  • Backfilling

  • Nonexclusive scheduling

  • License-aware scheduling

Each policy is described in detail in the Allocation section later in this paper.

Task execution

Tasks operate in either serial or parallel modes. In serial mode, each task runs as a single process and parallelism consists of more than one such task running at the same time. Figure 5 illustrates how task 1 is assigned to the first processor on the first node, then task 2 is assigned to the second processor, task 3 moves to the first processor of the second node, and so on.
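The placement order described above can be pictured with a small model. The following is only a sketch of the pattern shown in Figure 5, not the Job Scheduler's implementation; the function name place_serial_tasks and the node names are hypothetical.

```python
from typing import List, Tuple


def place_serial_tasks(
    num_tasks: int, nodes: List[Tuple[str, int]]
) -> List[Tuple[int, str, int]]:
    """Assign serial tasks to processors, filling each node before moving on.

    nodes is a list of (node_name, processor_count) pairs.
    Returns (task_number, node_name, processor_number) triples.
    Tasks beyond the number of processors are not placed (they would wait).
    """
    # Enumerate every processor slot in node order: node1/cpu1, node1/cpu2, ...
    slots = [(name, cpu) for name, count in nodes for cpu in range(1, count + 1)]
    placements = []
    for task_number, (node, cpu) in zip(range(1, num_tasks + 1), slots):
        placements.append((task_number, node, cpu))
    return placements


if __name__ == "__main__":
    for task, node, cpu in place_serial_tasks(3, [("node1", 2), ("node2", 2)]):
        print(f"task {task} -> {node}, processor {cpu}")
```

With two dual-processor nodes, task 3 lands on the first processor of the second node, which matches the description of Figure 5.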

In parallel mode, a single task runs on multiple processors. Figure 6 illustrates a task running in parallel mode. Parallel tasks typically call upon Microsoft® Message Passing Interface (MPI) software (called MS MPI) through the executable mpiexec, which is installed on each compute node. Task processes are then started by the node-specific MS MPI Service. Each node can run only one instance of the service, but parallel tasks can call on several nodes to start MS MPI Services. The MS MPI Service on each node, in turn, executes the processes that make up the task. Windows File Sharing supports client-side caching, so an application has to be loaded from the file share only once. After it has been loaded, the application resides on the compute node’s local disk, which speeds processing. Client-side caching must be configured on the file server, which can be done when the file share is created.

The scheduler keeps track of each task and job through task and job ID numbers. It relies on these ID numbers to display task and job status information to Windows Compute Cluster Server 2003 users.

Serial task flow

Figure 5. Serial task execution

Parallel task flow

Figure 6. Parallel task execution

Admission

Several interfaces are available for job submission:

  • Compute Cluster Job Manager

  • The command-line interface (CLI)

  • A COM interface (CCPAPI) for integration with C or C++ custom interfaces and for scripting support

The CLI can also be called from programs and scripts written in a variety of languages, including Perl, C/C++, C#, and Java™.

The Compute Cluster Job Manager includes powerful features for job and task management, and each feature has a corresponding equivalent in the CLI. Features are controlled by the Job Scheduler and include error recovery (that is, automatically retrying failed jobs or tasks and identifying unresponsive nodes); automated cleanup of jobs after they are complete to avoid “runaway” processes on the compute nodes; and security mechanisms (that is, each job runs in the user’s security context, limiting job and task access rights to those of the user initiating them).

As stated earlier, a job is a resource request that contains one or more tasks to be run within the cluster. Each task that makes up the job may in turn be either serial or parallel or a combination of both. An example of a job executing several serial tasks in parallel is a parametric sweep. Parametric sweeps consist of multiple iterations of the same executable that are run concurrently but use different input and output files. There is typically no communication between the tasks, and parallelism is achieved by the scheduler running multiple instances of the same application at the same time.

Users submit jobs to the system, and each job runs with the credentials of the user submitting the job. Jobs can be assigned priorities upon submission. By default, users submit jobs with the Normal priority. If the job needs to run sooner, cluster administrators can assign a higher priority to the job, such as AboveNormal or Highest. Because there is only one job queue, jobs with the highest priority tend to run first.
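The effect of priorities on the single job queue can be illustrated with a short model. This is a sketch under stated assumptions rather than the scheduler's actual code: the JobQueue class, its methods, and the is_admin flag are hypothetical, while the priority names and the rule that Highest and AboveNormal are reserved for administrators come from this paper.

```python
from dataclasses import dataclass
from itertools import count

# Priority groups from highest to lowest, as listed in Table 1.
PRIORITY_ORDER = ["Highest", "AboveNormal", "Normal", "BelowNormal", "Lowest"]
ADMIN_ONLY = {"Highest", "AboveNormal"}


@dataclass
class QueuedJob:
    name: str
    priority: str
    sequence: int  # submission order, used for first come, first served


class JobQueue:
    """Hypothetical model of one queue ordered by priority group, then arrival."""

    def __init__(self) -> None:
        self._jobs = []
        self._sequence = count(1)

    def submit(self, name: str, priority: str = "Normal", is_admin: bool = False) -> QueuedJob:
        if priority not in PRIORITY_ORDER:
            raise ValueError(f"unknown priority: {priority}")
        if priority in ADMIN_ONLY and not is_admin:
            raise PermissionError(f"{priority} is available only to administrators")
        job = QueuedJob(name, priority, next(self._sequence))
        self._jobs.append(job)
        return job

    def ordered(self) -> list:
        """Return job names in the order the queue would consider them."""
        ranked = sorted(
            self._jobs,
            key=lambda j: (PRIORITY_ORDER.index(j.priority), j.sequence),
        )
        return [j.name for j in ranked]
```

A job submitted with a higher priority moves ahead of every lower-priority job, but it still waits behind jobs that were submitted earlier at the same priority.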

Jobs also consume resources—nodes, processors, and time—and these resources can be reserved for each specific job. Although one might assume that the best way to execute a job is to reserve the fastest and most powerful resources in the cluster, in fact, the opposite tends to be true. If a job is set to require high-powered resources, it may have to wait for those resources to be free so that the job can run. Jobs that require fewer resources with shorter time limits tend to be executed more quickly, especially if they can fit into the backfill windows (that is, the idle time available to resources) that the Job Scheduler has identified.

Another factor that affects execution time is node exclusivity. By default, jobs require node exclusivity. However, if jobs are defined to require nonexclusive node use, they have faster access to resources because resources can be shared with other jobs. (Any idle resource that can be shared can run other jobs as soon as it is available.)

Creating jobs

You create a job by first specifying the job properties, including priority, the run time limit, the number of required processors, requested nodes, and node exclusivity. After defining the job properties, you can assign tasks to the job. Each task must include the command-line commands to be executed; input, output, and error files to be used; as well as properties similar to those of the job in terms of requested nodes, required processors, the run time limit, and node exclusivity. Tasks also include dependency information, which dictates a specific order in which tasks must run.
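To see how dependency information constrains the order of tasks, consider the following sketch. It uses Python's standard graphlib module (Python 3.9 or later) to group tasks into waves that could run at the same time. The function name execution_waves and the task names are hypothetical, and the Job Scheduler's own dependency handling may differ.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def execution_waves(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into waves; every task runs after all tasks it depends on.

    dependencies maps a task name to the set of task names it must wait for.
    Raises graphlib.CycleError if the dependencies form a cycle.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Tasks whose dependencies have all finished can run concurrently.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    deps = {
        "mesh": set(),
        "solve_a": {"mesh"},
        "solve_b": {"mesh"},
        "report": {"solve_a", "solve_b"},
    }
    for number, wave in enumerate(execution_waves(deps), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

Tasks without dependencies between them, such as solve_a and solve_b here, can be scheduled in any order or in parallel.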

You can use either the Compute Cluster Job Manager or the CLI to create jobs. In the Compute Cluster Job Manager, select File > Submit… to open the job dialog box. On the General tab, enter job details such as the name, the project name (if appropriate), and the name of the submitter. Then select the priority and switch to the Processors tab to identify the minimum and maximum number of processors for the job. This tab also allows you to set the run time duration. Next, move to the Tasks tab to create the tasks associated with the job.

To create a job through the CLI, type the following command:

job new [standard_job_options] [/jobfile:<job_file>] [/scheduler:<host>]

where job_file is an optional template file containing previously set options and host is the name of the head node of the cluster on which the job will run. Note that job files can easily be created in the Compute Cluster Job Manager by saving jobs as templates. Table 1 lists the standard properties available with the job command. Note that when dealing with job priorities, users have access only to Normal, BelowNormal, and Lowest, while administrators have access to all the available priorities. If users need to have their jobs run sooner than Normal priority would allow, they must ask a cluster administrator to increase the priority of their jobs.

Table 1. Windows Compute Cluster Server 2003 Job Properties

Property              Description
Project Name          Specifies the project name with which the job is associated, for accounting purposes.
Job Name              Name of job.
Number of Processors  Specifies the minimum and maximum number of compute processors to be reserved across a set of nodes.
Asked Nodes           Specifies a list of nodes eligible to be used by the job.
Priority              Priority group in which the job is placed in the queue: Highest, AboveNormal, Normal, BelowNormal, or Lowest. (Highest and AboveNormal are available only to administrators.)
Run Time              Specifies the run time limit of the job.
Run Until Canceled    Tells the job to reserve its resources until the job is canceled or the run time limit is reached.
Exclusivity           Specifies whether the nodes are allocated to the job exclusively.
License               Specifies the license features and number of each required for the job to run.

Adding tasks to a job

Tasks are the discrete commands that jobs execute. You specify tasks through the Tasks tab on the Job Properties sheet. Each task is named, but task names do not have to be unique within a job. Tasks consist of executables that run on cluster resources, so when you create a task, you must enter a command-line command to tell the system which executable to run. If the task uses a Microsoft MPI executable, the task command must be preceded by mpiexec. Tasks can run executables directly or can consist of batch files performing multiple activities.

Note

To run tasks more efficiently, you can copy the executable to each compute node.

To create a task, select the Tasks tab of the Job Properties sheet. Then enter the command line required to execute the task and click Add Task. The task appears in the Task window, where you can further refine it with elements such as estimated number of processors and run time duration.

Click Edit on the Tasks tab to open the Task Properties sheet, where a new set of tabs appears: Task, Task Dependencies, Environment, and Advanced. The Task tab is designed for entering input and output files, estimated number of processors and run time duration. The Task Dependencies tab allows dependencies to be set between tasks and is mostly used for serial tasks. The Environment tab is designed for the integration of environment variables to the task. The Advanced tab supports the addition of information such as working folder and the nodes required to run the task, restarting the task if the task fails, and setting a checkpoint on the task. All these values are optional to the task.

Repeat the procedure for each task included in the job. Figures 7 and 8 show the interface for adding a task and the added task, respectively.

Entering task command line

Figure 7. Adding a single task

Task added

Figure 8. The added task

To create a task through the CLI, you type the following command:

job add <jobID> [standard_task_options] [/taskfile:<template_file>] <command> [arguments]

where jobID is the number of the job and command is the task command line. Table 2 lists the standard task options.

Table 2. Windows Compute Cluster Server 2003 Task Properties

Property                    Description
Task Name                   Name of task.
Environment Variables       Environment variables for the task, with name and value of each.
Working Directory           Directory where the task looks for input files and writes output files.
Number of Processors        Specifies the minimum and maximum number of compute processors to be reserved across a set of nodes.
Required Nodes              Nodes that must be reserved for a task.
Exclusivity-Nonexclusivity  Exclusive allocation of nodes for a task.
Dependency                  Inter-task dependencies.
Run Time                    Specifies the run time limit of the task.
Rerunnable                  Specifies that a task is to be rerun automatically after a failure if the failure is due to system error.
Checkpointable              Specifies that the task is checkpointable.
Input File                  Redirect standard input of the task from this file.
Output File                 Redirect standard output of the task to this file.
Error File                  Redirect standard error of the task to this file.

One key factor in admission is determining how the tasks will access the data they require to run. This determination depends on the amount of data and on how often the data changes. If a data set is relatively large, stable, and rarely changed, it can be stored locally on the compute nodes. If the data set is small, it can be placed on a file share, and compute nodes can simply access it from the shared folder. If the data set is large and changes often, a file transfer will be necessary to place the data on the nodes. If you are using small and medium data set sizes, you will have the best out-of-box experience by specifying the working directory. When the task starts, compute nodes see all the files in this working directory and can properly handle the task.
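The data access guidance above amounts to a simple decision rule, which can be written down as follows. This is a sketch only: the function choose_placement, the enumeration names, and especially the 100 MB threshold that separates "small" from "large" are assumptions made for illustration, because the paper gives no numeric limit.

```python
from enum import Enum


class DataPlacement(Enum):
    LOCAL_COPY = "store the data locally on each compute node"
    FILE_SHARE = "read the data from a shared folder"
    FILE_TRANSFER = "transfer the data to the nodes before the task runs"


def choose_placement(
    size_mb: float, changes_often: bool, small_threshold_mb: float = 100.0
) -> DataPlacement:
    """Pick a data access method from data set size and change frequency.

    small_threshold_mb is a hypothetical cutoff; tune it for your cluster.
    """
    if size_mb <= small_threshold_mb:
        # Small data sets are simply accessed through a file share.
        return DataPlacement.FILE_SHARE
    if changes_often:
        # Large and frequently changing data must be transferred to the nodes.
        return DataPlacement.FILE_TRANSFER
    # Large and stable data can live on the nodes' local disks.
    return DataPlacement.LOCAL_COPY


if __name__ == "__main__":
    print(choose_placement(20, changes_often=True).value)
    print(choose_placement(5000, changes_often=False).value)
    print(choose_placement(5000, changes_often=True).value)
```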

Jobs that must work with parallel tasks through MPI require the use of the mpiexec command, so commands for parallel tasks must be in the following format:

mpiexec [mpi_options] <myapp.exe> [arguments]

where myapp.exe is the name of the application to run. MPI options include standard options supported by the Argonne MPICH2 distribution as well as extensions that have been added to MS MPI. You can submit MPI jobs either through the Job Manager or the command line. In practice, mpiexec options rarely need to be used, since most are set indirectly by the job and task options.

Figure 9 shows how you can quickly build a series of parametric tasks. Through this dialog box, you can generate a series of task command lines consisting of multiple iterations of the same command, automatically incrementing the file extensions of input, output, and error files that the program generates. The increment can be of any size.

Adding parametric sweep

Figure 9. Adding a parametric sweep task
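The kind of task list that the parametric sweep dialog box produces can be approximated in a few lines of code. The sketch below is not the dialog box's actual logic: the function sweep_tasks, the dictionary keys, and the file naming pattern (input.N, output.N, error.N) are hypothetical and chosen only to show the incrementing extensions.

```python
from typing import Dict, List


def sweep_tasks(command: str, start: int, end: int, step: int = 1) -> List[Dict[str, str]]:
    """Generate one task per index, each with its own input, output, and error file.

    The same command is repeated; only the numeric file extension changes.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    tasks = []
    for index in range(start, end + 1, step):
        tasks.append(
            {
                "command": command,
                "stdin": f"input.{index}",
                "stdout": f"output.{index}",
                "stderr": f"error.{index}",
            }
        )
    return tasks


if __name__ == "__main__":
    for task in sweep_tasks("myapp.exe", 1, 7, step=3):
        print(task["command"], task["stdin"], task["stdout"], task["stderr"])
```

Because the tasks do not communicate with one another, each generated entry can be added to the job as an independent serial task.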

In addition to the job and task commands in the CLI, you can call on the cluscfg command, which provides access to information about the cluster itself. The options available for the job, task, and cluscfg commands appear in Tables 3, 4, and 5.

Table 3. Job-related CLI Commands

Job Commands                          Action
job new [job_terms]                   Create a job container
job add jobID [task_terms]            Add a task to a job
job submit /id:jobID                  Submit a job created through the job new command
job submit [job_terms] [task_terms]   Submit a job
job cancel jobID                      Cancel a job
job modify [options]                  Modify a job
job requeue jobID                     Requeue a job
job list                              List jobs in the cluster
job listtasks                         List the tasks of a job
job view jobID                        View details of a job

Table 4. Task-related CLI Commands

Task Commands        Action
task view taskID     View details of a task
task cancel taskID   Cancel a task
task requeue taskID  Requeue a task

Table 5. Cluster-related CLI Commands

Cluster Commands           Action
cluscfg view               View details of a cluster
cluscfg params/setparams   View or set configuration parameters
cluscfg listenvs/setenv    List or set cluster-wide environment variables
cluscfg delcreds/setcreds  Delete or set user credentials

Working with templates and submitting jobs

Jobs and tasks can be saved as templates by clicking Save as Template in the Job and Task Properties sheets. Template files are saved in Extensible Markup Language (XML) format and can therefore be edited in a text editor after they are created. It is good practice to save any recurring job or task as a template. Not only do the templates facilitate job or task recreation, they also support job and task submission through the command prompt window.

Job submission means placing items into the job queue. You can create and submit jobs interactively or from a job template. To create and submit a job interactively, go through each tab on the Job Properties sheet to ensure that the settings are correct and that the tasks you want are specified, and then click Submit from any tab. To submit jobs from templates, select File > Submit with Template… and choose the appropriate template. Then modify its properties (if necessary) and click Submit from any tab.

You can also submit jobs through simple command-line commands. For example, the following command both creates and submits an MPI job that requires eight processors:

job submit /numprocessors:8 mpiexec mympitask.exe

where mympitask.exe is the name of the submitted application.
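When the CLI is driven from a script, it helps to build each command as an argument list. The helper functions below are a hypothetical sketch that uses only the options shown in this paper (/jobfile, /scheduler, /taskfile, /numprocessors, and /id); they do not run anything and do not parse the CLI's output, whose format is not described here.

```python
from typing import List, Optional, Sequence


def job_new_args(job_file: Optional[str] = None, scheduler: Optional[str] = None) -> List[str]:
    """Arguments for 'job new', which creates an empty job container."""
    args = ["job", "new"]
    if job_file:
        args.append(f"/jobfile:{job_file}")
    if scheduler:
        args.append(f"/scheduler:{scheduler}")
    return args


def job_add_args(job_id: int, command: Sequence[str], task_file: Optional[str] = None) -> List[str]:
    """Arguments for 'job add', which adds one task to an existing job."""
    args = ["job", "add", str(job_id)]
    if task_file:
        args.append(f"/taskfile:{task_file}")
    return args + list(command)


def job_submit_args(job_id: int) -> List[str]:
    """Arguments for 'job submit /id:', which queues a job made with 'job new'."""
    return ["job", "submit", f"/id:{job_id}"]


def quick_submit_args(num_processors: int, command: Sequence[str]) -> List[str]:
    """Arguments for the one-step form that creates and submits a job."""
    return ["job", "submit", f"/numprocessors:{num_processors}"] + list(command)


if __name__ == "__main__":
    print(" ".join(quick_submit_args(8, ["mpiexec", "mympitask.exe"])))
```

On a computer where the Compute Cluster Pack command-line tools are installed, such a list could be passed to subprocess.run. The job ID needed by job add and job submit /id: would have to be taken from the result of job new.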

Using the Job Scheduler C# API: Compute Cluster Pack Application Programming Interface (CCPAPI)

CCPAPI provides access to the Windows Compute Cluster Server 2003 Job Scheduler. By writing applications or scripts using these interfaces, you can connect to a cluster and manage jobs, job resources, tasks, nodes, environment variables, extended job terms, and more.

In its simplest terms, using CCPAPI is a five-step process.

  • Connect to the cluster

  • Create a job

  • Create a task

  • Add the task to the job

  • Submit the job for execution

To connect to the cluster, use the ICluster::Connect method. Create a job using the ICluster::CreateJob method. To create a task, use the ICluster::CreateTask method. To add a child task to a job, use the ICluster::AddTask method. Finally, submit the job using ICluster::SubmitJob().

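The sample code below shows how CCPAPI can be used, following the five-step process outlined above. Note that the sample adds the task through the job object (job.AddTask) and calls cluster.AddJob to obtain a job ID before it submits the job.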

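using System;
using System.Collections.Generic;
using System.Text;
using Microsoft.ComputeCluster;

namespace SerialJobSub
{
    class Program
    {
        static int Main(string[] args)
        {
            Cluster cluster = new Cluster();
            try
            {
                // Step 1: connect to the head node of the cluster.
                cluster.Connect("myheadnode");

                // Step 2: create a job.
                IJob job = cluster.CreateJob();

                // Step 3: create a task, as described in the text above,
                // and set its command line and standard output file.
                ITask task = cluster.CreateTask();
                task.CommandLine = @"c:\myprog.exe arg1 arg2";
                task.Stdout = @"c:\pi.out";

                // Step 4: add the task to the job.
                job.AddTask(task);

                // Add the job to the cluster to obtain its job ID.
                int jobId = cluster.AddJob(job);

                // Step 5: submit the job for execution under the given user account.
                cluster.SubmitJob(jobId, @"mydomain\myuserid", null, true, 0);
                Console.WriteLine("Job " + jobId + " submitted successfully");
                return 0;
            }
            catch (Exception ex)
            {
                // Report the failure and return a nonzero exit code.
                Console.WriteLine(ex.Message);
                return 1;
            }
        }
    }
}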

The Job Scheduler also has a COM interface. For details on how to use the COM API, see the Microsoft Compute Cluster Pack on the Microsoft Web site (https://go.microsoft.com/fwlink/?LinkID=55873).

Allocation

Resource allocation—together with job ordering within the queue—is controlled through scheduling policies. Windows Compute Cluster Server 2003 supports four policies, with each focusing on a specific scheduling issue:

Priority-based first come, first served (FCFS). This scheduling policy combines FCFS ordering with job priorities. Jobs are placed in higher or lower priority groups when scheduled, based on the priority setting of the job itself, but when a job is placed within a group, it is always placed at the end of that group.

Backfilling. Backfilling maximizes node utilization by allowing a smaller job or jobs lower in the queue to run before a job waiting at the top of the queue, as long as the job at the top is not delayed as a result.

When a job reaches the top of the queue, a sufficient number of nodes may not be available to meet its minimum processors requirement. When this happens, the job reserves any nodes that are immediately available and waits for the job that is currently running to complete. Backfilling then utilizes the reserved idle nodes. Refer to Role of the Job Scheduler in the product documentation for more details.

Nonexclusive scheduling. By default, a job uses its allocated nodes exclusively. Nonexclusive jobs do not reserve resources exclusively for execution, so the resources that these jobs use can be shared. Shared resources can be consumed at either the job or the task level. By default, jobs run in Exclusive mode while tasks run in Shared mode.

License-aware scheduling. Windows Compute Cluster Server 2003 can manage job schedules through special license filters that verify licensing requirements for each job. If license requirements are not met, the job fails. This method helps ensure job compliance. License-aware scheduling is implemented through an admission filter.
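The backfilling rule can be made concrete with a simplified model. The sketch below is an illustration under stated assumptions, not the Job Scheduler's algorithm: it counts processors rather than nodes, treats run time limits as exact, and uses hypothetical names (RunningJob, WaitingJob, backfill_candidates). A run time of None stands for the Infinite default.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunningJob:
    processors: int
    remaining_minutes: int


@dataclass
class WaitingJob:
    name: str
    min_processors: int
    run_time_minutes: Optional[int]  # None models an Infinite run time


def minutes_until_top_job_starts(total: int, running: List[RunningJob], top: WaitingJob) -> int:
    """Time until enough processors are free for the job at the top of the queue."""
    free = total - sum(job.processors for job in running)
    if free >= top.min_processors:
        return 0
    for job in sorted(running, key=lambda j: j.remaining_minutes):
        free += job.processors
        if free >= top.min_processors:
            return job.remaining_minutes
    raise ValueError("the top job needs more processors than the cluster has")


def backfill_candidates(total: int, running: List[RunningJob], queue: List[WaitingJob]) -> List[str]:
    """Return the names of queued jobs that can run now without delaying the top job."""
    top, rest = queue[0], queue[1:]
    idle = total - sum(job.processors for job in running)
    window = minutes_until_top_job_starts(total, running, top)
    chosen = []
    for job in rest:
        fits_processors = job.min_processors <= idle
        # A job with no run time limit can never be proven to finish in time.
        fits_window = job.run_time_minutes is not None and job.run_time_minutes <= window
        if fits_processors and fits_window:
            chosen.append(job.name)
            idle -= job.min_processors
    return chosen
```

In this model a job with an Infinite run time is never backfilled, which is one reason to give every job a realistic run time estimate.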

These policies should be used appropriately when allocating resources for jobs. You can perform allocation through any one of the job scheduler interfaces, and the allocation becomes part of the job properties.

The default resource allocation strategy processes the following terms (see Table 1 for descriptions):

  • numprocessors

  • askednodes

  • nonexclusive

  • exclusive

  • requirednodes

The scheduler sorts the candidate nodes using the “fastest nodes with the largest memory first” criterion—that is, it sorts the nodes according to memory size first, and then sorts the nodes by their speed. This behavior is the default for all the resource allocation strategies. Next, the scheduler allocates the CPUs from the sorted nodes to satisfy the minimum and maximum processor requirements for the job. The scheduler attempts to satisfy the maximum number of CPUs for the job before considering another job.

If the askednodes term is specified, the scheduler does not sort the nodes from the complete node list but from a sublist of requested nodes.

If the task term requirednodes is used, other directives are overridden and all the required nodes are reserved by the job.
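The default allocation strategy described above can be sketched as follows. This is a simplified model, not the scheduler's code: the Node class, its fields, and the allocate function are hypothetical, and the sketch covers only the memory-then-speed ordering, the minimum and maximum processor counts, and the askednodes sublist.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class Node:
    name: str
    memory_mb: int
    speed_mhz: int
    free_cpus: int


def allocate(
    nodes: List[Node],
    min_processors: int,
    max_processors: int,
    asked_nodes: Optional[Iterable[str]] = None,
) -> Optional[Dict[str, int]]:
    """Return {node_name: cpus} for a job, or None if the minimum cannot be met."""
    asked = None if asked_nodes is None else set(asked_nodes)
    # With askednodes, only the requested sublist of nodes is considered.
    candidates = [n for n in nodes if asked is None or n.name in asked]
    # Largest memory first, then fastest.
    candidates.sort(key=lambda n: (n.memory_mb, n.speed_mhz), reverse=True)

    allocation: Dict[str, int] = {}
    remaining = max_processors
    for node in candidates:
        if remaining == 0:
            break
        take = min(node.free_cpus, remaining)
        if take > 0:
            allocation[node.name] = take
            remaining -= take

    granted = max_processors - remaining
    # The job can start only if at least the minimum was satisfied.
    return allocation if granted >= min_processors else None
```

Setting the ideal processor count as the maximum and the lowest acceptable count as the minimum gives the allocator room to start the job as soon as the minimum is available.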

You will get the maximum benefit from the cluster by properly configuring your jobs. Such configuration requires a balance between the allocation of resources and the requirements of each job. In particular, administrators must look to the optimal use of backfill windows to achieve the greatest result with their cluster implementations. To facilitate the optimal use of resources when working with Windows Compute Cluster Server 2003, you should follow these guidelines when creating, submitting, and running jobs:

  • Do not use the default Infinite run time when submitting jobs. Instead, define a run time that is an estimate of how long the job should actually take.

  • Reserve specific nodes only if the job requires the special resources available on those nodes.

  • When specifying the ideal number of processors for a job, set the number as a maximum, and then set the lowest acceptable number as the minimum. If the ideal number is specified as a minimum, the job may have to wait longer for the processors to be freed so it can run.

  • Reserve the appropriate number of nodes for each task to run. If two nodes are reserved and the task requires only one, the task has to wait until the two reserved nodes are free before it can run.

These guidelines will help you make the greatest use of the resources available in any Windows Compute Cluster Server 2003 cluster.

Activation

Activation consists of actually starting jobs. Jobs run in the security context of the submitting user, which limits the possibility of runaway processes. Jobs can also be automatically requeued upon failure. Jobs are managed through their state transitions, which are illustrated in Figure 10.

Job-task state transitions

Figure 10. Job state transition

Like jobs, tasks and nodes also have life cycles represented by status flags displayed in the Windows Compute Cluster Server 2003 GUIs. The status flags for jobs and tasks are identical, while nodes have unique status flags. Job and task status flags are listed in Table 6. Node status flags are listed in Table 7.

Table 6. Windows Compute Cluster Server 2003 Job and Task Status Flags

Job or Task Flag  Description
Not_Submitted     Job or task has been created but not submitted and has no ID.
Queued            Job or task is submitted and is awaiting activation.
Running           Job or task is running.
Finished          Job or task has finished successfully.
Failed            Job or task has failed to finish successfully.
Cancelled         The user cancelled the job or task.
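The status flags in Table 6 can be modeled as a small state machine. The flag names below come from the table, but the set of allowed transitions is an assumption made for illustration, because Figure 10 is not reproduced in this text; the names ALLOWED_TRANSITIONS and advance are hypothetical.

```python
# Assumed transitions between the job and task status flags of Table 6.
ALLOWED_TRANSITIONS = {
    "Not_Submitted": {"Queued"},
    "Queued": {"Running", "Cancelled"},
    "Running": {"Finished", "Failed", "Cancelled"},
    "Failed": {"Queued"},      # assumed: a failed job can be requeued
    "Cancelled": {"Queued"},   # assumed: a cancelled job can be requeued
    "Finished": set(),         # assumed: terminal state
}


def advance(current: str, new: str) -> str:
    """Return the new state if the transition is allowed, else raise ValueError."""
    if current not in ALLOWED_TRANSITIONS:
        raise ValueError(f"unknown state: {current}")
    if new not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new


if __name__ == "__main__":
    state = "Not_Submitted"
    for target in ["Queued", "Running", "Failed", "Queued", "Running", "Finished"]:
        state = advance(state, target)
        print(state)
```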

Table 7. Windows Compute Cluster Server 2003 Node Status Flags

Node Flag    Description
Pending      A node has been added to the cluster, but the administrator has not yet approved it.
Ready        The node is approved and available.
Paused       The administrator paused the node. When nodes are in a Paused state, jobs that began running before the node was paused continue to run. Users cannot submit new jobs to paused nodes, but administrators can.
Unreachable  The head node cannot contact the compute node.

Controlling jobs through filters

Windows Compute Cluster Server 2003 includes several features for controlling job submission and activation. Administrators can control jobs through filters that check each job against a set of conditions before the job is accepted or before it actually begins. Two types of filters can be implemented:

  • Submission filters

  • Activation filters

Administrators can use these filter sets together to verify that jobs meet specific requirements before they pass through the system. Examples of the conditions for these filters include:

Project validation. This condition verifies that the project name is that of a valid project and that the user is a member of the project.

Mandatory policy. This condition ensures that run times are not set to Infinite, which could negatively affect the performance of the cluster, and that the job does not exceed the user’s resource allocation limit.

Usage time. Usage time ensures that the user’s time allocations are not exceeded. Unlike the mandatory policy, this filter limits jobs to the overall time allocations users have for all possible jobs.

In addition, activation filters can ensure that jobs meet licensing conditions before they run. Filters are a powerful job-control feature that should be part of every Windows Compute Cluster Server 2003 implementation. The Job Scheduler invokes the filters by parsing the job file, which contains all the job properties. The exit code of the filter program tells the Job Scheduler what to do. Three types of values are possible:

  • 0: Submit the job without changing its terms.

  • 1: Submit the job with changed terms.

  • Any other value: Reject the job.
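The exit code convention can be expressed in a few lines. In the sketch below, the meaning of the codes 0, 1, and other values follows the list above; everything else is hypothetical, including the FilterDecision names, the dictionary representation of job terms, and the example mandatory_policy_filter, which stands in for a real filter program that would read the job file.

```python
from enum import Enum


class FilterDecision(Enum):
    SUBMIT_UNCHANGED = "submit the job without changing its terms"
    SUBMIT_CHANGED = "submit the job with changed terms"
    REJECT = "reject the job"


def interpret_exit_code(code: int) -> FilterDecision:
    """Map a filter program's exit code to the action the scheduler takes."""
    if code == 0:
        return FilterDecision.SUBMIT_UNCHANGED
    if code == 1:
        return FilterDecision.SUBMIT_CHANGED
    return FilterDecision.REJECT


def mandatory_policy_filter(job_terms: dict, max_processors: int) -> int:
    """Hypothetical filter: reject Infinite run times and oversized requests.

    job_terms is an assumed, simplified stand-in for the parsed job file.
    Returns the exit code that the filter program would report.
    """
    if job_terms.get("runtime") == "Infinite":
        return 2  # any value other than 0 or 1 rejects the job
    if job_terms.get("numprocessors", 0) > max_processors:
        return 2
    return 0


if __name__ == "__main__":
    code = mandatory_policy_filter({"runtime": "Infinite", "numprocessors": 4}, 16)
    print(code, interpret_exit_code(code).value)
```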

Security considerations for jobs and tasks

Windows Compute Cluster Server 2003 uses standard Windows mechanisms to manage security within the cluster context. Jobs are executed with the submitting user’s credentials. These credentials are stored in encrypted format on the local computer by the Job Manager, and only the head node has access to the decryption key.

The first time you submit a job, the system prompts for credentials in the form of a user name and password. At this point, the credentials can be stored in the credential cache of the submission computer. During transmission, credentials are encrypted using the Microsoft .NET Remoting channel; upon receipt at the scheduler, credentials are encrypted for storage with the Windows Data Protection application programming interface (DPAPI) and stored in the job database. When a job runs, the head node decrypts the credentials and uses another .NET Remoting channel to pass them along to compute nodes, where they are used to create a token, and then erased. All tasks are performed using this token, which does not include the explicit credentials. When jobs are complete, the head node deletes the credentials from the job database.

When the same user submits the same job for execution again, no credentials are requested because they are already cached on the local computer that is running the Job Manager. This feature simplifies resubmission and provides 256-bit AES (Rijndael) credential encryption throughout the job life cycle (see Figure 11).

Cluster security

Figure 11. The end-to-end Windows Compute Cluster Server 2003 security model

Conclusion

Windows Compute Cluster Server 2003 makes HPC much more affordable. The system includes powerful job-control features that users access through the familiar Windows graphical environment, including flexible admission controls, simple and effective scheduling policies, and a reliable execution mechanism that incorporates standard Windows security features. The result is a workload management system that delivers resources to end users in a way that best supports the organization’s business and productivity goals.

About the authors

Danielle Ruest and Nelson Ruest (MCSE, MCT, MVP Windows Server) are IT professionals specializing in systems administration, migration, and design. They are the authors of multiple books, most notably Windows Server 2003: Best Practices for Enterprise Deployments (McGraw-Hill/Osborne Media, 2003), Windows Server 2003 Pocket Administrator (McGraw-Hill/Osborne Media, 2003), and Preparing for .NET Enterprise Technologies (Pearson Education, 2001). Both authors work for Resolutions Enterprises Ltd. (https://go.microsoft.com/fwlink/?LinkId=66871).

References

Windows Server High Performance Computing (https://go.microsoft.com/fwlink/?LinkId=55599)

Windows Server x64 Editions (https://go.microsoft.com/fwlink/?LinkId=45559)

Overview of Microsoft Windows Compute Cluster Server 2003 (https://go.microsoft.com/fwlink/?LinkId=56090)

Deploying and Managing Microsoft Windows Compute Cluster Server 2003 (https://go.microsoft.com/fwlink/?LinkId=55927)

Using Microsoft Message Passing Interface (https://go.microsoft.com/fwlink/?LinkId=55930)

Migrating Parallel Applications (https://go.microsoft.com/fwlink/?LinkId=55931)

Debugging Parallel Applications with Visual Studio 2005 (https://go.microsoft.com/fwlink/?LinkId=55932)

For the latest information about Windows Compute Cluster Server 2003, see the Microsoft High-Performance Computing Web site (https://go.microsoft.com/fwlink/?LinkId=55599).