Understanding Job Types

Applies To: Windows HPC Server 2008

The most common types of parallel computing jobs that you can run in Windows HPC Server 2008 are MPI jobs, parametric sweep jobs, and task flow jobs. The following sections describe these three job types:

  • MPI job

  • Parametric sweep job

  • Task flow job

Note

The three types of jobs are not mutually exclusive. A job can contain many tasks, some of which are parametric, some serial, and some parallel. For example, you can create a task flow job consisting of MPI and parametric tasks.

MPI job

MS-MPI, the Microsoft implementation of the Message Passing Interface (MPI) for Windows, allows MPI applications to run as tasks on an HPC cluster. For a task that runs an MPI application, the task command must be preceded by mpiexec (for example, mpiexec MyApp.exe, where MyApp.exe is a placeholder application name).

An MPI task is intrinsically parallel. A parallel task can take a number of forms, depending on the application and the software that supports it. For an MPI application, a parallel task usually consists of a single executable that is running concurrently on multiple cores, with communication occurring between the processes.

The following diagram illustrates a parallel task:

[Diagram: three processes with two-way communication between them]
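The communication pattern in the diagram can be sketched in a few lines of Python. This is only an illustration of the idea and not MS-MPI code: it assumes threads and in-memory queues as stand-ins for MPI processes and messages, and the function name run_parallel_task is hypothetical. Every copy of the same worker sends one message to each peer and then receives one message from each peer.

```python
import queue
import threading


def run_parallel_task(num_ranks=3):
    """Run num_ranks copies of one worker; every pair exchanges a message.

    Threads and queues stand in for MPI processes and messages here.
    Returns, for each rank, the sorted list of ranks it heard from.
    """
    # One inbox per rank; sending means putting a message in the peer's inbox.
    inboxes = [queue.Queue() for _ in range(num_ranks)]
    received = [None] * num_ranks

    def worker(rank):
        # Send one message to every other rank.
        for peer in range(num_ranks):
            if peer != rank:
                inboxes[peer].put((rank, f"hello from rank {rank}"))
        # Receive exactly one message from every other rank.
        senders = []
        for _ in range(num_ranks - 1):
            sender, _text = inboxes[rank].get(timeout=5)
            senders.append(sender)
        received[rank] = sorted(senders)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_ranks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received


if __name__ == "__main__":
    for rank, senders in enumerate(run_parallel_task(3)):
        print(f"rank {rank} received messages from ranks {senders}")
```

In a real MPI application, each rank would be a separate process started by mpiexec, and the exchange would go through MPI send and receive calls rather than shared queues.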

Note

For parallel tasks, Windows HPC Server 2008 includes an MPI package based on Argonne National Laboratory's MPICH2 implementation of the MPI standard. Microsoft’s implementation of MPI, called MS-MPI, includes the launcher mpiexec, an MPI service for each node, and a Software Development Kit (SDK) for user application development. Windows HPC Server 2008 also supports applications that provide their own parallel processing mechanisms. For more information about the SDK, see Microsoft HPC Pack (https://go.microsoft.com/fwlink/?linkid=123849).

For information about how to create an MPI task, see Add an MPI Task.

Parametric sweep job

A parametric sweep job consists of multiple instances of the same application, usually a serial application, each reading its input from an input file and writing its output to an output file. The input and output are usually a set of indexed files (for example, input1, input2, input3…, output1, output2, output3…) that reside in a single common folder or in separate common folders. There is no communication or interdependency among the tasks, so they can run concurrently. Whether they actually run in parallel depends on the resources that are available on the cluster when the job is running.

The following diagram illustrates a parametric sweep job:

[Diagram: four serial tasks, each with separate input and output files]
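The way a sweep expands into independent task instances can be sketched as follows. This is a minimal Python illustration, not the scheduler's implementation: the function name expand_sweep, the {i} placeholder syntax, and the application name MyApp.exe are all assumptions made for the example. Only the indexed file names (input1, output1, and so on) come from the description above.

```python
def expand_sweep(start, end, step=1, command="MyApp.exe",
                 input_pattern="input{i}", output_pattern="output{i}"):
    """Return one task description per sweep index, from start to end inclusive.

    The {i} placeholder in each pattern is replaced by the sweep index.
    All names here are hypothetical and chosen for illustration only.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    tasks = []
    for i in range(start, end + 1, step):
        tasks.append({
            "index": i,
            "command": command,
            "input": input_pattern.format(i=i),
            "output": output_pattern.format(i=i),
        })
    return tasks


if __name__ == "__main__":
    # Four instances of the same application, each with its own files.
    for task in expand_sweep(1, 4):
        print(f'{task["command"]} < {task["input"]} > {task["output"]}')
```

Because no instance reads another instance's output, the resulting tasks can be started in any order and on any available core.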

For information about how to create a parametric sweep task, see Add a Parametric Task.

Task flow job

In a task flow job, a set of unlike tasks is run in a prescribed order, usually because one task depends on the result of another task. You can establish the order in which tasks are run by defining dependencies between the tasks.

The following diagram illustrates a task flow job:

[Diagram: task flow job with dependent tasks]

Task 1 runs first. Tasks 2 and 3 both depend on Task 1, and they are the only tasks that can run in parallel, because neither depends on the other. Task 4 runs after Tasks 2 and 3 have both completed.
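The ordering described above can be derived mechanically from the dependencies. The following Python sketch groups tasks into successive waves, where every task in a wave has all of its dependencies completed. It is an illustration only and does not reflect how the job scheduler is implemented; the function name schedule_waves and the dictionary layout are assumptions made for the example.

```python
def schedule_waves(dependencies):
    """Group tasks into waves that can run one after another.

    dependencies maps each task name to the tasks it depends on.
    Tasks within the same wave do not depend on each other, so they
    may run in parallel if resources allow.
    """
    remaining = {task: set(deps) for task, deps in dependencies.items()}
    done = set()
    waves = []
    while remaining:
        # A task is ready when everything it depends on has completed.
        ready = sorted(t for t, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError(
                "unsatisfiable dependencies among: " + ", ".join(sorted(remaining))
            )
        waves.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return waves


# The task flow from the diagram description above.
flow = {
    "Task 1": [],
    "Task 2": ["Task 1"],
    "Task 3": ["Task 1"],
    "Task 4": ["Task 2", "Task 3"],
}

if __name__ == "__main__":
    for number, wave in enumerate(schedule_waves(flow), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

For this flow the sketch yields three waves: Task 1 alone, then Tasks 2 and 3 together, then Task 4. A dependency cycle, or a dependency on a task that is not defined, raises an error instead of looping forever.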

For information about how to define dependencies between the tasks, see Define Task Dependencies.

Additional references