Create a Parametric Sweep

Applies To: Windows Compute Cluster Server 2003

A parametric sweep is a parallel computing job that consists of running multiple iterations of the same command using different input and output files. The input and output files typically consist of the same stem name with incrementing extension numbers. A parametric sweep can be continuous, running all input files, or discontinuous, running every other file, every third file, or whatever sampling interval you choose. Windows Compute Cluster Server 2003 supports parametric sweeps at two levels:

  • Using the Job Manager index function

  • Using batch and script files

Using the Job Manager index function

On the Job Properties sheet, the Add Parametric Sweep button on the Tasks tab allows you to specify multiple iterations of a command with different input, output, and error files. The iterations and their associated files are automatically generated and indexed. For example, when you enter myapp.exe as the command line, and then enter my_input* and my_output* as the input and output files respectively, you create the equivalent of the following tasks:

myapp.exe <my_input1 >my_output1

myapp.exe <my_input2 >my_output2

myapp.exe <my_input3 >my_output3

myapp.exe <my_input4 >my_output4

...

myapp.exe <my_input50 >my_output50

Each of the 50 instances of the myapp.exe command line will become the command line of a separate task. All tasks will share the same task name. Having multiple tasks with the same task name is not only allowed, but is useful in specifying dependencies. Multiple tasks of the same name are distinguished by unique task ID numbers.

The results shown above reflect the default Index End value of 50. Setting Index End to 11 will produce:

myapp.exe <my_input1 >my_output1

myapp.exe <my_input2 >my_output2

myapp.exe <my_input3 >my_output3

myapp.exe <my_input4 >my_output4

...

myapp.exe <my_input11 >my_output11

Similarly, my_input1 and my_output1 reflect a default Index Start value of 1. By changing the Index Start and Index End values, you can increase, decrease, or shift the range that you are creating. By setting Index Skip to a non-zero value, you can change the numbering from a continuous series like 1,2,3 to a discontinuous series like 1,3,5 or 1,4,7. Finally, you can use Extension Length to pad the index number in each input, output, or error file name with one or more leading zeros. An extension length of 3 produces:

myapp.exe <my_input001 >my_output001

If the application takes a file as an argument, the command line itself can be indexed. For example:

myapp.exe -infile file*

This produces a series like the following:

myapp.exe -infile file1

myapp.exe -infile file2

myapp.exe -infile file3

myapp.exe -infile file4
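The index expansion described in this section can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not a description of how Job Manager is implemented. The function name expand_sweep and its parameter names are hypothetical, and the sketch assumes that Index Skip advances the index by the skip value plus 1 and that Extension Length zero-pads the index to that many digits.

```python
def expand_sweep(template, index_start=1, index_end=50,
                 index_skip=0, extension_length=0):
    """Replace each '*' in template with successive index values.

    Hypothetical helper: it mirrors the Index Start, Index End, Index Skip,
    and Extension Length fields, but it is not part of any product API.
    """
    step = index_skip + 1  # assumption: a skip of 0 means every index is used
    return [
        # assumption: Extension Length pads the index with leading zeros
        template.replace("*", str(i).zfill(extension_length))
        for i in range(index_start, index_end + 1, step)
    ]


# Standard input and output redirection, indexes 1 through 3.
for line in expand_sweep("myapp.exe <my_input* >my_output*", index_end=3):
    print(line)

# An indexed command-line argument, every third file from 1 through 7.
print(expand_sweep("myapp.exe -infile file*", index_end=7, index_skip=2))
```

Calling expand_sweep with only a template uses the defaults described above (Index Start 1, Index End 50, no skip, no padding), so it returns 50 strings.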

To use the Job Manager index function

  1. Navigate to the Tasks tab of the Job Properties sheet.

  2. Click Add Parametric Sweep….

  3. On the Add Parametric Sweep dialog, enter the task name. This name is assigned to each iteration of the task. Each task in the parametric sweep will appear in the Task Summary pane with the same name.

  4. Enter the command line exactly as you would enter it at the command prompt. If the command line includes arguments to be distinguished by an incrementing number from one iteration to the next, represent the number as an asterisk.

  5. Enter Standard Input, Standard Output, and Standard Error file names as required, using an asterisk to represent a file number that will increment from one iteration to the next. For example: infile*.

  6. Check Assign Work Directory if different from the default (%USERPROFILE%) and enter the work directory.

  7. Select an Index Start number. Default is 1.

  8. Select an Index End number. The command line and associated files will be replicated for each number from Index Start through Index End, producing a list of tasks in the Preview Task pane.

  9. If you want to run only a sampling of the tasks, specify a sampling interval using an Index Skip number. The command list will be reduced to the tasks that will actually run.

    Note

    The Index Skip value is the sampling interval minus 1. If you want every other command line executed, the sampling interval is 2, but the skip number is 1. The first command line, which uses the Index Start number, is never skipped.
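    The relationship in the note above can be checked with a short Python sketch. The function names are hypothetical, and the task count assumes that the sweep always starts at Index Start and never runs past Index End.

```python
def index_skip_for(sampling_interval):
    """Convert a sampling interval (run every Nth command) to an Index Skip value."""
    if sampling_interval < 1:
        raise ValueError("sampling interval must be at least 1")
    return sampling_interval - 1


def task_count(index_start, index_end, index_skip):
    """Number of tasks left after skipping (assumed arithmetic, see lead-in)."""
    if index_end < index_start:
        return 0
    # The first index always runs; each later task is (skip + 1) further on.
    return (index_end - index_start) // (index_skip + 1) + 1


skip = index_skip_for(2)        # every other command line
print(skip)                     # the Index Skip value to enter
print(task_count(1, 50, skip))  # tasks remaining in the default 1 to 50 range
```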

Using batch and script files

Batch and script files are a common method of performing parametric sweeps. Windows Compute Cluster Server 2003 supports this method by accepting script and batch file names either to create and submit entire jobs or as task command lines. See Use Batch and Script Files with Compute Cluster Jobs.
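As a rough illustration of the batch-file approach, the following Python sketch generates the lines of a batch file that runs a block of sweep iterations one after another. Everything in it is an assumption made for illustration: the function names, the stem names, and the idea of using the resulting file as a single task command line. It does not show job submission; see the linked topic for the supported procedure.

```python
def build_batch_lines(command, in_stem, out_stem, first, last):
    """Return batch-file lines that run command once per index, in order."""
    lines = ["@echo off"]
    for i in range(first, last + 1):
        # Same redirection pattern as the Job Manager examples in this topic.
        lines.append(f"{command} <{in_stem}{i} >{out_stem}{i}")
    return lines


def write_batch_file(path, lines):
    """Write the lines with Windows (CRLF) line endings."""
    with open(path, "w", newline="\r\n") as handle:
        for line in lines:
            handle.write(line + "\n")


# Preview the batch file contents for iterations 1 through 4.
for line in build_batch_lines("myapp.exe", "my_input", "my_output", 1, 4):
    print(line)
```

To save the preview, pass the same list to write_batch_file together with a file name of your choice, for example a hypothetical sweep.bat.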