Running Array Jobs on barkla



1. Introduction
2. Simplified Job Submission Files
3. Job Submission Attributes for all Types of Jobs
4. Command Line Options
5. MATLAB Applications
6. R Applications
7. Generic Applications

1. Introduction

Although the barkla cluster is primarily intended for use with parallel (e.g. MPI) applications it can also be used for submitting batches of multiple serial jobs which differ only in their input files - these are termed array jobs. Array jobs are useful in applications such as parameter space explorations, Monte Carlo analysis and statistical modelling where the same processing is applied to different input data (in the case of Monte Carlo methods, individual jobs may only differ in their random number generator seeds).

In each case the user will need to store the input data for each job in a different input file in such a way that each job can select its correct input file. The most convenient way of doing this is to number the input files, for example:

input0
input1
input2
...
input<N-1>

where there are N jobs. We'll call these the indexed input files here and the integers [0..N-1] are the index values. Although UNIX does not support file extensions in the same sense as Windows, the index value can also be inserted between the file's "basename" and "extension" e.g.

input0.txt
input1.txt
input2.txt
...
input<N-1>.txt

Corresponding to the indexed input files, there will also be a collection of numbered output files produced by the jobs which we will call indexed output files e.g.

output0
output1
output2
...
output<N-1>
or
output0.txt
output1.txt
output2.txt
...
output<N-1>.txt

where the jth output file corresponds to the jth input file, viz: inputj -> outputj, inputj.txt -> outputj.txt. In some applications there may also be input data which is common to all jobs; this can be stored in common input files.
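
If the per-task inputs are simple, the indexed input files can be generated with a short shell loop. For example, this illustrative sketch (not part of the array job tools) writes a different random-number seed into each of 10 indexed input files:

```shell
# Create input0.txt .. input9.txt, each containing a different seed value.
for i in $(seq 0 9); do
    echo "seed = $((1000 + i))" > "input${i}.txt"
done
```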

Strictly speaking, each individual job in an array job is referred to as a job task in the SLURM scheduler, and the term job is used for the entire array. This is an example of an array job consisting of 10 tasks which is waiting to run on barkla:

$ squeue -u smithic
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
        3326_[0-9]     nodes r_app.sh  smithic PD       0:00      1 (Priority)

The job-ID in this case is 3326 and the job tasks are numbered with indices [0-9]. Once the individual tasks start to run you can see the individual task-IDs e.g.:

$ squeue -u smithic
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
            3326_0    cooper r_app.sh  smithic  R       0:01      1 node003
            3326_1    cooper r_app.sh  smithic  R       0:01      1 node003
            3326_2    cooper r_app.sh  smithic  R       0:01      1 node003
            3326_3    cooper r_app.sh  smithic  R       0:01      1 node003
            3326_4    cooper r_app.sh  smithic  R       0:01      1 node003
            3326_5    cooper r_app.sh  smithic  R       0:01      1 node003
            3326_6    cooper r_app.sh  smithic  R       0:01      1 node003
            3326_7    cooper r_app.sh  smithic  R       0:01      1 node003
            3326_8    cooper r_app.sh  smithic  R       0:01      1 node003
            3326_9    cooper r_app.sh  smithic  R       0:01      1 node003

Users can prepare array jobs themselves and submit them using the standard SLURM commands; however, a number of tools have been developed to make the process easier. These tools ensure that all of the input and output file "indexing" is done "behind the scenes" so that users do not have to change their executable code/scripts.
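
Under the hood, SLURM identifies each task by the SLURM_ARRAY_TASK_ID environment variable, and a hand-written array job script would use it to select the indexed files itself. A simplified sketch along these lines (not the exact script the tools generate) is:

```shell
#!/bin/bash
#SBATCH --array=0-9            # 10 job tasks with indices 0..9

# SLURM sets SLURM_ARRAY_TASK_ID to this task's index
# (default to 0 here so the sketch also runs outside SLURM).
IDX=${SLURM_ARRAY_TASK_ID:-0}

# Demo input file -- in a real job the indexed inputs exist already.
echo "hello" > "input${IDX}.txt"

# Each task reads its own indexed input file and writes its own indexed
# output file; 'tr' stands in for a real application here.
tr 'a-z' 'A-Z' < "input${IDX}.txt" > "output${IDX}.txt"
```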

To make this a bit clearer, imagine that your code loads data from input.txt, processes it and writes the output to results.txt e.g.

input_data = load(input.txt)
....
....  process input_data to give output_data
....
save(results.txt, output_data)

With the job submission tools, you can keep exactly the same code and just specify the indexed input files as input.txt and the indexed output files as results.txt. Obviously you will need to create N indexed input files numbered [0..N-1] but everything else is done for you. The tools can be used to submit MATLAB array jobs, array jobs for R applications and other "generic" applications where you have your own or third party executables.

2. Simplified Job Submission Files

Key to the array job tools are what we will call job submission files, which contain information on input and output files as well as things like job run times, memory requirements etc. Each file contains a number of attribute=value pairs, one pair per line. Blank spaces and blank lines are ignored and comments can be added by prefixing the text with a hash '#' character e.g.

# a comment - blanks are OK anywhere
input_files  =     common1,subfunc.m
indexed_input_files = iia.txt,iib
indexed_output_files = outa.txt,outb.txt
# blank lines are OK



cores_per_job = 2
runtime = 1h    # one hour - end of line comment
memory_gb = 8

M_file = hello.m
total_jobs =   2 
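
To make the format concrete, here is a minimal Python sketch of how such attribute=value files could be parsed. This is only an illustration of the format, not the actual tools' parser:

```python
# A minimal sketch of parsing a job submission file: attribute = value
# pairs, '#' comments (including end-of-line comments), blank lines ignored.

def parse_submission_file(text: str) -> dict:
    attrs = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue                           # skip blank lines
        key, _, value = line.partition("=")
        attrs[key.strip()] = value.strip()
    return attrs

example = """
# a comment
runtime = 1h    # end of line comment
total_jobs = 2
"""
print(parse_submission_file(example))   # → {'runtime': '1h', 'total_jobs': '2'}
```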

Some attributes are specific to certain applications (for example the M_file attribute used here is for MATLAB) but the following are common to all applications.

3. Job Submission Attributes for all Types of Jobs

indexed_input_files

A single filename or list of filenames for input files that are different for each job task. For example, for a single batch of input files you would use something like:

indexed_input_files = input.txt

and this will refer to N input files input0.txt, input1.txt ... input<N-1>.txt. Multiple filenames should be separated by commas e.g.

indexed_input_files = input_a.txt,input_b

This would correspond to two sets of input files: input_a0.txt, input_a1.txt ... input_a<N-1>.txt and input_b0, input_b1 ... input_b<N-1>. Your code can just read the input files without modification e.g.

input_a_data = load(input_a.txt)
input_b_data = load(input_b)

indexed_output_files

A single filename or list of filenames for output files that will be different for each job task. For example, for a single batch of output files you would use something like:

indexed_output_files = output.txt

and this will refer to N output files output0.txt, output1.txt ... output<N-1>.txt. Multiple filenames should be separated by commas e.g.

indexed_output_files = output_a.txt,output_b

This would correspond to two sets of output files: output_a0.txt, output_a1.txt ... output_a<N-1>.txt and output_b0, output_b1 ... output_b<N-1>. Your code can just write the output files without modification e.g.

save(output_a.txt, output_a_data)
save(output_b, output_b_data)

common_input_files

A single filename or list of filenames for input files that are the same for each job task. Multiple filenames should be separated by commas e.g.

common_input_files = common_data_a.txt, common_data_b.txt

cores_per_job

The number of cores allocated to each job task (default 1). Using this option can be useful if your application uses multi-threading to speed up execution. In this case, set cores_per_job to be the same as the (maximum) number of threads your application employs. If the number of threads exceeds the value of cores_per_job, the performance of your jobs may suffer, as may those of other users who happen to be running on the same node. On barkla each of the "ordinary" compute nodes has 40 cores.

memory_gb

The amount of memory in GB allocated to each job task. If your application is particularly memory hungry then it is important to set this value to the maximum amount of memory used by your code as SLURM will terminate any jobs that exceed the default memory limits (~ 9.6 GB/core => ~ 380 GB per node on the ordinary compute nodes).

runtime

Maximum time the job tasks will run for, expressed in hours (e.g. 48h) or days (e.g. 2d). Although this attribute is optional, it should be used for relatively short jobs as SLURM will prioritise these over longer running jobs (other things being equal), so your jobs will spend less time queueing. However this will only happen if you specify the runtime explicitly. Note that time limits are enforced and jobs will be terminated if they exceed them, so it is best to err on the side of caution (or at least start with a long runtime and work downwards).

stdout

Specifies a file where the combined standard output and standard error for all job tasks are directed. Note that SLURM, unlike some other schedulers, merges the standard output and error streams together by default.

indexed_stdout

Specifies "indexed" output files for standard output and error similar to indexed_output_files. This allows the merged standard output/error for each individual job task to be written to a separate file.

total_jobs

Total number of job tasks to be run (must match the correct number of indexed input files).
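
A quick way to check that total_jobs matches the number of indexed input files is to count them in the shell. For example, assuming files named input<j>.txt (the touch lines below just create demo files for illustration):

```shell
# Create three demo indexed input files in an empty directory, then count
# them -- the count should equal the total_jobs attribute.
touch input0.txt input1.txt input2.txt
ls input[0-9]*.txt | wc -l
```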

scratch

Can be used to specify the temporary storage area where the job tasks will run. This may well speed up execution as local storage is considerably faster than the home filestore where job files are usually stored long term. The following storage areas can be specified:

name            location           size      speed
localscratch    ~/localscratch     750 GB    fastest
sharedscratch   ~/sharedscratch    347 TB    fast
volatile        ~/volatile         100 TB    slowest

The default is localscratch and the value none can also be used to indicate that jobs should be run on the same filesystem that they were submitted from (not usually a good idea if this is your home filestore).

4. Command Line Options

Although there are slight differences between the job submission tools for different applications, they all have the same command line format, namely:

$ command_name [options] job_submission_file

This will create a job script file which is the one actually passed to the SLURM scheduler. It may be worth taking a look inside this and possibly using it as a template for your own applications. The job script filename will be the same as the job submission file with a .sh "extension". To get a list of the options available, use the -h option:

$ command_name -h

for example:

$ array_submit -h
Usage: array_submit [options] job_submission_file

Options:
  -h, --help            show this help message and exit
  -c INTEGER, --cores_per_job=INTEGER
                        number of cores to run each job task on
  -m MEMORY, --memory_gb=MEMORY
                        amount of memory to allocate to each job task in GB
  -r TIME, --runtime=TIME
                        maximum runtime in hours (e.g. 48h) or days (e.g. 2d)
  -e FILE, --executable=FILE
                        executable to run
  -b FILE, --script=FILE
                        (bash) script to run
  -f FILE(s), --input_files=FILE(s)
                        common input file (or files as comma-separated list
                        e.g.: file1,file2,file3)
  -i FILE(s), --indexed_input_files=FILE(s)
                        indexed input file (or files as comma-separated list
                        e.g.: file1,file2,file3)
  -p FILE(s), --indexed_output_files=FILE(s)
                        indexed output file (or files as comma-separated list
                        e.g.: file1,file2,file3)
  -o FILENAME, --stdout=FILENAME
                        merged standard output and error from job tasks
  -s FILENAME, --indexed_stdout=FILENAME
                        indexed merged standard output and error from job
                        tasks
  -t INTEGER, --total_jobs=INTEGER
                        number of job tasks to run
  -a DIRECTORY, --scratch=DIRECTORY
                        scratch storage to run jobs in:
                        localscratch|sharedscratch|volatile|none (default
                        localscratch)
All [options] are...optional ! Command line options take precedence over
job submission  file attributes.

As stated above, command line options will override any attribute values set in the job submission file. This is useful for making small changes e.g. to the runtime or memory values. The job submission file is mandatory but can be blank if all the options are specified on the command line.

5. MATLAB Applications

Array jobs which make use of MATLAB scripts (M-files) can easily be submitted using the matlab_submit command. To get a complete list of command line options (or the equivalent job submission file attributes) use:

$ matlab_submit -h

The M_file option/attribute (note the underscore) is used to specify the main M-file to be run and any other M-files needed should be given as common input files using the input_files option/attribute.

An example job submission file for a MATLAB application is:

$ cat matlab_example.sub
M_file = main_script.m
indexed_input_files = input.mat
input_files = subfunc1.m, subfunc2.m
indexed_output_files = output.mat
cores_per_job = 4
runtime = 1h
total_jobs = 10

Here main_script.m contains the main MATLAB function and calls functions in subfunc1.m and subfunc2.m. The input data is read from the files input*.mat and the results written to output*.mat. Ten job tasks will be created which will each run on 4 cores with a maximum run time of 1 hour. This would be submitted using:

$ matlab_submit matlab_example.sub

It is also possible to submit jobs based on pre-built MATLAB standalone executables using the executable option/attribute. To create a MATLAB standalone executable, use the command

$ matlab_build M_file

where M_file is the main M-file to be used. This will build the executable on a compute node as a SLURM job. The executable will have the same name as the M-file minus any "extension" (e.g. .m). Where multiple M-files containing functions called by the main M-file are used, place these in a directory called dependencies below the current working directory. Pre-compiled executables may perform better than M-files for some codes; however, the author has not seen any noticeable speed up on his codes.

6. R Applications

Array jobs which use the R statistics language can be submitted using the r_submit command. To get a complete list of command line options (or the equivalent job submission file attributes) use:

$ r_submit -h

The R_script option/attribute is used to specify the main R script to be run and any other R scripts that are needed should be given as common input files using the input_files option/attribute.

An example job submission file for an R application is:

$ cat r_example.sub
R_script = main_script.R
indexed_input_files = input.RData
input_files = subfunc1.R, subfunc2.R
indexed_output_files = output.RData
memory_gb = 64
runtime = 1d
total_jobs = 20

Here main_script.R contains the main R code and calls functions in subfunc1.R and subfunc2.R. The input data is read from the files input*.RData and the results written to output*.RData. Twenty job tasks will be created which will be allocated 64 GB of memory each with a maximum run time of 1 day. This would be submitted using:

$ r_submit r_example.sub

7. Generic Applications

If you have an executable, perhaps built from your own source code, it can be used in an array job via the array_submit command. To get a complete list of command line options (or the equivalent job submission file attributes) use:

$ array_submit -h

The executable option can be used to specify which binary executable to run (which can include a pathname if necessary). Shell scripts can also be run in this way. The script option can be used to specify a script which will be included in the job script submitted to SLURM (the executable option on the other hand just runs the shell script without including it).

An example job submission file for use with your own executable code is:

$ cat array_example.sub
executable = my_application
indexed_input_files = input.txt
indexed_output_files = output.txt
cores_per_job = 4
memory_gb = 32 
runtime = 36h 
total_jobs = 5 

Here my_application contains the binary executable for your own (presumably multi-threaded) application. The input data will be read from input*.txt and the results written to output*.txt. Five job tasks will be created which will be allocated 32 GB of memory and four cores each with a maximum run time of 36 hours. This would be submitted using:

$ array_submit array_example.sub
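
As a concrete illustration, my_application could be any program that reads input.txt and writes output.txt under those fixed names; the array job tools map them onto the indexed files input<j>.txt and output<j>.txt for each task. This hypothetical toy stand-in in Python squares each number in its input:

```python
# A hypothetical stand-in for "my_application": reads numbers from input.txt
# (a fixed name) and writes their squares to output.txt. The array job tools
# transparently map these fixed names onto the indexed files input<j>.txt
# and output<j>.txt for each individual task.

def process(in_name: str = "input.txt", out_name: str = "output.txt") -> None:
    with open(in_name) as f:
        numbers = [float(line) for line in f if line.strip()]
    with open(out_name, "w") as f:
        for x in numbers:
            f.write(f"{x * x}\n")

# Demo run with a small input file (illustrative only).
with open("input.txt", "w") as f:
    f.write("2\n3\n")
process()
```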