High Performance Computing Clusters

Sun Grid Engine (SGE) Batch System



Submitting SGE jobs


Sun Grid Engine is a software tool which allows users to run programs on the cluster's compute nodes as jobs. The simplest job file can consist of just a shell script which contains the command used to run the program. For example:
#!/bin/bash

/path/to/the/folder/executable option1 option2 ...
It is usual to provide the absolute path to the folder/directory containing the executable program so that the program may be run in (and the job submitted from) different folders/directories. If this job file were saved as myscript.sh, then the job could be submitted to the cluster using this command:
$ qsub myscript.sh
Sun Grid Engine should respond in a similar manner to this example:
$ qsub cufflinks.sh
Your job 8314 ("cufflinks.sh") has been submitted
The number (here 8314) is a unique identifier for the job, called the job-ID, which can be used to track the progress of the job and, if necessary, remove it.
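The job-ID can also be given to qstat's -j option to display full details of a single job, which is often useful when trying to work out why a job is still waiting or has failed:
$ qstat -j 8314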

The qstat command is used to find out the current status of jobs in the queue. For example:

$ qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID 
-----------------------------------------------------------------------------------------------------------------
   8314 0.00000 cufflinks. ian          qw    03/20/2014 11:03:25       
The 'qw' code in the state column indicates that the job is queued and waiting for resources to become available. Once the job starts to run, the state should change to 'r' as shown below:
$ qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID 
-----------------------------------------------------------------------------------------------------------------
   8314 0.55500 cufflinks. ian          r     03/20/2014 11:10:08 multiway.q@comp02.liv.ac.uk        1        
Note that now there is some additional information in the output: namely the number of slots occupied by the job (usually equivalent to the number of cores) and the node(s) on which the job is running. The submit/start column also shows when the job started to run rather than the time at which it was submitted. When the job eventually completes, it will disappear from the queue and the qstat command will return nothing:
$ qstat
$ 
If you have multiple jobs running (or just submitted and waiting), then information pertinent to each one can be located via the job-ID e.g.
$ qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID 
-----------------------------------------------------------------------------------------------------------------
   8314 0.55500 cufflinks. ian          r     03/20/2014 11:17:37 multiway.q@comp01.liv.ac.uk        1     
   8322 0.55500 otherjob.s ian          r     03/20/2014 11:17:28 multiway.q@comp02.liv.ac.uk        1        
   8323 0.55500 ANotherjob ian          r     03/20/2014 11:17:28 multiway.q@comp03.liv.ac.uk        1        

By default, qstat may list only your own jobs. If this is the case, to see those of other users on the system, the following can be used:
$ qstat -u '*'
Note that the star character (*) must be enclosed in single quotes (' ') to prevent the shell from expanding it. You can blame Sun Microsystems for this ugly-looking syntax!
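To list the jobs of one particular user, pass that username instead of the star:
$ qstat -u some_username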

Deleting SGE jobs


It is possible to delete jobs that were submitted in error or seem not to be working properly using the qdel command, which takes the job-ID as an argument. For example:

$ qdel 8314
would delete the job with job-ID 8314. It can take some time before the job finally disappears from the queue; during this period, qstat will use the 'd' code to indicate that the job is being deleted. To delete all of your jobs, use the -u option with your username e.g.
$ qdel -u your_own_login_username
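Several jobs can also be deleted in one go by listing their job-IDs:
$ qdel 8314 8322 8323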
The qdel command will not give you a chance to change your mind and so should be used with care.

Adding SGE options to job files


So far only a very simple job file has been used. In practice, there are a few additions that will make jobs much more usable. The first thing to do is to capture any information that would normally be written to the screen by your program. This is necessary as jobs cannot interact directly with a user sitting at a PC or terminal by writing directly to the screen (this is so-called batch mode rather than interactive mode).

Although it may not be apparent to the casual user, the UNIX operating system divides output written to the screen into two streams. The normal output from a program or command is usually written to the standard output stream (sometimes abbreviated to stdout) and any errors are conventionally written to the standard error stream (sometimes abbreviated to stderr). To capture all of the output (including any error messages), two SGE commands need to be added to the job file as shown in the following example:

#!/bin/bash

#$ -o stdout
#$ -e stderr

/path/to/the/folder/executable option1 option2 ...
Here the standard output will be written to a file called stdout and any error messages (from the standard error stream) to a file called stderr. There is nothing special about these filenames so feel free to use your own. Note that all SGE commands are prefixed with the two characters: #$. It is possible to use the job file as a standard shell script and the shell will just treat the SGE commands as comments and ignore them.
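The two streams can be seen at work directly in the shell. In this quick illustration, a deliberately failing ls command writes its complaint to the standard error stream while the standard output file remains empty (the filenames out.txt and err.txt are arbitrary, and the exact wording of the error message varies between systems):
$ ls /nonexistent-file > out.txt 2> err.txt
$ cat err.txt
ls: cannot access '/nonexistent-file': No such file or directory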

By default, SGE appends the output from different jobs to the same files, which can cause confusion. To prevent this, the UNIX shell's '>' redirection operator can be used to create empty stdout and stderr files before the actual program is run, effectively erasing the previous contents. For example:

#!/bin/bash

#$ -o stdout
#$ -e stderr

> stdout
> stderr

/path/to/the/folder/executable option1 option2 ...

Two more additions to the above job file are so useful that they should be considered essential. The -cwd command causes the job to run in the same directory that it was submitted from rather than the default home directory and -V passes all of the environment variables from the shell used to submit the job. The resulting example then looks like:


#!/bin/bash

#$ -o stdout
#$ -e stderr
#$ -cwd
#$ -V

> stdout
> stderr

/path/to/the/folder/executable option1 option2 ...
This example can be used as a template for your own job scripts.
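Once saved (as myscript.sh, say), the template is submitted and monitored in the usual way, and the stdout and stderr files can be inspected when the job has finished:
$ qsub myscript.sh
$ qstat
$ cat stdout stderr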

Submitting multi-slot jobs


It is possible to submit jobs that occupy multiple SGE job slots using the -pe (parallel environment) option. This is useful for jobs that require more memory than can be allocated to a single core and for jobs that make use of multiple processing threads. For example, if each node has 8 cores (and correspondingly 8 job slots allocated to it), then a job can be made to occupy an entire node using a similar command to:

$ qsub -pe shmem 8 myscript.sh
This would cause the job to take up a whole node. The -pe option can also be specified in the job file using:
#$ -pe shmem 8
It is also possible to use a smaller number of slots. For example, this would cause a job to occupy 4 cores so that two such jobs could fit on an 8-core compute node:
$ qsub -pe shmem 4 myscript.sh
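Within the job script, SGE sets the NSLOTS environment variable to the number of slots granted, which a multi-threaded program can use to match its thread count to the allocation. A minimal sketch, assuming a hypothetical --threads option on the example program:

#!/bin/bash

#$ -o stdout
#$ -e stderr
#$ -cwd -V
#$ -pe shmem 4

> stdout
> stderr

# NSLOTS is set by SGE to the number of slots granted (4 here);
# --threads is a hypothetical option of the example program
/path/to/the/folder/executable --threads $NSLOTS option1 option2 ...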
The parallel environment (and other) options are very specific to individual systems and the full range of options for the chadwick cluster is given on another page.

Array jobs


If you need to submit a large number of similar jobs, then SGE has a feature called array jobs which can make life easier. An array job consists of a number of independent tasks, each of which executes an independent copy of the job script. The number of tasks to be run is set at job submission time using the -t (task) argument to the qsub command e.g.

$ qsub -t 1-10 myscript.sh
This command will submit an array job consisting of 10 tasks, numbered 1 to 10. SGE sets a variable called SGE_TASK_ID in the environment of each executing task, which can be used within the job script to select the correct input data for that task. For example, if we wanted to independently process input data corresponding to 22 different chromosomes, a script such as this could be created:
#!/bin/bash

#$ -o SNPtest.out
#$ -e SNPtest.err
#$ -cwd -V

snptest -data ${SGE_TASK_ID}_Clop_BB_NoACEI.gen *.sample -o ${SGE_TASK_ID}_Clop_BB_NoACEI.gen.Status.out
and the array job could be submitted using:
$ qsub -t 1-22 myscript.sh
Here, the results from different chromosomes would be written to uniquely named output files which are numbered in accordance with the input data files.
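More generally, SGE_TASK_ID can be used to pick out a numbered input file for each task. The input_N.txt/output_N.txt naming scheme in this sketch is just an assumption for illustration:

#!/bin/bash

#$ -cwd -V

# Each task selects its own numbered input file and writes a matching output file
INPUT=input_${SGE_TASK_ID}.txt
OUTPUT=output_${SGE_TASK_ID}.txt

/path/to/the/folder/executable "$INPUT" > "$OUTPUT"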

Array jobs are displayed slightly differently in the output of qstat:

$ qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID 
-----------------------------------------------------------------------------------------------------------------
  19559 0.55500 myscript.s ian          r     02/06/2015 10:42:31 multiway.q@comp08.liv.ac.uk        1 1
  19559 0.55500 myscript.s ian          r     02/06/2015 10:42:31 multiway.q@comp04.liv.ac.uk        1 2
  19559 0.55500 myscript.s ian          r     02/06/2015 10:42:31 serial.q@comp01.liv.ac.uk          1 3
  19559 0.55500 myscript.s ian          r     02/06/2015 10:42:31 serial.q@comp01.liv.ac.uk          1 4
  19559 0.55500 myscript.s ian          r     02/06/2015 10:42:31 serial.q@comp01.liv.ac.uk          1 5
  19559 0.55500 myscript.s ian          r     02/06/2015 10:42:31 serial.q@comp01.liv.ac.uk          1 6
  19559 0.55500 myscript.s ian          r     02/06/2015 10:42:31 multiway.q@comp07.liv.ac.uk        1 7
  19559 0.55500 myscript.s ian          r     02/06/2015 10:42:31 multiway.q@comp07.liv.ac.uk        1 8
  19559 0.55500 myscript.s ian          r     02/06/2015 10:42:31 multiway.q@comp07.liv.ac.uk        1 9
  19559 0.55500 myscript.s ian          r     02/06/2015 10:42:31 multiway.q@comp07.liv.ac.uk        1 10
  19559 0.55500 myscript.s ian          r     02/06/2015 10:42:31 multiway.q@comp07.liv.ac.uk        1 11
  19559 0.55500 myscript.s ian          r     02/06/2015 10:42:31 multiway.q@comp07.liv.ac.uk        1 12
  19559 0.55500 myscript.s ian          r     02/06/2015 10:42:31 multiway.q@comp06.liv.ac.uk        1 13
  19559 0.55500 myscript.s ian          t     02/06/2015 10:42:31 multiway.q@comp06.liv.ac.uk        1 14
  19559 0.55500 myscript.s ian          t     02/06/2015 10:42:31 multiway.q@comp06.liv.ac.uk        1 15
  19559 0.55500 myscript.s ian          t     02/06/2015 10:42:31 multiway.q@comp06.liv.ac.uk        1 16
  19559 0.55500 myscript.s ian          t     02/06/2015 10:42:31 multiway.q@comp06.liv.ac.uk        1 17
  19559 0.55500 myscript.s ian          t     02/06/2015 10:42:31 multiway.q@comp06.liv.ac.uk        1 18
  19559 0.55500 myscript.s ian          t     02/06/2015 10:42:31 multiway.q@comp06.liv.ac.uk        1 19
  19559 0.55500 myscript.s ian          t     02/06/2015 10:42:31 multiway.q@comp06.liv.ac.uk        1 20
  19559 0.55500 myscript.s ian          r     02/06/2015 10:42:31 multiway.q@comp05.liv.ac.uk        1 21
  19559 0.55500 myscript.s ian          r     02/06/2015 10:42:31 multiway.q@comp05.liv.ac.uk        1 22

Each running task of an array job is listed in the qstat output in the same way as a single serial job, with the addition of the task ID in the last (ja-task-ID) column of the output. Note that the job-ID of all tasks is the same. All the tasks of an array job, whether running or queued, can be deleted with a single qdel command using the job-ID of the array job e.g.
$ qdel 19559 

Individual tasks of an array job can be deleted with qdel's -t argument. For example:

$ qdel 19559 -t 3-5
would delete the tasks analysing chromosomes 3 to 5 inclusive in the previous example.

Running jobs on a specific node


Before leaving the topic of job submission, one final feature is worth noting. What happens if you want a job to run on a specific node, for example one that provides a large amount of memory or disk space? The answer is to use a hard resource request on the command line. For example:
$ qsub -hard -l h=comp00 myscript.sh
Here the job would only be allowed to run on the node labelled comp00. To use a different node, just change the hostname as necessary.
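As with the other options seen so far, the request can also be embedded in the job file:
#$ -hard -l h=comp00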

Interactive jobs


Although SGE is mostly used as a way of running programs which do not interact with the user (batch mode), it is possible to make use of compute nodes in an interactive manner. The following command will achieve this:

$ qrsh
This will log the user into a free compute node just as if the user had logged in with ssh. If none are available at the present time, qrsh will wait until one is. Note that users should NOT use ssh to log in to compute nodes to run programs as other users may be running programs on that node at the same time. Even if a node appears to be free, there is nothing to stop another user running a program on that node later (e.g. via qrsh or qsub) and the resulting conflict will be to the detriment of both users. If all users stick to using qsub and qrsh, conflicts should not occur and the environment will be more harmonious for all. Once you have finished using a node, please type exit to quit the qrsh session. This will free the node for other users.
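qrsh accepts many of the same options as qsub, so an interactive session occupying several slots can be requested in a similar way (assuming the shmem parallel environment on this system permits interactive use):
$ qrsh -pe shmem 4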