[GE users] Specify the number of nodes when you submit job array

aali ahmaksod at gmail.com
Tue Oct 19 20:08:51 BST 2010

Imagine my grid has 4 nodes, each with 8 cores, and I would like to submit job arrays so that each one runs on the cores of a single node, where each job array has 10 jobs.
So if nodeX has only 3 free cores and nodeY has only 5 free cores, I want one job array to start 3 jobs on nodeX with the other 7 waiting for that same machine, and another job array to start 5 jobs on nodeY with the other 5 waiting for that same machine.

Basically, I am doing this because my jobs do a lot of I/O and I want them to use the local disk rather than hammer the mounted disk; at the end of each job array, I will update the shared disk.
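
The local-disk pattern described above could look roughly like the following sketch of an array-task body. It assumes that the per-job scratch directory Grid Engine exports as TMPDIR sits on the node's local disk (this depends on the site's queue configuration) and that SGE_TASK_ID is set for array tasks. The function name run_task, the shared directory argument, and the placeholder workload are hypothetical.

```shell
#!/bin/sh
# Sketch: do all I/O on local scratch, write to the shared disk once.
# Assumptions: TMPDIR is node-local scratch; SGE_TASK_ID is the array
# task index. run_task and the workload line are hypothetical.

run_task() {
    shared_dir=$1                     # destination on the mounted disk
    task_id=${SGE_TASK_ID:-1}         # fall back to 1 outside the scheduler
    scratch=$(mktemp -d "${TMPDIR:-/tmp}/task.${task_id}.XXXXXX") || return 1

    # Placeholder for the real I/O-heavy work, done entirely in $scratch.
    echo "result of task ${task_id}" > "${scratch}/out.${task_id}"

    # Single copy to the shared disk once the work is finished.
    mkdir -p "${shared_dir}" &&
        cp "${scratch}/out.${task_id}" "${shared_dir}/"
    status=$?

    rm -rf "${scratch}"               # clean up the local scratch copy
    return "${status}"
}
```

A job script would end with a call such as run_task /path/on/shared/disk, so the mounted disk is touched only by the final copy.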

I tried to use a PE this way; it does something, but not what I am looking for.

qsub -pe threaded 1 -P TEST -p -1 -l node=1 -N job1 -j y -o logFile -cwd -t 1-10:1
qsub -pe threaded 1 -P TEST -p -1 -l node=1 -N job2 -j y -o logFile -cwd -t 21-30:1
qsub -pe threaded 1 -P TEST -p -1 -l node=1 -N job3 -j y -o logFile -cwd -t 31-40:1

What I get is that all the job arrays run in series on a single machine, i.e. job1 goes to nodeX only (which is exactly what I want), but job2 doesn't go to nodeY; it waits until job1 finishes. I also tried changing the slot count after the PE name threaded, but it made no difference.

Here is the configuration of my PE threaded:
qconf -sp threaded
pe_name            threaded
slots              8
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE

What am I missing here, experts?


On 19 October 2010 18:13, reuti <reuti at staff.uni-marburg.de> wrote:

On 19.10.2010 at 15:16, aali wrote:

> Is it possible to specify the number of nodes when you submit a job array?
> To make it simple, I want this job array to run on a single machine, so I am trying to submit these 10 jobs with the following command:
> qsub -P TEST -p -1 -l node=1 -N jobname -j y -o logFile -cwd -t 1-10:1
> But this doesn't work, so is it possible to control the number of nodes when you submit a job array?

Well, if you bind it hard to one node (i.e. "-l h=node004"), then it could be submitted this way. You are then of course limited to this particular node.

There is nothing in SGE which allows you to specify the number of nodes. But with recent versions of SGE you can use "-tc <int>" to limit the number of simultaneously running instances (tasks) of your array job and avoid flooding the complete cluster this way.

-- Reuti
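
One way to read the suggestion above is to submit one array job per host, pinned with "-l h=<host>", and optionally capped with "-tc". The sketch below only prints the qsub command lines so they can be inspected before anything is submitted. The helper name pinned_array_cmd, the script name job.sh, and the host names node001 and node002 are hypothetical, and "-tc" requires a Grid Engine version that supports it.

```shell
#!/bin/sh
# pinned_array_cmd HOST FIRST LAST [MAX_RUNNING]
# Prints (does not run) a qsub command for an array job bound to one host.
# job.sh is a hypothetical job script name.
pinned_array_cmd() {
    host=$1
    first=$2
    last=$3
    max_running=${4:-}                # optional cap on concurrent tasks

    cmd="qsub -cwd -j y -l h=${host} -t ${first}-${last}"
    if [ -n "${max_running}" ]; then
        cmd="${cmd} -tc ${max_running}"
    fi
    echo "${cmd} job.sh"
}

# One array per host; the second one runs at most 5 tasks at a time.
pinned_array_cmd node001 1 10
pinned_array_cmd node002 11 20 5
```

Because each array is bound to a single host, tasks beyond the free slots on that host wait for that host instead of spilling onto other nodes.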

> Cheers,
> Ahmed

