[GE users] Allocation rule behavior

reuti reuti at staff.uni-marburg.de
Thu Aug 20 21:05:01 BST 2009


On 20.08.2009 at 21:39, jcd wrote:

> All-
> To reduce job 'segmentation' over a large number of nodes with free
> slots, I set up the allocation rule of the PE ompi to match the number
> of slots available on each node. (For a parallel job using 4 cores,
> the $fill_up or $round_robin rule will put processes on any available
> slots, for instance 1 process on host1, 1 process on host2, and
> 2 processes on host3.)
>
> The configuration of the PE:
> # qconf -sp ompitest
> pe_name            ompitest
> slots              302
> user_lists         crc
> xuser_lists        NONE
> start_proc_args    /bin/true
> stop_proc_args     /bin/true
> allocation_rule    4
> control_slaves     TRUE
> job_is_first_task  FALSE
> urgency_slots      min
> accounting_summary FALSE
>
>
> The job submission script is:
> #!/bin/csh
> #$ -pe ompitest 2
> #$ -q *@@nehalem
> module load ompi/1.3.2-intel
> mpirun  -np $NSLOTS hostname
>
>
> The previous job waits forever in the queue:
> $ qstat -u jducom
> job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
> -----------------------------------------------------------------------------------------------------------------
>   23326 0.51167 openmpi.sh jducom       qw    08/20/2009 15:12:57                                    2
>
> The reason given by the scheduler is the following:
> $ qstat -j 23327
> cannot run in PE "ompitest" because it only offers 0 slots
>
> which is quite surprising, as the PE has plenty of slots available.
>
> As soon as the allocation rule is changed to $fill_up, the job is
> scheduled immediately as expected:
> $ qstat -u jducom
> job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
> -----------------------------------------------------------------------------------------------------------------
>   23326 0.51167 openmpi.sh jducom       t     08/20/2009 15:14:39 long@dqcneh012.crc.nd.edu          2
>
>
> Bottom line: as long as the job requests a number of slots that is a
> multiple of the number given in the allocation rule, the job is
> scheduled. If it is not a multiple, it waits in the queue.
>
> I was wondering if it is the expected behavior.

Yes. Maybe you need a second PE with allocation rule 2.
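
With an integer allocation_rule, every host used by the job has to
contribute exactly that many slots, so the total requested with -pe
must divide by it without remainder. A minimal sketch of that check in
POSIX sh follows; slots_fit is a hypothetical helper made up for
illustration, not a Grid Engine command, and it only mimics the
divisibility rule (it does not look at free slots or queues):

```shell
#!/bin/sh
# Sketch only: mimics the divisibility rule for an integer allocation_rule.
# slots_fit is a hypothetical helper, not part of Grid Engine.
slots_fit() {
    requested=$1   # slots asked for with "-pe <pe_name> <n>"
    rule=$2        # integer allocation_rule of the PE
    if [ $(( requested % rule )) -eq 0 ]; then
        echo "yes"   # request can be split into whole per-host chunks
    else
        echo "no"    # request leaves a remainder, job stays pending
    fi
}

slots_fit 2 4   # the job from this thread: 2 slots, allocation_rule 4
slots_fit 8 4   # 8 slots, i.e. two hosts with 4 slots each
slots_fit 2 2   # the same job against a PE with allocation_rule 2
```

The first call corresponds to the pending job above; the last one to a
second PE with allocation rule 2.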

-- Reuti

> Thank you
>
> JC
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=213340
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=213344

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


