[GE users] Allocation rule behavior

jcd jcducom at gmail.com
Thu Aug 20 21:26:34 BST 2009


Reuti and Dan-
Thanks for the feedback. Indeed, the goal is to create ompi-2way, 
ompi-4way, etc., but I wanted to keep the number of PEs minimal, i.e. 
less confusing for users.
Thanks again
JC
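
A minimal sketch of one such second PE, following Reuti's suggestion in the quoted message below: it clones the ompitest definition quoted further down and changes only the name and the allocation rule (the file name is illustrative):

```
# qconf -sp ompitest > ompi-2way.pe   (edit, then: qconf -Ap ompi-2way.pe)
pe_name            ompi-2way
slots              302
user_lists         crc
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    2
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE
```

With this PE, a 2- or 6-slot job becomes schedulable, since NSLOTS is then a multiple of the allocation rule.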

reuti wrote:
> On 20.08.2009 at 21:39, jcd wrote:
> 
>> All-
>> To reduce job 'segmentation' over a large number of nodes with free
>> slots, I set up the allocation rule of the PE ompi to match the
>> number of slots available on each node. (For a parallel job using
>> 4 cores, the fill_up or round_robin method will put processes on any
>> available slots, for instance 1 process on host1, 1 process on host2,
>> and 2 processes on host3.)
>>
>> The configuration of a PE:
>> # qconf -sp ompitest
>> pe_name            ompitest
>> slots              302
>> user_lists         crc
>> xuser_lists        NONE
>> start_proc_args    /bin/true
>> stop_proc_args     /bin/true
>> allocation_rule    4
>> control_slaves     TRUE
>> job_is_first_task  FALSE
>> urgency_slots      min
>> accounting_summary FALSE
>>
>>
>> The job submission script is:
>> #!/bin/csh
>> #$ -pe ompitest 2
>> #$ -q *@@nehalem
>> module load ompi/1.3.2-intel
>> mpirun  -np $NSLOTS hostname
>>
>>
>> The previous job waits forever in the queue:
>> $ qstat -u jducom
>> job-ID  prior   name       user         state submit/start at     queue                     slots ja-task-ID
>> ------------------------------------------------------------------------------------------------------------
>>   23326 0.51167 openmpi.sh jducom       qw    08/20/2009 15:12:57                               2
>>
>> The reason given by the scheduler is the following:
>> $ qstat -j 23327
>> cannot run in PE "ompitest" because it only offers 0 slots
>>
>> which is quite surprising, as the PE has plenty of slots available.
>>
>> As soon as the allocation rule is changed to $fill_up, the job is
>> scheduled immediately as expected:
>> $ qstat -u jducom
>> job-ID  prior   name       user         state submit/start at     queue                     slots ja-task-ID
>> ------------------------------------------------------------------------------------------------------------
>>   23326 0.51167 openmpi.sh jducom       t     08/20/2009 15:14:39 long@dqcneh012.crc.nd.edu     2
>>
>>
>> Bottom line: as long as the job requests a number of NSLOTS which is
>> a MULTIPLE of the number of slots specified in the allocation rule,
>> the job goes through. If it is not a multiple, it will wait in the
>> queue.
>>
>> I was wondering if it is the expected behavior.
> 
> Yes. Maybe you need a second PE with allocation rule 2.
> 
> -- Reuti
> 
>> Thank you
>>
>> JC
>>
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=213340
>>
>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
> 
>

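The bottom line in the quoted message (a job is only schedulable when NSLOTS is a multiple of the fixed allocation rule) can be sketched as a plain divisibility check; this models the observed scheduling behavior only, not the actual scheduler code:

```shell
#!/bin/sh
# Model of the observed behavior: with a fixed allocation_rule N,
# a job requesting NSLOTS slots only runs when N divides NSLOTS.
rule=4
for nslots in 2 4 6 8; do
  if [ $((nslots % rule)) -eq 0 ]; then
    echo "$nslots slots: scheduled"
  else
    echo "$nslots slots: waits in queue (qw)"
  fi
done
```

With rule=4, only the 4- and 8-slot requests are reported as scheduled, matching the qw state seen above for the 2-slot job.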