[GE users] Scheduler Configuration

reuti reuti at staff.uni-marburg.de
Tue Dec 23 09:36:32 GMT 2008


On 23.12.2008 at 09:44, Robert Healey wrote:

> Thanks for the quick reply.
>
> reuti wrote:
>> Hiho,
>>
>> On 23.12.2008 at 07:10, Robert Healey wrote:
>>
>>> Greetings.
>>>
> <snip my original>
>>
>> when you attach a PE to two queues, you might get a mix of machines
>> anyway. I would suggest using only one queue for all parallel jobs
>> and using hostgroups to bind two sets of PEs to different parts of
>> the cluster:
>>
>> pe_list NONE,[@xeons=smp1 mpich1 openmpi1],[@opterons=smp2 mpich2 openmpi2]
>>
>> and submit with:
>>
>> $ qsub -pe "smp*" 4 ...
>>
>> and the like. Once a PE is selected for a job, only that PE will be
>> used.
>
> Ah. At the moment we do "qsub -q xeon -pe openmpi 8", which keeps the
> cross-platform issues from occurring.
>
>>
> <snip my original>
>>
>> This is the intended behavior. It just allocates slots where they are
>> available.
>
> I see the job splitting even when I have fully idle nodes. I'd prefer
> it to double up only when jobs require partial nodes, to try to
> maximize the number of busy cores.
>
>>
> <snip my original>
>>
>> To prevent this (as you have more than one queue per node), you will
>> need to define the maximum number of slots in each exechost's
>> configuration, i.e. run `qconf -me <nodename>` and set "complex_values
>> slots=8" or 4, depending on the number of cores actually available.
>
> Here is one of my hosts that's currently running eight jobs from the
> serial queue (seq no 45) and the parallel queue (seq no 50). The
> parallel job started more than 12 hours before the serial jobs did.
>
> hostname              compute-8-15.local
> load_scaling          NONE
> complex_values        slots=8,brand=intel
> user_lists            NONE
> xuser_lists           NONE
> projects              NONE
> xprojects             NONE
> usage_scaling         NONE
> report_variables      NONE

Is "qhost -F" showing negative values for the slots entry?
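
To check many hosts at once, here is a minimal Python sketch that scans
the text printed by "qhost -F slots" and reports every host whose slots
value is negative. A negative value would suggest that more slots are in
use on that host than the host-level limit allows. The function name,
the sample text and its numbers are hypothetical, and the indented
"hc:slots=..." line layout is an assumption about the output format, so
compare it with what your installation actually prints.

```python
import re

# Matches a resource entry such as "hc:slots=-4.000000".
# The exact layout of "qhost -F" output is assumed here, not guaranteed.
SLOTS_RE = re.compile(r"\bslots=(-?\d+(?:\.\d+)?)")


def hosts_with_negative_slots(qhost_output):
    """Return {hostname: slots} for every host whose slots value is below zero."""
    negative = {}
    host = None
    for line in qhost_output.splitlines():
        if not line.strip() or line.startswith(("HOSTNAME", "-")):
            continue  # skip blank lines, the header and the separator row
        if not line[0].isspace():
            host = line.split()[0]  # an unindented line starts a new host entry
            continue
        match = SLOTS_RE.search(line)
        if match and host is not None and float(match.group(1)) < 0:
            negative[host] = float(match.group(1))
    return negative


# Made-up sample for illustration only; not taken from a real cluster.
SAMPLE = """\
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
compute-8-15            lx26-amd64      8  8.01   15.7G    2.1G    2.0G     0.0
    hc:slots=-4.000000
compute-8-16            lx26-amd64      8  4.00   15.7G    1.0G    2.0G     0.0
    hc:slots=4.000000
"""

print(hosts_with_negative_slots(SAMPLE))
```

In practice the text would come from running "qhost -F slots" on a
submit or admin host and passing its standard output to the function.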

-- Reuti


>
>>
>>
> <snip my original>
>
>> In addition, you could supply sequence numbers in opposite directions
>> for the serial and parallel queues, so that the serial jobs fill the
>> cluster from one side and the parallel ones from the other side.
>>
>> seq_no 0,[node001=1],[node002=2],...
>>
>> seq_no 0,[node500=1],[node499=2],...
>>
>> -- Reuti
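
Typing out the two seq_no lists quoted above by hand gets tedious on a
larger cluster. A small Python sketch along these lines could generate
both lines. The host names node001 to node500 follow the quoted
example, and the helper name seq_no_line is hypothetical.

```python
def seq_no_line(hosts, default=0):
    """Build a seq_no line that numbers the given hosts 1, 2, 3, ... in order."""
    if not hosts:
        return f"seq_no {default}"
    overrides = ",".join(f"[{host}={i}]" for i, host in enumerate(hosts, start=1))
    return f"seq_no {default},{overrides}"


# Assumed naming scheme, taken from the example: node001 ... node500.
hosts = [f"node{n:03d}" for n in range(1, 501)]

serial_line = seq_no_line(hosts)                     # fills from node001 upward
parallel_line = seq_no_line(list(reversed(hosts)))   # fills from node500 downward

print(serial_line[:40])
print(parallel_line[:40])
```

Each generated line would go into the matching queue's configuration.
The numbering only influences placement when the scheduler sorts queue
instances by sequence number (queue_sort_method seqno in the scheduler
configuration).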
>>
>>
> <snip my original>
>
> -- 
> Bob Healey
> Systems Administrator
> Physics Department, RPI
> healer at rpi.edu
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=94016
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=94023
