[GE users] Scheduler Configuration

Robert Healey healer at rpi.edu
Tue Dec 23 08:44:16 GMT 2008

Thanks for the quick reply.

reuti wrote:
> Hiho,
> Am 23.12.2008 um 07:10 schrieb Robert Healey:
>> Greetings.
<snip my original>
> when you attach a PE to two queues, you might get a mix of machines  
> anyway. I would suggest to use only one queue for all parallel jobs  
> and use hostgroups to bind 2 PEs to different parts of the cluster:  
> pe_list NONE,[@xeons=smp1 mpich1 openmpi1],[@opterons=smp2 mpich2  
> openmpi2]
> and submit with:
> $ qsub -pe "smp*" 4 ...
> and the like. Once a PE is elected for a job, only this one will be used.

Ah. At the moment we do "qsub -q xeon -pe openmpi 8", which keeps the 
cross-platform issues from occurring.
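For anyone following along, Reuti's single-queue setup might look roughly 
like this (a sketch; the hostgroup names @xeons/@opterons and the PE names 
come from his example and are assumptions about the actual cluster):

```shell
# Assuming hostgroups @xeons and @opterons already exist (qconf -ahgrp)
# and the PEs smp1/smp2, mpich1/mpich2, openmpi1/openmpi2 are defined,
# bind the PEs to hostgroups inside ONE parallel queue's configuration
# (qconf -mq parallel.q):
#
#   pe_list NONE,[@xeons=smp1 mpich1 openmpi1],[@opterons=smp2 mpich2 openmpi2]

# Then submit with a wildcard PE request; once the scheduler elects one
# PE, the job is confined to that PE's hostgroup, so it never mixes
# Xeon and Opteron nodes:
qsub -pe "openmpi*" 8 myjob.sh
```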

<snip my original>
> This is the intended behavior. It just allocates slots where they are  
> available.

I see jobs being split across nodes even when I have fully idle nodes. I'd 
prefer it to double up only when jobs would require partial nodes, to 
maximize the number of busy cores.
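One way to get fill-up behavior (a sketch; exact semantics vary with the 
Grid Engine version, so treat this as a starting point rather than a 
definitive recipe):

```shell
# By default the scheduler sorts queue instances by load, which tends to
# spread jobs across the least-loaded (often idle) hosts.  Sorting by
# sequence number instead makes jobs fill hosts in seq_no order.
# Edit the scheduler configuration (opens $EDITOR):
qconf -msconf
# and set:
#   queue_sort_method    seqno

# For parallel jobs, the PE's allocation_rule also matters:
# $fill_up packs a job onto as few hosts as possible, whereas
# $round_robin deliberately spreads it.  Inspect/edit with:
qconf -sp openmpi
qconf -mp openmpi
#   allocation_rule    $fill_up
```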

<snip my original>
> To prevent this (as you have more than one queue per node), you will  
> need to define the maximum number of slots in each exechost's  
> configuration, i.e. `qconf -me <nodename>` and set "complex_values  
> slots=8" or 4, depending on the cores actually available.

Here's one of my hosts that's currently running eight jobs from the serial 
queue (seq_no 45) and the parallel queue (seq_no 50).  The parallel job 
started more than 12 hours before the serial jobs did.

hostname              compute-8-15.local
load_scaling          NONE
complex_values        slots=8,brand=intel
user_lists            NONE
xuser_lists           NONE
projects              NONE
xprojects             NONE
usage_scaling         NONE
report_variables      NONE
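For reference, that listing can be reproduced and edited with qconf (the 
hostname is taken from the output above):

```shell
# Show the exec host configuration non-interactively:
qconf -se compute-8-15.local

# Edit it (opens $EDITOR) to set the per-host slot cap, as Reuti suggests:
qconf -me compute-8-15.local
#   complex_values    slots=8,brand=intel
```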

<snip my original>

> You could in addition supply sequence numbers in opposite direction  
> for the serial and parallel queue, so that the serial jobs will fill  
> the cluster from the one side, and parallel ones from the other side.
> seq_no 0,[node001=1],[node002=2],...
> seq_no 0,[node500=1],[node499=2],...
> -- Reuti
<snip my original>
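A concrete version of the opposite-direction seq_no idea (a sketch; the 
queue names serial.q/parallel.q and node names are assumptions based on 
Reuti's example):

```shell
# Serial queue fills the cluster from the low end
# (qconf -mq opens $EDITOR on the queue configuration):
qconf -mq serial.q
#   seq_no 0,[node001=1],[node002=2],...

# Parallel queue fills from the high end:
qconf -mq parallel.q
#   seq_no 0,[node500=1],[node499=2],...

# Combined with queue_sort_method seqno in the scheduler configuration
# (qconf -msconf), serial and parallel jobs approach each other from
# opposite ends of the cluster instead of interleaving on the same hosts.
```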

Bob Healey
Systems Administrator
Physics Department, RPI
healer at rpi.edu

