[GE users] Scheduler Configuration
reuti at staff.uni-marburg.de
Tue Dec 23 08:32:12 GMT 2008
On 23.12.2008 at 07:10, Robert Healey wrote:
> I've been running 6.2 for the past 3 months and am running up
> against a
> wall trying to configure the system for the per node performance that
> pen and paper scheduling gave my users. I currently have two physical
> clusters and 3 queues configured.
> Cluster 1 is 4 racks of dual cpu opterons and Cluster 2 is 4 racks
> of 8
> way xeons. The first queue is configured for multiproc only jobs
> on all
> the opterons, the second for multiproc only jobs on the xeons, and the
> last queue is configured for single proc jobs on rack #4 of the
> opterons and the xeons.
When you attach a PE to two queues, you might get a mix of machines
anyway. I would suggest using only one queue for all parallel jobs
and using hostgroups to bind different PEs to different parts of the cluster:
pe_list NONE,[@xeons=smp1 mpich1 openmpi1],[@opterons=smp2 mpich2
and submit with:
$ qsub -pe "smp*" 4 ...
and the like. Once a PE is selected for a job, only that PE will be used,
so the job stays within the hostgroup the PE is bound to.
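
As a rough sketch of the hostgroup side of this setup, the two groups could look like the following when shown with `qconf -shgrp @xeons` and `qconf -shgrp @opterons`. The host names are made up for illustration, and the lines starting with # are annotations only, not part of the configuration:

```
# hypothetical host names; list your real nodes here
group_name @xeons
hostlist xeon01 xeon02 xeon03

# hypothetical host names again
group_name @opterons
hostlist opt01 opt02 opt03
```

With a wildcard request like -pe "smp*", the scheduler may pick any PE whose name matches, and the hostgroup binding in pe_list then keeps the whole job on one kind of machine.
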
> In my PE configuration, if I set the allocation rule to $fill_up I
> get a
> very inefficient distribution: a job that uses a full node's worth of
> cores gets distributed across at least two nodes. This is less of an
> issue on the opterons than the xeons. It's not an even split
> either, but
> 5/3, 7/1, 6/2, etc. very rarely 4/4.
This is the intended behavior. It just allocates slots where they are
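
For comparison, these are the values that allocation_rule accepts in a PE definition (edited with `qconf -mp <pe_name>`), as far as I understand sge_pe(5). Only one allocation_rule line is valid per PE, and the # lines are annotations only. Whether $pe_slots suits your jobs is an assumption on my side; it only helps for jobs that fit into a single node:

```
# take all free slots on the best host, then continue on the next one
allocation_rule    $fill_up
# hand out one slot per host in turn until the request is satisfied
allocation_rule    $round_robin
# all slots of the job must come from a single host
allocation_rule    $pe_slots
# fixed number of slots per host, e.g. 8 (the request must be a multiple of it)
allocation_rule    8
```
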
> If I set the allocation rule for
> the xeon PE to 8, on xeon rack #4 I end up with the 8 mpi threads
> for a
> PE job on a node and 8 single core jobs also on the same node,
> resulting in very poor performance.
To prevent this (as you have more than one queue per node), you will
need to define the maximum number of slots in each exechost's
configuration, i.e. run `qconf -me <nodename>` and set "complex_values
slots=8" or 4, depending on the real available
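
A minimal sketch of the relevant part of the exechost configuration, assuming a hypothetical 8-core node called xeon01 (other attributes are left out, and the # lines are annotations only):

```
# hypothetical host name
hostname          xeon01
# upper limit of slots over all queues on this host
complex_values    slots=8
```

The same should be possible without the editor via `qconf -aattr exechost complex_values slots=8 xeon01`, which is handy in a loop over all nodes. Because the limit sits on the host, the serial and parallel queue instances on that node can no longer hand out more than 8 slots together.
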
> Maybe I can't mix serial and parallel queues on the same node, but
> if I
> can, if anyone has some pointers on how to straighten this out without
> reverting back to using pen/paper to assign nodes to researchers, it
> would be appreciated.
> Thank you very much.
> qconf -msched:
> algorithm default
> schedule_interval 0:0:15
> maxujobs 0
> queue_sort_method seqno
In addition, you could assign sequence numbers in opposite directions
to the serial and parallel queues, so that serial jobs fill
the cluster from one side and parallel jobs from the other.
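
As a sketch, with hypothetical queue names parallel.q and serial.q and made-up host names, the seq_no lines of the two queues (set with `qconf -mq <queue>`) could look like this. The # lines are annotations only:

```
# in parallel.q (hypothetical name): xeon01 is tried first
seq_no    0,[xeon01=1],[xeon02=2],[xeon03=3]
# in serial.q (hypothetical name): reversed order, xeon03 is tried first
seq_no    0,[xeon01=3],[xeon02=2],[xeon03=1]
```

With queue_sort_method seqno, queue instances with a lower seq_no are considered first. Note that hosts not listed here fall back to the default of 0 and would therefore come before the listed ones.
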
> job_load_adjustments NONE
> load_adjustment_decay_time 0:0:0
> load_formula slots
> schedd_job_info true
> Bob Healey
> Systems Administrator
> Physics Department, RPI
> healer at rpi.edu
> To unsubscribe from this discussion, e-mail: [users-
> unsubscribe at gridengine.sunsource.net].