[GE users] preemption & scheduling
reuti at staff.uni-marburg.de
Thu Feb 26 22:05:03 GMT 2009
On 26.02.2009 at 21:51, yarmond wrote:
> I have been trying to set up preemption using Grid Engine 6.2, and
> have encountered a few problems. I am mostly following the
> directions given on this page:
> under the third scenario. I have two queues set up, long.q and
> normal.q, where jobs in normal.q are allowed to preempt those in
> long.q. I have different problems with jobs, depending on whether
> they are using a PE or not.
> For jobs running in a PE set up for SMP jobs (using allocation rule
> $pe_slots), the problem I am having is that jobs submitted to
> normal.q are sometimes sent to a node already running a job in
> long.q, even when there are idle nodes available. I have tried
> changing the scheduler configuration to use "-slots" as the
> load_formula, but that seems to have no effect at all. On what
> basis is the scheduler choosing which node to allocate, and how can
> I affect that decision?
> For jobs not running in a PE, setting load_formula to "-slots" or
> "slots" behaves predictably, but neither gives me the behavior that
> I actually want. If I have one node with 8 processors running an
> 8-processor job in long.q, and two idle nodes, I would prefer
> additional jobs in normal.q would fill one of the idle nodes, then
> the other. Using "-slots" causes jobs to alternate between the idle
> nodes, and using "slots" causes the preemptible job to be suspended
> while leaving the other two nodes idle.
> Is there any way to configure the scheduler to prefer to fill nodes
> (to leave room for SMP jobs) but not leave idle nodes while still
> allowing preemption? I can provide additional details about my
> configuration if needed. Any help in resolving these scheduling
> woes would be greatly appreciated.
This problem is often solved by the following idea: fill the cluster
from one side with serial jobs and from the other side with parallel
ones. You can define two queues (one with "qtype none" for parallel
jobs only) and set the seq_no for the two queues in opposite order,
i.e. serial jobs will first use machine 1, then 2, while parallel ones
will first use machine n, and then (n-1) (the scheduler must be set to
Unfortunately it's broken in 6.2u1 but fixed for 6.2u2 already:
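
As a rough sketch of the seq_no idea described above, the relevant
settings might look like the fragment below. The queue names
(serial.q, parallel.q) and host names (node01 to node03) are
hypothetical and only for illustration. The queue_sort_method line
assumes the scheduler is meant to sort queue instances by sequence
number; adjust it if your setup differs.

```
# Queue for serial jobs (hypothetical name), edit with: qconf -mq serial.q
# Lower seq_no is tried first, so serial jobs fill node01, then node02, ...
seq_no                0,[node01=1],[node02=2],[node03=3]

# Queue for parallel jobs only (hypothetical name), edit with: qconf -mq parallel.q
# qtype NONE means the queue accepts only jobs requesting a PE from its pe_list.
qtype                 NONE
seq_no                0,[node01=3],[node02=2],[node03=1]

# Scheduler configuration, edit with: qconf -msconf
# Assumption: sort queue instances by seq_no instead of by load.
queue_sort_method     seqno
```

With the order reversed between the two queues, serial jobs pack
onto the low-numbered hosts while parallel jobs start from the
high-numbered ones, which leaves whole nodes free in the middle
for as long as possible.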