[GE users] preemption & scheduling

reuti reuti at staff.uni-marburg.de
Fri Feb 27 00:16:48 GMT 2009

On 27.02.2009 at 00:56, yarmond wrote:

>> This problem is often solved by filling the cluster from one side with
>> serial jobs and from the other side with parallel ones:
>> You can define two queues (one with "qtype NONE", for parallel jobs
>> only) and set the seq_no of the two queues in opposite order. I.e.
>> serial jobs will first use machine 1, then 2, while parallel ones will
>> first use machine n, then (n-1). The scheduler must be set to
>> "queue_sort_method seqno" for this.
>> Unfortunately it's broken in 6.2u1, but already fixed for 6.2u2:
>> http://gridengine.sunsource.net/issues/show_bug.cgi?id=2864
>> -- Reuti
> I can see that this method at least gives a clear ordering for the
> assignment of nodes to parallel jobs, but I don't see how it will
> improve the preemption behavior.
> For example: say the cluster is first filled with jobs in the
> long+parallel queue. Some of those jobs finish, but not the job running
> on the host with the first seq_no of the normal+parallel queue.
> Wouldn't a job submitted to the normal+parallel queue be dispatched to
> this node, forcing the job running there to be preempted despite the
> existence of free nodes?

Aha, so you face two conflicting goals at once:

- keep nodes free for PE SMP jobs (serial jobs from one side,
parallel jobs from the other side)

but OTOH:

- use free nodes for normal jobs to avoid suspending long jobs
(normal jobs from one side, long jobs from the other side)


Maybe you need a pool of 4 hostgroups, ordered differently by seqno
for each job type:

normal-serial 1-2-3-4
long-serial 2-1-3-4
normal-parallel 4-3-1-2
long-parallel 3-4-1-2
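
To see what these orderings do, here is a small Python sketch (a toy model, not SGE code). It assumes each row of the table lists the hostgroups in the order that job type fills them, that every hostgroup has the same number of slots, and that every job takes one slot. The names SEQ_ORDER, pick_pool and dispatch are made up for the illustration, and suspension itself is not modelled:

```python
# Fill order per job type, copied from the table above.
# Reading each row as "hostgroups in the order they are tried" is an
# assumption; the hostgroup numbers 1..4 are just labels.
SEQ_ORDER = {
    "normal-serial":   [1, 2, 3, 4],
    "long-serial":     [2, 1, 3, 4],
    "normal-parallel": [4, 3, 1, 2],
    "long-parallel":   [3, 4, 1, 2],
}


def pick_pool(job_type, free_slots):
    """Return the first hostgroup in the job type's order with a free slot."""
    for pool in SEQ_ORDER[job_type]:
        if free_slots.get(pool, 0) > 0:
            return pool
    return None  # nothing free: the job would have to wait


def dispatch(jobs, slots_per_pool=2):
    """Place jobs one by one; every job takes one slot (a simplification)."""
    free = {pool: slots_per_pool for pool in (1, 2, 3, 4)}
    placement = []
    for job in jobs:
        pool = pick_pool(job, free)
        if pool is not None:
            free[pool] -= 1
        placement.append((job, pool))
    return placement


if __name__ == "__main__":
    jobs = ["long-parallel", "long-parallel", "normal-parallel", "normal-serial"]
    for job, pool in dispatch(jobs):
        print(f"{job:16s} -> hostgroup {pool}")
```

In this toy model the long parallel jobs start in hostgroup 3 while a normal parallel job starts in hostgroup 4, so the two only meet once one of them has filled its first hostgroup.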

-- Reuti

