[GE users] preemption & scheduling

reuti reuti at staff.uni-marburg.de
Thu Feb 26 22:05:03 GMT 2009


Hi,

On 26.02.2009 at 21:51, yarmond wrote:

> I have been trying to set up preemption using Grid Engine 6.2, and  
> have encountered a few problems. I am mostly following the  
> directions given on this page:
> http://wikis.sun.com/display/BluePrints/Scheduler+Policies+for+Job+Prioritization+in+the+N1+Grid+Engine+6+System
> under the third scenario. I have two queues set up, long.q and  
> normal.q, where jobs in normal.q are allowed to preempt those in  
> long.q. I have different problems with jobs, depending on whether  
> they are using a PE or not.
>
> For jobs running in a PE set up for SMP jobs (using allocation rule  
> $pe_slots), the problem I am having is that jobs submitted to  
> normal.q are sometimes sent to a node already running a job in  
> long.q, even when there are idle nodes available. I have tried  
> changing the scheduler configuration to use "-slots" as the  
> load_formula, but that seems to have no effect at all. On what  
> basis is the scheduler choosing which node to allocate, and how can  
> I affect that decision?
>
> For jobs not running in a PE, setting load_formula to "-slots" or  
> "slots" behaves predictably, but neither gives me the behavior that  
> I actually want. If I have one node with 8 processors running an
> 8-processor job in long.q, and two idle nodes, I would prefer that
> additional jobs in normal.q fill one of the idle nodes, then
> the other. Using "-slots" causes jobs to alternate between the idle  
> nodes, and using "slots" causes the preemptible job to be suspended  
> while leaving the other two nodes idle.
>
> Is there any way to configure the scheduler to prefer to fill nodes  
> (to leave room for SMP jobs) but not leave idle nodes while still  
> allowing preemption? I can provide additional details about my  
> configuration if needed. Any help in resolving these scheduling  
> woes would be greatly appreciated.

This problem is often solved as follows: fill the cluster from one
side with serial jobs and from the other side with parallel ones.

You can define two queues (one with "qtype NONE", for parallel jobs
only) and number the hosts via seq_no in opposite order in the two
queues. Serial jobs will then use machine 1 first, then machine 2,
while parallel jobs will use machine n first, then machine (n-1).
For this to work, the scheduler must be set to
"queue_sort_method seqno".
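
As a rough illustration of the opposite numbering, here is a small
Python sketch that prints per-host seq_no values for the two queues.
The host names (node1..node3) and the queue names serial.q and
parallel.q are made up for the example; the output follows the usual
"default,[host=value]" form of queue attributes, but check it against
your own queue configuration before pasting it into qconf.

```python
def seq_no_line(hosts, reverse=False):
    """Build a seq_no value with one override per host.

    Hosts are numbered 1..n in list order, or n..1 when reverse=True.
    The leading 0 is an assumed default for hosts without an override.
    """
    ordered = list(reversed(hosts)) if reverse else list(hosts)
    overrides = ",".join(f"[{h}={i}]" for i, h in enumerate(ordered, start=1))
    return f"seq_no 0,{overrides}"


# Hypothetical host names, for illustration only.
hosts = ["node1", "node2", "node3"]

# Serial queue fills from node1 upwards, parallel queue from node3 downwards.
serial = seq_no_line(hosts)
parallel = seq_no_line(hosts, reverse=True)

print("serial.q:  ", serial)
print("parallel.q:", parallel)
```

With "queue_sort_method seqno" the scheduler tries the queue
instances with the lowest seq_no first, so the two queues meet in the
middle of the cluster instead of competing for the same hosts.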

Unfortunately it's broken in 6.2u1 but fixed for 6.2u2 already:

http://gridengine.sunsource.net/issues/show_bug.cgi?id=2864

-- Reuti
