[GE users] allocation rule

templedf dan.templeton at sun.com
Thu May 14 14:13:58 BST 2009


The higher a queue's sequence number, the later it is considered for 
scheduling.  Setting the 1cpu.q to seq_no=1 will make it the first choice 
for jobs.  That solution doesn't, however, prevent the jobs from being 
spread out evenly within each host group.  Here are two other options:

o Reverse the load_formula so that it schedules all jobs to the most 
loaded host, rather than the least.
o Set the job_load_adjustments to NONE, which will effectively cause the 
scheduler to use a fill-up approach.

The problem with both these options is that they affect the entire 
cluster, so all jobs in the cluster will be scheduled that way.
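
To make the difference between the two orderings concrete, here is a rough 
Python sketch. It is not Grid Engine code: the place_jobs function is 
hypothetical, "load" is simplified to the number of used slots on a host, 
and ties are broken by host index. It only illustrates why picking the 
most loaded host packs jobs together, while picking the least loaded host 
spreads them out.

```python
def place_jobs(slots_per_host, n_jobs, fill_up):
    """Place n_jobs single-slot jobs and return the used slots per host.

    Assumption: a host's "load" is just its count of used slots, which is
    a stand-in for whatever the real load formula would report.
    """
    used = [0] * len(slots_per_host)
    for _ in range(n_jobs):
        # Only hosts that still have a free slot are candidates.
        candidates = [h for h in range(len(used)) if used[h] < slots_per_host[h]]
        if not candidates:
            break  # cluster is full; the remaining jobs would stay pending
        if fill_up:
            # Reversed ordering: most loaded host first, lowest index on ties.
            host = min(candidates, key=lambda h: (-used[h], h))
        else:
            # Default-style ordering: least loaded host first.
            host = min(candidates, key=lambda h: (used[h], h))
        used[host] += 1
    return used


spread = place_jobs([8, 8, 8, 8], 8, fill_up=False)
packed = place_jobs([8, 8, 8, 8], 8, fill_up=True)
print(spread)  # [2, 2, 2, 2]
print(packed)  # [8, 0, 0, 0]
```

With the reversed ordering, three of the four 8-slot hosts stay empty and 
remain available for 4 and 8 processor jobs.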

A pseudo-solution is to split the cluster into two queues, one for small 
jobs and one for large, and then use soft resource requests to direct 
the jobs to queues.  Because the resource requests are soft, if a queue 
is full, then all jobs will just go to the other queue.  While the 
cluster is not overloaded, though, it will help keep things separated.
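
The fallback behaviour of a soft request can be sketched in Python as 
well. Everything here is an assumption for illustration: the queue names 
small.q and large.q, the choose_queue function, and the simplification 
that a queue is a single pool of free slots with no host boundaries. A 
job takes its preferred queue when there is room and otherwise spills 
into the other one.

```python
def choose_queue(job_slots, free_slots, small_job_max=1):
    """Pick a queue for a job, treating the preference as soft.

    free_slots maps the hypothetical queue names to their free slot
    counts and is updated in place. Returns the chosen queue name, or
    None if neither queue has room (the job would stay pending).
    """
    if job_slots <= small_job_max:
        preferred, fallback = "small.q", "large.q"
    else:
        preferred, fallback = "large.q", "small.q"
    # Try the preferred queue first, then fall back to the other one.
    for queue in (preferred, fallback):
        if free_slots[queue] >= job_slots:
            free_slots[queue] -= job_slots
            return queue
    return None


free = {"small.q": 2, "large.q": 8}
placements = [choose_queue(n, free) for n in (1, 1, 1, 4)]
print(placements)  # ['small.q', 'small.q', 'large.q', 'large.q']
print(free)        # {'small.q': 0, 'large.q': 3}
```

The third 1-slot job lands in large.q because small.q is already full, 
which is the spill-over described above.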

Of course, as Chris said, resource reservation might just be your answer.

Daniel

craffi wrote:
> Couple of different options. Sounds like you are experiencing parallel  
> job starvation. In modern versions of SGE this is usually handled by  
> reservations (submitting your parallel jobs with the "-R y" option  
> set).
>
> I'd try the reservation methods first.
>
> If those don't work there are a few other methods that come to mind.  
> One of them involves changing your queue sort method from load-based  
> to "seqno". The queue sorting method kicks in whenever there is more  
> than one queue instance available that satisfies your pending job's needs  
> in a given scheduling interval. By default these available queue  
> instances are sorted by their reported load so that SGE sends your job  
> to "... the least busy machine capable of satisfying your job  
> requirements".
>
> Changing to seqno sorting allows you to sort your queues differently  
> when more than one is free and available. In this case I'd recommend  
> making host groups of your 1,2,4 and 8 processor machines and using  
> seqno sorting to fill up the machines from smallest CPU count to  
> largest.
>
> Basically give all your 1-CPU nodes a seqno of "1", your 2-CPU a seqno  
> of "10", 4-CPU gets "100" and 8-CPU gets "1000". Then switch your  
> queue sort method to seqno.  The actual integers you choose for the seqno  
> values mean nothing, all that matters is who wins the sort.
>
> Been ages since I've done this for real so I may have reversed the  
> sort requirements - maybe your 1-CPU nodes need the higher seqno, not  
> sure. Just test and see.
>
> -Chris
>
>
>
> On May 13, 2009, at 5:28 PM, davemeni wrote:
>
>   
>> Hello all,
>>
>> We have a large cluster with 8 processor nodes on it, running SGE  
>> 6.0.  We run 1, 2, 4, and 8 processor jobs on the cluster.  We have  
>> been trying to find a way to tell the queue to submit 1 processor  
>> jobs to nodes with the fewest free processors but have come up empty  
>> handed.  The problem now is that we have a ton of 1 processor jobs  
>> spread across a bunch of nodes.  What we need is all of the 1  
>> processor jobs on the same few nodes so that there can be empty  
>> nodes available for 4 and 8 processor jobs.  Is there a PE  
>> environment option for filling up the most used first instead of the  
>> least used?
>>
>> I know that some would just suggest to create separate clusters for  
>> serial and parallel jobs, however the number of serial and parallel  
>> jobs running at any given instant can vary greatly, so designating  
>> nodes for only parallel or only serial will create a lot of unused  
>> resources.
>>
>> Ideally we would like the queue to submit a job to the highest  
>> occupied node that has enough free processors to not split the job  
>> across nodes.  However, I know that this is probably not an option.   
>> If it is then by all means please share how.  If not, please let me  
>> know if it is possible to get 1 processor jobs to fill the highest  
>> occupied nodes.
>>
>> Thanks,
>> Dave Meninger
>> University of Delaware
>>
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=195189
>>
>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>>     
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=195468
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=195510
