[GE users] Scheduling Setup

Charu Chaubal Charu.Chaubal at Sun.COM
Fri Jan 13 20:21:51 GMT 2006


Hi Brady,

Brady Catherman wrote On 01/13/06 12:04,:
> Nono! SGE is awesome! It works well for us.. Our problem is that we  
> have a very wide assortment of programs that are running and that  
> leads to user grief. We have some programs that run for months and  
> consume 1 slot and others that run for minutes and consume 100. Our  
> issue right now is that programs want to consume entire nodes but we  
> have just enough long running jobs peppered through the cluster to  
> prevent much of that from happening.
> 

This is a common enough problem, and it seems that physically
partitioning the hosts according to job time limits like you describe
below is probably the most practical solution.  This is how several
real-world sites do it.

> What I was thinking about doing was setting up several queues.
> 1/4 of the cluster can run jobs with no maximum time.
> 1/4 of the cluster can run jobs with a 7 day deadline.
> and 1/2 of the cluster can run jobs that will run at most 24 hours.
> Jobs would also be allowed to run in "longer" queues if they are  
> available.
> 

Rather than having separate queues for this, I would suggest keeping
whatever queues you have today, but using per-hostgroup values for
various queue parameters. For example, your queue configuration would
have a line that looks something like:

h_rt                  INFINITY,[@sevenday=168:0:0],[@oneday=24:0:0]

where you have defined @sevenday and @oneday as hostgroups that contain
hosts allowed to run jobs for seven days and one day, respectively.
(This, of course, assumes you are using 'h_rt' to manage job durations;
if you are using another, custom resource, you would apply this
concept similarly.)
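
To make the override syntax concrete, here is a minimal Python sketch of
how such a line could be read: the first value is the default, and each
bracketed [@group=value] entry replaces it for hosts in that hostgroup.
The host names, the HOSTGROUPS table and both function names are made up
for illustration. This only mimics the lookup; it is not how Grid Engine
itself is implemented, and it handles only the 'INFINITY' and 'h:m:s'
forms used in the line above.

```python
import re

# Hypothetical hostgroup membership; in a real cluster this lives in the
# hostgroup objects, not in a Python dict.
HOSTGROUPS = {
    "@sevenday": {"node05", "node06"},
    "@oneday": {"node07", "node08", "node09", "node10"},
}

H_RT_LINE = "INFINITY,[@sevenday=168:0:0],[@oneday=24:0:0]"


def to_seconds(value):
    """Convert 'INFINITY' or 'h:m:s' into seconds (None means no limit)."""
    if value.upper() == "INFINITY":
        return None
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def effective_h_rt(line, host, hostgroups):
    """Return the h_rt limit, in seconds, that applies to one host."""
    default, *overrides = line.split(",")
    limit = to_seconds(default.strip())
    for entry in overrides:
        match = re.fullmatch(r"\[(@?[\w.-]+)=(.+)\]", entry.strip())
        if match is None:
            raise ValueError(f"unrecognised override: {entry!r}")
        target, value = match.groups()
        # An override applies if it names the host or a group containing it.
        if target == host or host in hostgroups.get(target, set()):
            limit = to_seconds(value)
    return limit


for host in ("node01", "node05", "node07"):
    print(host, effective_h_rt(H_RT_LINE, host, HOSTGROUPS))
```

A host that is in neither hostgroup (node01 here) falls back to the
default, which is the "no maximum time" quarter of the cluster in
Brady's plan.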

In this way, you can keep the logical separation of hosts independent
from the physical separation.  For example, you can move a host from
@sevenday into @oneday without touching the queue config or anything
else.
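
As a rough illustration of that point, the Python sketch below keeps the
per-group limits fixed and changes only the hostgroup membership. All
names here (LIMIT_HOURS, members, eligible_hosts, move_host, the node
names, and the "@nolimit" label standing for hosts in neither hostgroup)
are hypothetical. It also reflects Brady's idea that a shorter job may
run on hosts with a longer limit when they are available.

```python
import math

# Per-group limits in hours, mirroring the h_rt line above. This table
# plays the role of the queue configuration and is never modified below.
LIMIT_HOURS = {"@nolimit": math.inf, "@sevenday": 168, "@oneday": 24}

# Hypothetical membership; "@nolimit" stands for hosts in neither hostgroup.
members = {
    "@nolimit": {"node01"},
    "@sevenday": {"node05", "node06"},
    "@oneday": {"node07"},
}


def eligible_hosts(job_hours, members):
    """Hosts whose limit is at least the runtime the job asks for."""
    return sorted(
        host
        for group, hosts in members.items()
        if job_hours <= LIMIT_HOURS[group]
        for host in hosts
    )


def move_host(host, source, target, members):
    """Reassign a host to another group; the limits table is untouched."""
    members[source].remove(host)
    members[target].add(host)
```

Calling eligible_hosts(48, members) before and after
move_host("node06", "@sevenday", "@oneday", members) shows node06
dropping out of the set available to a 48-hour job, with no change to
the limits themselves.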

Various ways of using Hostgroups and Queues are described in this Blueprint:

http://www.sun.com/blueprints/0805/819-3165.html

Hope that helps.

Regards,
	Charu

> This cleans up the processes that are running and tries to keep nodes  
> free for the snort benchmark like programs.
> 
> Before I can tweak with things though I have to find out what other  
> people are doing to solve scheduling problems.
> 
> We would like to keep our system convenient for users while still  
> allowing for lots of flexibility from our power users. (We have  
> parallel programming researchers, bioinformatics users as well as  
> graphics and rendering users.. Finding a happy balance that allows  
> all these environments to live together is harder than you would  
> think =)
> 
> 
> 
> 
> On Jan 13, 2006, at 11:43 AM, Reuti wrote:
> 
> 
>>Hi,
>>
>>Am 13.01.2006 um 18:30 schrieb Brady Catherman:
>>
>>
>>>We have three general purpose clusters for use by researchers here  
>>>at the University of Idaho.
>>>
>>>We are looking to re-configure the queuing priorities in order to  
>>>allow our users to make better use of the cluster. Before we do  
>>>this though we want to get feedback about what other cluster  
>>>admins are doing around the country. If you have a minute, can you  
>>>send me a brief overview of your setup? We run a wide range of  
>>>programs.. some run for months while others run for days.
>>
>>do you have any specific goal for reconfiguring the cluster? Did  
>>you sum up the loads of the machines and get a utilization of only  
>>50% or are unhappy with SGE's behavior in general like seeing  
>>idling nodes and want to prevent this?
>>
>>-- Reuti
>>
>>
>>>Thank you for your time =)
>>>
>>>---------------------------------------------------------------------
>>>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>For additional commands, e-mail: users-help at gridengine.sunsource.net

-- 
####################################################################
# Charu V. Chaubal              # Phone: (650) 786-7672 (x87672)   #
# Grid Computing Technologist   # Fax:   (650) 786-4591            #
# Sun Microsystems, Inc.        # Email: charu.chaubal at sun.com     #
####################################################################
