[GE users] Resource allocation question

Robert Olson olson at mcs.anl.gov
Wed Dec 12 17:36:22 GMT 2007

I've been poking around the docs for resource management, and would  
like to get advice on the right way to solve a problem (it's not yet  
obvious to me).

We have two applications that make use of a 40-ish node Mac cluster
running Linux, currently managed by SGE. We have recently added two
8-core systems, each of which is the central server for one of the
applications. The applications (genome annotation pipelines) spawn a
fair number of jobs for each user job submitted to them.

The configuration I want would start out with some number of the
jobs staying local to the 8-core systems, then spill over to the
cluster when we run out of CPUs locally. Jobs from one server
should not run on the other server's 8-core system.

One of the projects has overall higher priority than the other, so  
when both servers are loaded, I want to space-share the cluster with  
one project getting more nodes than the other. However, if one is not  
loaded the other should be able to use the whole cluster (we engineer  
the tasks to take a limited time, 30-60 min each, so that when new  
user jobs arrive the cluster can within that time period adapt to the  
shared mode).
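
To make the intended sharing behaviour concrete, here is a small Python
sketch of the node split I have in mind. The project names and the 3:1
weighting are made up for illustration, and this is only arithmetic for
the desired outcome; it is not how the SGE share-tree computes
entitlements, which as far as I understand is driven by tickets and
accumulated usage.

```python
def target_split(total_nodes, shares, loaded):
    """Return the desired node count per project.

    shares: dict of project name -> relative weight (hypothetical values)
    loaded: set of project names that currently have work to run
    """
    # Only projects with work compete for nodes.
    active = {p: w for p, w in shares.items() if p in loaded}
    split = {p: 0 for p in shares}
    if not active:
        return split

    total_weight = sum(active.values())
    remaining = total_nodes
    # Give each active project the floor of its proportional share.
    for project, weight in active.items():
        split[project] = total_nodes * weight // total_weight
        remaining -= split[project]

    # Hand leftover nodes to the heaviest projects first.
    for project in sorted(active, key=active.get, reverse=True):
        if remaining == 0:
            break
        split[project] += 1
        remaining -= 1
    return split


if __name__ == "__main__":
    shares = {"annot_a": 3, "annot_b": 1}  # assumed weights
    print(target_split(40, shares, {"annot_a", "annot_b"}))
    print(target_split(40, shares, {"annot_b"}))
```

With both projects loaded the 40 nodes are divided according to the
weights; with only one loaded, that project is entitled to all of them.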

We also need to allow developer access to the cluster; this access  
can likely be to nodes that are already running jobs from the servers.

We currently have three priority levels set up to achieve some of the  
sharing, but this results in one of the applications getting  
completely starved when the other has work to do.

I think the solution to this will include the following pieces (I've  
been drawing on the "Scheduler policies for Job Prioritization in the  
Sun N1 Grid Engine 6 System" document for the basic ideas):

Each application is set up as an SGE project.

Resource quotas to keep jobs from one project from running on the  
other project's 8-way server.

Priorities set for a project to prefer its server to the cluster nodes.

Share-tree set up to implement the fair share of the cluster under load.
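
For the resource quota piece, a minimal sketch of what I imagine the
rule set looking like, written as a Python helper that prints the text.
The project names (annot_a, annot_b), host names (server_a, server_b)
and the rule set name are placeholders, and the rule layout follows my
reading of the sge_resource_quota(5) format, so treat it as an
assumption to check rather than a tested configuration.

```python
def render_rqs(name, description, blocked):
    """Render a resource quota set as text.

    blocked: list of (project, host) pairs; each pair gets a rule that
    allows that project zero slots on that host. All names passed in
    are placeholders chosen by the caller.
    """
    lines = [
        "{",
        f"   name         {name}",
        f'   description  "{description}"',
        "   enabled      TRUE",
    ]
    for project, host in blocked:
        lines.append(
            f"   limit        projects {project} hosts {host} to slots=0"
        )
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_rqs(
        "server_isolation",
        "Keep each project off the other project's 8-core server",
        [("annot_b", "server_a"), ("annot_a", "server_b")],
    ))
```

The printed text could be saved to a file and loaded with something
like qconf -Arqs, assuming an SGE version that has resource quotas.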

I suspect that I might be able to remove the 3-level priority
from being a core part of the scheduling for the applications (making
the share-tree the highest-precedence scheduler parameter).

Does this seem like a reasonable approach? Is there anything big I'm missing here?

One issue that I don't yet know how to solve is the following: The  
jobs in the application are pipelines with a number of stages. The
pipelines are driven by a periodically run pipeline manager that  
updates the status of each job, submitting new tasks into the cluster  
as previous tasks finish. A characteristic of the current setup is  
that if a lot of jobs are in flight, a later stage of a particular  
user job may end up in the scheduler queue behind early stages of  
newer jobs. In some sense I'd like the next stage of a job to get in  
line ahead of earlier stages of newer jobs. Hm - maybe that is part  
of the solution - assign an explicit priority to each stage of the  
pipeline, and factor that priority into the scheduling within a  
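
A rough Python sketch of the per-stage priority idea, assuming the
pipeline manager submits each task with qsub and passes the priority
through the -p option. The function names, the project name and the
script name are hypothetical. The range is capped at 0 because, as far
as I know, ordinary users can only lower a job's priority; how much
this value counts against the share-tree depends on the scheduler
configuration.

```python
def stage_priority(stage_index, num_stages, low=-1023, high=0):
    """Map a pipeline stage to a qsub -p value; later stages rank higher.

    stage_index is zero-based. The first stage gets `low`, the last
    stage gets `high`, and stages in between are spread evenly.
    """
    if not 0 <= stage_index < num_stages:
        raise ValueError("stage_index out of range")
    if num_stages == 1:
        return high
    span = high - low
    return low + (span * stage_index) // (num_stages - 1)


def qsub_args(script, project, stage_index, num_stages):
    """Build the argument list the pipeline manager would execute."""
    priority = stage_priority(stage_index, num_stages)
    return ["qsub", "-P", project, "-p", str(priority), script]


if __name__ == "__main__":
    # Hypothetical four-stage pipeline in a project called annot_a.
    for stage in range(4):
        print(" ".join(qsub_args("run_stage.sh", "annot_a", stage, 4)))
```

The argument list could be handed to subprocess.run by the pipeline
manager when it submits the next task of a user job.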

Thanks for any input.
