[GE users] Scheduling Questions

Chris Dagdigian dag at sonsorol.org
Thu Dec 15 17:28:08 GMT 2005

Hi Ray,

The problem of large jobs and small jobs running on the same cluster
is one that many people have faced.  There *are* solutions and some
best practices out there, but there are a few key concepts you need
to keep in mind:

  - In its default configuration, SGE 6 does first-in-first-out
(FIFO) scheduling. You need to enable other policies to get fairshare
or other behavior.
  - In its default configuration, SGE will not touch, mess with or
interfere with running jobs.
  - Since SGE does not mess with running jobs, scheduling policies
are enforced by manipulating the order of the pending list.
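A quick way to see where you stand is to dump the scheduler
configuration. The parameter names below are from the SGE 6
sched_conf(5) man page; the values shown in comments are what a stock
install typically has, but check your own output:

```shell
# Dump the current scheduler configuration.  In a stock SGE 6 install
# the ticket and urgency weights are effectively off, which is what
# gives you plain FIFO ordering of the pending list.
qconf -ssconf

# Lines worth looking for in the output:
#   weight_tickets_functional  0      # 0 = functional fairshare disabled
#   weight_tickets_share       0      # 0 = share-tree fairshare disabled
#   weight_waiting_time        0.0    # 0 = no anti-starvation boost
```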

The last 2 points are important for you -- SGE only does policy by
dynamically changing the order of the pending list. This means that
your cluster has to have at least a few job slots draining fairly
quickly for you to see any real fairshare or other policy in effect.
Sorting and resorting the pending list will not mean anything to you
if 100% of your cluster nodes are tied up with a single parallel job
that will run for days, weeks or months.

Only you know what your workload characteristics will be like -- if
your "normal" jobs run and exit in a reasonable period of time then
you can easily set up policies that will suit your departmental
needs. If your workflow is capable of consuming all available job
slots for days at a time then you'll need to put some sort of
constraint in place that will at least keep a couple of nodes or job
slots free for other work. People do this by (a) forcing parallel
jobs to run only on a subset of available machines, (b) enforcing
maxujobs limits, or (c) designating access control lists or special
queues that remain clear for shorter jobs or non-parallel work.
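As a rough sketch of those three approaches -- the queue, host group,
and ACL names here (short.q, @shorthosts, dept_users) are made-up
examples, and the maxujobs value is just for illustration:

```shell
# (a) Keep parallel work off a couple of nodes by giving short/serial
#     jobs their own queue on a dedicated host group:
qconf -ahgrp @shorthosts   # opens an editor; list e.g. node5 node6
qconf -aq short.q          # opens an editor; set hostlist to @shorthosts

# (b) Cap the number of simultaneously running jobs per user
#     cluster-wide via the scheduler configuration:
qconf -msconf              # set:  maxujobs  4

# (c) Control who may use a queue with an access list:
qconf -au prof_smith dept_users   # add a user to ACL "dept_users";
                                  # then reference it in the queue's
                                  # user_lists attribute
```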

The two policies that will probably interest you:

  - fairshare by user
  - urgency

Fairshare is easy to do within the SGE "Functional Policy" or "Share  
Tree Policy" mechanisms. There are good docs available for this. One  
HOWTO that I wrote about department based fairshare is available  
here: http://bioteam.net/dag/sge6-funct-share-dept.html
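The gist of user-based functional fairshare looks something like the
following sketch. The parameter names are from the SGE 6 scheduler
and global configurations; the numeric values are illustrative
assumptions, not recommendations -- see the docs and the HOWTO above:

```shell
# Turn on the functional ticket policy in the scheduler config:
qconf -msconf
#   weight_tickets_functional  10000   # > 0 enables functional tickets

# Have SGE auto-create a user object, with an equal functional share,
# the first time each person submits a job:
qconf -mconf
#   enforce_user      auto
#   auto_user_fshare  100
```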

The "urgency" policy was put into SGE 6 specifically to address the
problem of large parallel jobs taking all the job slots away from
smaller serial jobs. In short, the urgency policy can prevent "job
starvation" by increasing the priority of a job based upon how long
it has spent in the pending list. The longer the job sits in the
pending queue, the higher its priority. Eventually the priority
crosses a threshold where it vaults to the top of the pending list
and gets dispatched into the next available job slots.
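A minimal sketch of turning that on -- the weight values below are
assumptions you would tune for your site, not recommended settings:

```shell
# Give pending time some weight in the urgency calculation:
qconf -msconf
#   weight_waiting_time  0.01   # > 0: each second spent pending adds
#                               # to the job's urgency value
#   weight_urgency       0.1    # how much urgency counts toward the
#                               # final priority used to sort pending jobs

# Watch the effect -- pending jobs' urgency rises the longer they wait:
qstat -urg
```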

As a side note, some of my "day job" work involves configuring Grid  
Engine policies for use with bioinformatics workflows in a  
departmental setting, including many SGE-enabled web applications. If  
you have questions about what works and what does not work for SGE  
and informatics workflows that may not be on-topic for this list feel  
free to contact me directly.


On Dec 15, 2005, at 12:02 PM, Raymond Chan wrote:

> Hi all,
> I know this has been asked before in some way or another, and I'm  
> sorry if
> this is a repeat.  I'm not sure where to begin because as I look  
> through the
> list there are things that may or may not apply, so I'll be more  
> direct and
> just ask.  Thanks to anyone in advance who can help, and again I  
> apologize
> if this was answered recently:
> I have a total of 6 dual cpu nodes that I am submitting parallel  
> mpich & pvm
> jobs (qsub -pe 6) as well as regular non-parallel jobs to.   As you  
> know,
> some jobs take longer than others to complete, so if job #1 is a  
> job that
> will take 5 hours to complete, and job #2 will take only 5 minutes,  
> is there
> a way to automatically in SGE make certain jobs jump in queue over  
> jobs that
> are running a long time so these smaller jobs can finish (sort of  
> in a round
> robin sort of way where each job maybe gets a certain amount of  
> time before
> switching)?  More clearly, a large job #1 holds up the queue and  
> everyone
> behind it is stuck, so what's the best approach at solving this  
> sort of
> scheduling problem?
> I'm trying to run a department website at a university where  
> professors will
> be able to submit SGE jobs via the web for bioinformatics apps that  
> can take
> a long time.  It'd be nice if I had a good way of allowing each  
> professor to
> get a fair share of the cluster instead of one guy holding up the  
> whole
> thing.  Would it also be possible to allow SGE to run two jobs at  
> once on
> the queue rather than one (or would this not be a good or possible
> approach)?
> Thanks again to anyone who can offer advice w/ these newbie  
> questions.  I've
> currently just been able to do jobs one at a time nicely on my web  
> system
> with everyone waiting behind that one job to finish, but I need a nice
> scheduling method, and I have no idea how to configure this in SGE.
> -Ray

To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net
