[GE users] Allowing parallel jobs to run on busy cluster
dag at sonsorol.org
Wed Dec 3 22:34:38 GMT 2008
"Parallel job starvation" used to be a problem on older SGE systems:
faster non-parallel jobs would zip in and out of job slots during each
scheduling interval, never leaving enough slots free at the same time
for the larger parallel job.
I think this has been mostly solved in recent versions of SGE via
reservation-based scheduling. People submit the parallel job with a
reservation request (-R y) and SGE will gradually hold back job slots
as they free up, until there are enough for the job to start.
Hopefully someone will correct me ASAP if I got the above bit wrong.
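For what it's worth, a submission along those lines might look like the
sketch below. The PE name "smp", the script name, and the 2-hour runtime
limit are placeholders I made up; substitute whatever your site uses. As
far as I know, reservations are only honored if max_reservation in the
scheduler configuration (qconf -msconf) is greater than 0, and they work
best when jobs declare a runtime limit such as h_rt, so treat this as a
starting point rather than a recipe.

```shell
#!/bin/sh
# Sketch: submit an 8-slot parallel job with a reservation request.
# All values below are illustrative assumptions, not site defaults.
PE_NAME="smp"                      # hypothetical PE using $pe_slots allocation
SLOTS=8                            # slots needed on a single SMP host
RUNTIME="02:00:00"                 # assumed hard runtime limit (h_rt)
JOB_SCRIPT="./my_parallel_job.sh"  # hypothetical job script

# -R y     request a resource reservation for this job
# -pe      parallel environment name and slot count
# -l h_rt  runtime limit, which lets the scheduler plan around the job
CMD="qsub -R y -pe $PE_NAME $SLOTS -l h_rt=$RUNTIME $JOB_SCRIPT"

if command -v qsub >/dev/null 2>&1; then
    $CMD
else
    # No SGE client on this machine: just show what would be submitted.
    echo "dry run: $CMD"
fi
```

The serial jobs need no changes; only the job that is being starved has
to ask for the reservation.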
Back in the day I recall dealing with this issue via the
weight_waiting_time urgency sub-policy but I don't think that hack is
On Dec 3, 2008, at 5:23 PM, Sean Davis wrote:
> We have a small cluster that consists of several SMP machines. We
> have been running into the situation that many serial jobs have been
> submitted (and are in the queue) and a user wants to run an 8-process
> parallel process on a single SMP machine. Is it possible without
> setting arbitrary resource limits on particular users (those
> submitting the serial jobs) to give priority to the parallel job?
> Otherwise, what happens is that the serial jobs always take precedence
> until there are enough processors open on a single machine to run the
> parallel job (the PE is set up as $pe_slots).