[GE users] scheduling strategy

Andreas.Haas at Sun.COM
Tue Nov 27 16:07:49 GMT 2007

Hi Jan,

On Tue, 27 Nov 2007, Jan Sundermeyer wrote:

> Hello,
> we have installed sge 6.1 with a standard strategy, which means:
> There is a high/normal/low-priority queue.
> Lower priority queues are suspended when the higher priorities load the
> machines completely. It does not work perfectly as the use of
> subordinates only suspends complete queues but it should be okay.
> My problem now is that I would actually like to have a different set-up
> which does not need priority queues.
> The target would be fair utilization of the machines.
> For example: 2 users start a number of jobs and
> every user should get the same number of concurrently running jobs.
> If they start their jobs at the same time, fine.
> But if one starts before the other, the queues are full and the next
> user has to wait until the jobs are finished.

With the functional and/or share tree ticket policy you can implement this
(without separate queues), but that doesn't help you to achieve preemption.
That means users must still wait until resources become available again.
That seems nasty, but resource quotas could be used to mitigate the problem.
E.g. if you have 200 slots in your SGE cluster, you could configure a 
resource quota limit like

   limit users {*} to slots=150

to prevent any single user from grabbing more than 75% of the slots at a time.
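
For reference, here is a minimal sketch of how that rule might sit inside a
complete resource quota set. The set name max_slots_per_user and the
description are made up for illustration, and the 150 assumes the 200-slot
cluster from the example above.

   {
      name         max_slots_per_user
      description  "cap each user at 150 of 200 slots"
      enabled      TRUE
      limit        users {*} to slots=150
   }

Such a set can be added with qconf -arqs, and current usage against it can
be checked with qquota. Note that the braces in {*} make the limit apply to
each user individually; written as users * the same rule would cap all
users combined.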

> I would prefer if sge makes room for the new jobs by suspending jobs
> from the first user, so that a fair share is reached as soon as possible.
> This way any user could start as many jobs as he likes and limitations
> come up only if other users need resources as well.
> One way of doing this might be the use of checkpointing after time periods,
> which with our simulator (spectre) practically leads to rescheduling of the
> jobs. However, it takes some time for the simulator to recover to the last
> state. Therefore it would be preferable to do checkpointing only when it
> is necessary.

Getting checkpointing done only when needed is just one part of
the solution. Yet the above reads as if you intend to enforce fair share.
How will you be doing this? Are you thinking of a co-scheduler that triggers
job preemption?

