[GE users] scheduling strategy

Jan Sundermeyer jan.sundermeyer at iis.fraunhofer.de
Wed Nov 28 09:12:29 GMT 2007


Andreas.Haas at Sun.COM wrote:
> Hi Jan,
> 
> On Tue, 27 Nov 2007, Jan Sundermeyer wrote:
> 
>> Hello,
>>
>> we have installed SGE 6.1 with a standard strategy, which means:
>>
>> There are high-, normal-, and low-priority queues.
>> Lower-priority queues are suspended when the higher-priority ones load
>> the machines completely. This does not work perfectly, because
>> subordination only suspends complete queues, but it should be okay.
>>
>> My problem now is that I would actually like a different setup
>> that does not need priority queues.
>> The goal would be fair utilization of the machines.
>>
>> For example: two users each start a number of jobs, and
>> each user should get the same number of concurrently running jobs.
>>
>> If they start their jobs at the same time, fine.
>> But if one starts before the other, the queues are full and the second
>> user has to wait until the first user's jobs have finished.
> 
> With the functional and/or share tree ticket policy you can implement this
> (without separate queues), but that doesn't help you achieve preemption.
> That means users must wait until resources become available again. That
> seems nasty, but resource quotas could be used to mitigate the problem.
> E.g. if you have 200 slots in your SGE cluster, you could configure a
> resource quota limit like
> 
>   limit users {*} to slots=150
> 
> to prevent any single user from grabbing more than 75% at a time.
> 
>>
>> I would prefer that SGE make room for the new jobs by suspending jobs
>> from the first user, so that a fair share is reached as soon as possible.
>> This way any user could start as many jobs as he likes, and limits
>> apply only when other users need resources as well.
>>
>> One way of doing this might be to checkpoint at fixed intervals,
>> which with our simulator (spectre) practically leads to a reschedule of
>> the jobs. However, it takes some time for the simulator to recover to the
>> last state. Therefore it would be preferable to checkpoint only when
>> necessary.
> 
> Getting checkpointing to happen only when needed is just one part of
> the solution. Yet the above reads as if you intend to enforce fair
> share. How will you do this? Are you thinking of a co-scheduler
> that triggers job preemption?
> 
Hi Andreas,

I was thinking of a co-scheduler.
How is this best implemented?
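
To make the question concrete, here is a minimal sketch of the decision
step such a co-scheduler would need. It is an illustration under stated
assumptions, not an SGE feature: the function names fair_targets and
plan_preemption are hypothetical, every job is assumed to occupy exactly
one slot, and the lists of running and pending jobs are assumed to have
been collected elsewhere (for example by parsing qstat output). The
sketch only decides which jobs to preempt; how they are preempted
(suspend, or checkpoint and reschedule) is left open.

```python
def fair_targets(total_slots, demand):
    """Split total_slots among users so that nobody gets more than they
    ask for and leftover slots are handed out as evenly as possible.

    demand maps user -> number of jobs the user wants to run
    (running plus pending). Returns user -> target slot count.
    """
    targets = {user: 0 for user in demand}
    remaining = total_slots
    active = {user for user, wanted in demand.items() if wanted > 0}
    while remaining > 0 and active:
        # Equal share of what is left, at least one slot per round.
        share = max(remaining // len(active), 1)
        for user in sorted(active):
            if remaining == 0:
                break
            grant = min(share, demand[user] - targets[user], remaining)
            targets[user] += grant
            remaining -= grant
        # Users whose demand is satisfied drop out of later rounds.
        active = {user for user in active if targets[user] < demand[user]}
    return targets


def plan_preemption(total_slots, running, pending):
    """Return the job ids that would have to be preempted to reach the
    fair targets.

    running maps user -> list of running job ids, oldest first.
    pending maps user -> number of waiting jobs.
    """
    users = set(running) | set(pending)
    demand = {
        user: len(running.get(user, [])) + pending.get(user, 0)
        for user in users
    }
    targets = fair_targets(total_slots, demand)
    to_preempt = []
    for user in sorted(users):
        jobs = running.get(user, [])
        excess = len(jobs) - targets[user]
        if excess > 0:
            # Pick the most recently started jobs (assumption: they have
            # the least work to lose when restarted from a checkpoint).
            to_preempt.extend(jobs[-excess:])
    return to_preempt


if __name__ == "__main__":
    # alice fills all 4 slots, then bob submits 3 jobs.
    running = {"alice": [101, 102, 103, 104]}
    pending = {"bob": 3}
    print(plan_preemption(4, running, pending))  # prints [103, 104]
```

In this example each user ends up with a target of two slots, so two of
alice's jobs are selected. A real co-scheduler would run this
periodically and would also have to resume the preempted jobs once the
other user's demand drops.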

Regards,
  Jan



> Regards,
> Andreas
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net


-- 

******************************************************************************

Jan Sundermeyer

Fraunhofer-Institut für Integrierte Schaltungen
Projektgruppe Optische Kommunikationstechnik
Am Wolfsmantel 33
D-91058 Erlangen
Germany

phone (49) 9131/776-9214     Fax.: -499     e-mail:
Jan.Sundermeyer at iis.fraunhofer.de

Signing certificate: http://pki.fraunhofer.de

http://www.iis.fhg.de/asic

******************************************************************************
