[GE users] Best docs to learn about subordinate queues?

reuti reuti at staff.uni-marburg.de
Tue Dec 22 00:21:13 GMT 2009


Hi,

On 21.12.2009 at 01:56, bdbaddog wrote:

> Thanks for the quick response.
> The manual and howto do an adequate job of describing the "how", but
> not the "why".
> I'm trying to understand when it makes sense to configure a
> subordinate queue.
>
> So it sounds like you would configure a subordinate queue, when you
> want the superordinate queue to be able to displace the jobs in the
> subordinate queue?

No. They can only be suspended, and they will still hold their resources. As
I wrote: to really kick them out of the system, you have to set up a
checkpointing interface with the "when" condition set to migrate them
on suspend (this means adding some delay to your benchmark after the job
has started, to be sure they are really gone).

SGE won't take care of already running jobs once they have been allowed to
run; this must be implemented by the user. As discussed in another thread
on the list: used resources appear as used to the scheduler, hence you
can't simply request all the memory in a machine or the like when it's
already in use by the low priority jobs. SGE has no look-ahead for which
resources would become free if a particular job were suspended or
rescheduled.
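
To picture the missing look-ahead, here is a toy model in Python. It is not SGE code: the names `Job`, `Host`, `free_mem_gb` and `can_start` are made up for this sketch, and the only thing it models is that a suspended job keeps its memory booked.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    mem_gb: int
    suspended: bool = False


@dataclass
class Host:
    total_mem_gb: int
    jobs: list = field(default_factory=list)

    def free_mem_gb(self) -> int:
        # Suspended jobs are counted too: suspending stops the processes
        # but does not give their memory back.
        return self.total_mem_gb - sum(j.mem_gb for j in self.jobs)

    def can_start(self, request_gb: int) -> bool:
        # No look-ahead: only what is free right now is considered.
        return request_gb <= self.free_mem_gb()


# Hypothetical 32 GB host with two low priority jobs of 8 GB each.
host = Host(total_mem_gb=32, jobs=[Job("low1", 8), Job("low2", 8)])
for job in host.jobs:
    job.suspended = True  # roughly what a subordinate queue would do

print(host.free_mem_gb())  # 16
print(host.can_start(32))  # False
print(host.can_start(16))  # True
```

Even with both low priority jobs suspended, a request for the whole 32 GB is refused in this model, which is the situation described above.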

Another way to get a free node for benchmarking: set up an EXCL
complex and attach it to all exechosts (this means only one job can run on
a system). With this in place you can submit an Advance Reservation for a
date in the future requesting this complex. Once it's granted, you
can submit a job into this Advance Reservation. This means of course
that the node might not be available right now, but only at some point in
the future. And all jobs should request a proper h_rt, as otherwise
the default "infinity" will be used as the job duration and the Advance
Reservation can't be scheduled at all.
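
The h_rt remark can be pictured with a small toy model in Python. Again this is not SGE code: `ar_can_be_granted` and its arguments are hypothetical names, times are plain hours from now, and only a single host is considered. A job without h_rt is treated as running forever, so no start time for the reservation is ever safe on that host.

```python
import math


def ar_can_be_granted(ar_start_h, running_job_limits_h):
    """Toy check for an exclusive reservation on a single host.

    running_job_limits_h holds the remaining h_rt in hours of each job
    already running on the host; math.inf stands for a job submitted
    without h_rt (the default "infinity").
    """
    # The host must be empty at ar_start_h, so every running job has to
    # be guaranteed to finish by then.
    return all(limit <= ar_start_h for limit in running_job_limits_h)


print(ar_can_be_granted(24, [2, 10]))        # True
print(ar_can_be_granted(24, [2, math.inf]))  # False
print(ar_can_be_granted(24, [2, 48]))        # False
```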

-- Reuti 


> Could I use that for the following?
> Normal jobs go to a subordinate queue.
> Every now and then I have some jobs which are being run for
> benchmarking, and they need to take over the whole machine when they
> run, and are allowed to displace any jobs running on the machine (by
> displace I mean, stop them running and swap them out, to be restarted
> once this job is complete)
>
> In general all the docs are great with the "how", but not so great
> with the "why".
> The best "why" or in-depth documents I've found so far are the
> SGE_GeorgeTown.pdf file and the bioteam.net slides.
>
> Thanks,
> Bill
>
> On Sun, Dec 20, 2009 at 2:40 PM, reuti <reuti at staff.uni-marburg.de>  
> wrote:
>> Hi,
>>
>> On 19.12.2009 at 23:14, bdbaddog wrote:
>>
>>> Can someone point me to a good document describing subordinate queues
>>> and when to use them and any other design issues?
>>
>> there are several Howtos: http://gridengine.sunsource.net/howto/howto.html
>> or the manual http://docs.sun.com/app/docs/doc/820-0698 page 58.
>>
>> It's often used to give high priority jobs immediate access to slots
>> and suspend the normal ones in the meantime - but it will
>> suspend the complete subordinated queue on the host where the high
>> priority job is scheduled. With today's common multi-core
>> CPUs it's not advantageous to suspend too many low priority jobs, but
>> a slotwise suspend should be in one of the next SGE releases.
>>
>> As a suspend will still not release any resources used by the low
>> priority jobs, you have to plan accordingly and check whether it's
>> suitable for your applications.
>>
>> A more sophisticated setup also includes checkpointing the jobs in
>> the low priority queue and migrating them to other machines and/or
>> continuing to execute them at a later point in time, by a proper
>> definition of a checkpointing interface.
>>
>> -- Reuti
>>
>>
>>> Thanks,
>>> Bill
>>>
>>> ------------------------------------------------------
>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=234302
>>>
>>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=234536

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


