[GE users] SGE-6.2u5: Slot reservation for different queues?

reuti reuti at staff.uni-marburg.de
Mon Jul 12 21:38:57 BST 2010


On 12.07.2010 at 22:36, reuti wrote:

> On 12.07.2010 at 08:25, soyez wrote:
>
>> On Thu, 8 Jul 2010, reuti wrote:
>>
>>> On 08.07.2010 at 09:42, soyez wrote:
>>>
>>>> Thanks Reuti for your reply,
>>>>
>>>> yes, max_reservation is set of course, as reservation works fine
>>>> when only parallel jobs are involved.
>>>>
>>>> There are several differences between the serial and parallel queues
>>>> (limits etc.), but the main difference is that the sequence numbers
>>>> run in opposite directions in order to implement some kind of
>>>> "fill up policy" for single cpu jobs, whereas parallel jobs are
>>>> supposed to use different nodes first.  I don't know of any other
>>>> way to achieve this.
>>>
>>> Fine. This is the way to go, to fill the cluster from both sides.
>>>
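
A rough sketch of the "fill from both sides" idea, as a toy model rather than anything SGE actually runs. It assumes the scheduler sorts queue instances by sequence number (queue_sort_method seqno); the host names, slot counts and the pick_host helper are made up for illustration.

```python
# Toy model (not SGE code): two queues over the same hosts with opposite
# sequence numbers, each picking the lowest-numbered host with a free slot.

def pick_host(free_slots, seq_no):
    """Return the host with the smallest sequence number that has a free slot."""
    candidates = [h for h in free_slots if free_slots[h] > 0]
    if not candidates:
        return None
    return min(candidates, key=lambda h: seq_no[h])

hosts = ["node01", "node02", "node03", "node04"]  # hypothetical host names
free_slots = {h: 2 for h in hosts}                 # hypothetical: 2 slots per host

# Opposite ordering: "serial" prefers node01 first, "batch" prefers node04 first.
serial_seq = {h: i for i, h in enumerate(hosts)}
batch_seq = {h: len(hosts) - 1 - i for i, h in enumerate(hosts)}

placements = []
for queue, seq in [("serial", serial_seq), ("batch", batch_seq),
                   ("serial", serial_seq), ("batch", batch_seq),
                   ("serial", serial_seq)]:
    host = pick_host(free_slots, seq)
    free_slots[host] -= 1          # one slot per dispatched job in this toy model
    placements.append((queue, host))

print(placements)
```

In this model the serial jobs pile up on node01 and then node02, while the batch jobs start from node04, so the two queues only meet in the middle once the cluster is nearly full.
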
>>> You mean that slots seem to be reserved for the parallel queue, but
>>> serial jobs from the other queue can always slip in?
>>
>> Yes, correct.
>>
>>> I assume you limited the total number of slots across all queues by an
>>> entry in the exechost definition or an RQS?
>>
>> Yes, exechost definitions.
>>
>> By the way, I forgot to mention that users have to specify a runtime
>> for every job.  But according to my calculations there should have
>> been no backfilling for those jobs.  Do you know of any scheduling
>> parameter to switch off backfilling completely?  That might be worth
>> trying.
>
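
The "no backfilling" calculation above presumably boils down to a check like the following. This is only a sketch of the reasoning, not the scheduler's actual algorithm; the function name may_backfill and all numbers are hypothetical.

```python
# Toy check (not the SGE scheduler's actual algorithm): a job may borrow
# reserved slots only if it is guaranteed to finish before the reservation
# is due to start.

def may_backfill(now, runtime, reservation_start):
    """Return True if a job started at `now` ends no later than `reservation_start`."""
    return now + runtime <= reservation_start

# Hypothetical numbers, all in seconds.
reservation_start = 3600  # the reserved parallel job is expected to start in 1 h

print(may_backfill(0, 1800, reservation_start))  # job with a 30 min runtime
print(may_backfill(0, 7200, reservation_start))  # job with a 2 h runtime
```

With these numbers only the 30 minute job passes the check; a serial job whose requested runtime reaches past the reservation start should not be able to slip in.
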
> Was it mentioned already: which version of SGE are you running?

Okay, okay, it's obviously late...

>
> For me it appears to work; in 6.2u5 the serial job gets:
>
> ...cannot run at host...because it offers only hc:slots=0.000000 due
> to a reservation
>
> Do you use any RQS?
>
> -- Reuti
>
>
>> Erik Soyez.
>>
>>
>>>>> On 07.07.2010 at 18:43, reuti wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On 07.07.2010 at 09:24, soyez wrote:
>>>>>>
>>>>>>> we seem to have a problem with slot reservation only working
>>>>>>> for jobs in the same queue.  We have one queue ("batch") for
>>>>>>> parallel jobs and another one ("serial") for single cpu jobs.
>>>>>>>
>>>>>>> Large parallel jobs (>=32 slots) are submitted with "-R yes"
>>>>>>> and this works fine in normal circumstances when competing
>>>>>>> with small parallel jobs.
>>>>>>>
>>>>>>> Right now the cluster is full with single cpu jobs and all the
>>>>>>> parallel jobs in queue "batch" are starving while being bypassed
>>>>>>> in the queue "serial".
>>>>>>
>>>>>> is there any urgency set up for the serial queue?
>>>>>>
>>>>>>> Is this the intended behaviour or is it just some kind of
>>>>>>> misconfiguration?
>>>>>>
>>>>>> One necessary parameter is:
>>>>>>
>>>>>> $ qconf -sconf
>>>>>
>>>>> Oops: qconf -ssconf
>>>>>
>>>>>> ...
>>>>>> max_reservation 20
>>
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=267456
>>
>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=267613

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


