[GE users] resource reservation not working

Andreas.Haas at Sun.COM
Fri Sep 21 12:30:42 BST 2007


Hi Ross,

On Thu, 20 Sep 2007, Ross Dickson wrote:

> Hi Daniel.
>
> It does slightly resemble 2344, now that you mention it.  We have two queues on
> this cluster:  One for parallel jobs, and a subordinate queue for serial jobs.

That shouldn't have any impact, except if the parallel queue were 
subordinated to the serial one.

>
> Does 2344 imply that reservation won't work properly for any site with multiple
> cluster queues?  That seems a bit unlikely.  Hey, list readers, is anyone out
> there running a site with multiple queues who has "-R y" working to their
> satisfaction?

Unfortunately, narrowing down the impact of #2344 is not easy. When you read 
through it you'll find that I managed to define a case where it noticeably 
affects reservation scheduling, but I couldn't rule out other side effects.

I have just attached a small source patch to #2344 that I know fixes the 
"-R y" issue. Do you compile Grid Engine yourself? If not, I could send 
you a #2344-patched 6.0u9 lx24-x86 scheduler binary so that you can 
verify whether it solves your issue as well.

Regards,
Andreas

>
> - Ross
>
>
> Quoting Daniel Templeton <Dan.Templeton at Sun.COM>:
>
>> Smells a bit like 2344:
>> 
>> http://gridengine.sunsource.net/issues/show_bug.cgi?id=2344
>> 
>> Daniel
>> 
>>>> On Wed, 19 Sep 2007, Ross Dickson wrote:
>>>> 
>>>>> Hello all.
>>>>> 
>>>>> We've got a Red Hat cluster running N1GE 6.0u9.  We've got resource 
>>>>> reservation turned on:
>>>>> 
>>>>> % qconf -ssconf | grep reservation
>>>>> max_reservation                   5
>>>>> 
>>>>> ...and four jobs in the waiting list with "-R y".  Here's one:
>>>>> 
>>>>> % qstat -j 3568 | grep reserv
>>>>> reserve:                    y
>>>>> 
>>>>> But since it went in on Sept 14, other jobs (of lower priority!) have 
>>>>> been submitted and scheduled. Here are some highlights from qstat:
>>>>> 
>>>>> job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
>>>>> -----------------------------------------------------------------------------------------------------------------
>>>>> ....
>>>>>  3566 0.52079 rs1.90_cmc itamblyn     r     09/18/2007 11:04:59 all.q@cl026.smu.acenet.ca        4
>>>>>  3668 0.52079 L099A      mcoates      r     09/18/2007 12:27:44 all.q@cl027.smu.acenet.ca        4
>>>>>  3563 0.52079 rs1.90_cmc itamblyn     r     09/13/2007 15:52:23 all.q@cl028.smu.acenet.ca        4
>>>>>  3667 0.52079 L022       mcoates      r     09/18/2007 12:27:44 all.q@cl029.smu.acenet.ca        4
>>>>> ....
>>>>>  3568 0.60500 Metis      kghazino     qw    09/14/2007 13:55:52                                  20
>>>>> ....
>>>>> 
>>>>> Note the start times on 3566, 3667, 3668.  When I set "params MONITOR=1" 
>>>>> in qconf -msconf, I can see that 3568 is reserving cpus:
>>>>> 
>>>>> % tail -3 /opt/n1ge6u9/default/common/schedule
>>>>> 3568:1:RESERVING:1190217135:660:Q:all.q@cl021.smu.acenet.ca:slots:1.000000
>>>>> 3568:1:RESERVING:1190217135:660:Q:all.q@cl034.smu.acenet.ca:slots:1.000000
>>>>> 3568:1:RESERVING:1190217135:660:Q:all.q@cl020.smu.acenet.ca:slots:1.000000
>>>>> 
>>>>> This looks suspiciously like a case mentioned on this mailing list in 
>>>>> Dec 2006 by Jean-Paul Minet, but no answer to his final query appears in 
>>>>> the archives.  Why are the smaller jobs getting in front of the 
>>>>> reserving job? What am I missing?
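
For readers who want to inspect the RESERVING records quoted above, here is a 
minimal Python sketch that splits one line of the schedule file into typed 
fields. The field names are assumptions based on a reading of the monitoring 
format in sched_conf(5); they are not taken from this thread. Queue instance 
names are written with "@", which the list archive renders as " at ".

```python
from datetime import datetime, timezone
from typing import NamedTuple


class ScheduleRecord(NamedTuple):
    # Field names are illustrative assumptions, not an official API.
    job_id: int
    task_id: int
    state: str           # e.g. RESERVING
    start: datetime      # start time, decoded from epoch seconds as UTC
    duration_s: int      # duration in seconds
    level: str           # single letter, "Q" in the quoted lines
    object_name: str     # e.g. a queue instance such as all.q@host
    resource: str        # e.g. slots
    utilization: float


def parse_schedule_line(line: str) -> ScheduleRecord:
    """Split one nine-field, colon-separated monitoring record."""
    fields = line.strip().split(":")
    if len(fields) != 9:
        raise ValueError(f"expected 9 colon-separated fields, got {len(fields)}")
    job, task, state, start, duration, level, obj, resource, util = fields
    return ScheduleRecord(
        job_id=int(job),
        task_id=int(task),
        state=state,
        start=datetime.fromtimestamp(int(start), tz=timezone.utc),
        duration_s=int(duration),
        level=level,
        object_name=obj,
        resource=resource,
        utilization=float(util),
    )


# One of the lines quoted above, with "@" restored in the queue instance name.
SAMPLE = "3568:1:RESERVING:1190217135:660:Q:all.q@cl021.smu.acenet.ca:slots:1.000000"
record = parse_schedule_line(SAMPLE)
print(record.state, record.start.isoformat(), record.duration_s)
```

Decoded this way, the quoted records show job 3568 reserving one slot per queue 
instance from 2007-09-19 15:52:15 UTC for 660 seconds.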
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>
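
Going back to the qstat excerpt quoted above, the observation that lower-priority 
jobs started after the reserving job was submitted can be checked mechanically. 
The sketch below is only an illustration: the job tuples are copied by hand from 
the excerpt, and the helper name overtaking_jobs is hypothetical.

```python
from datetime import datetime

FMT = "%m/%d/%Y %H:%M:%S"

# (job_id, priority, state, submit-or-start time), copied from the qstat excerpt.
JOBS = [
    (3566, 0.52079, "r", "09/18/2007 11:04:59"),
    (3668, 0.52079, "r", "09/18/2007 12:27:44"),
    (3563, 0.52079, "r", "09/13/2007 15:52:23"),
    (3667, 0.52079, "r", "09/18/2007 12:27:44"),
    (3568, 0.60500, "qw", "09/14/2007 13:55:52"),
]


def overtaking_jobs(jobs, waiting_id):
    """Return IDs of running jobs with lower priority than the waiting job
    that started after the waiting job was submitted."""
    by_id = {job[0]: job for job in jobs}
    _, waiting_prio, _, waiting_time = by_id[waiting_id]
    submitted = datetime.strptime(waiting_time, FMT)
    return [
        job_id
        for job_id, prio, state, when in jobs
        if state == "r"
        and prio < waiting_prio
        and datetime.strptime(when, FMT) > submitted
    ]


print(overtaking_jobs(JOBS, 3568))
```

This flags 3566, 3668 and 3667, the same jobs whose start times the original 
message points out; 3563 is excluded because it started before 3568 was 
submitted. The check only lists candidates and does not decide whether the 
scheduler was wrong to start them.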

http://gridengine.info/

