[GE users] resource reservation not working

Ross Dickson Ross.Dickson at dal.ca
Wed Sep 19 18:34:43 BST 2007



Hi Andreas.

 > qconf -ssconf | egrep "weight_urgency|weight_priority|weight_ticket"
weight_tickets_functional         0
weight_tickets_share              0
weight_ticket                     0.010000
weight_urgency                    0.100000
weight_priority                   1.000000

The only thing in the complex with an urgency value is "slots"...
 > qconf -sc | grep slots
slots               s          INT         <=    YES         YES        1        1000
...everything else has zero.

I can't think of any way that the smaller jobs could have been higher 
priority than #3568, but I'm pretty new at this and many things are 
still obscure to me.  Job priorities, in my experience, are determined 
by the slot count (more slots --> higher priority).  We did nothing like 
"qalter -p" that would either lower 3568's POSIX priority or raise that 
of the other jobs.
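
As a rough cross-check of the "prior" column in the qstat listing, the sketch below combines the three weights shown above in the way I understand sge_priority(5) to describe: prio = weight_priority * npprio + weight_urgency * nurg + weight_ticket * ntckts, with each term normalized to the range 0..1. The assumptions are mine and not confirmed anywhere in this thread: the least urgent job in the system requests a single slot, a POSIX priority of 0 and equal ticket counts both normalize to 0.5, and waiting time and deadline add nothing to urgency. All function and constant names are made up for illustration.

```python
# Weights as reported by "qconf -ssconf" in this message.
WEIGHT_PRIORITY = 1.0
WEIGHT_URGENCY = 0.1
WEIGHT_TICKET = 0.01

# Urgency of the "slots" complex as reported by "qconf -sc".
SLOTS_URGENCY = 1000


def normalize(value, lowest, highest):
    """Map value into 0..1; ASSUMPTION: identical values normalize to 0.5."""
    if highest == lowest:
        return 0.5
    return (value - lowest) / (highest - lowest)


def job_priority(slots, min_slots=1, max_slots=20):
    """Estimate the qstat 'prior' value for a job requesting `slots` slots.

    ASSUMPTIONS (hypothetical, chosen for this sketch):
      - urgency is slots * SLOTS_URGENCY and nothing else
      - the smallest job in the system uses min_slots, the largest max_slots
      - POSIX priority 0 normalizes to 0.5
      - all jobs hold the same number of tickets, so tickets normalize to 0.5
    """
    npprio = 0.5
    ntckts = 0.5
    nurg = normalize(
        slots * SLOTS_URGENCY,
        min_slots * SLOTS_URGENCY,
        max_slots * SLOTS_URGENCY,
    )
    return (
        WEIGHT_PRIORITY * npprio
        + WEIGHT_URGENCY * nurg
        + WEIGHT_TICKET * ntckts
    )


if __name__ == "__main__":
    for slots in (4, 20):
        print(f"{slots:2d} slots -> {job_priority(slots):.5f}")
```

Under these assumptions the 20-slot job comes out at 0.60500 and the 4-slot jobs at 0.52079, which agrees with the qstat listing and is consistent with "more slots --> higher priority".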

There are 4 (and only 4) effectively identical jobs queued up with "-R 
y".  Only one is showing reservations in the "schedule" file, but that 
doesn't trouble me.  If I could get one of them going it would at least 
demonstrate that reservation works.

Cheers,
Ross


Andreas.Haas at Sun.COM wrote:
> Hi Ross,
>
> are you sure 3568 also had the higher priority at the time when these 
> smaller jobs were assigned? Could it be that 3568 got no reservation in 
> the meantime due to the small max_reservation of 5? How do you ensure 
> that jobs like 3568 get high priority? I would expect you are using an 
> urgency contribution of 1000 for the 'slots' resource. Is there any 
> other resource with a significant urgency contribution?
>
> What weights are you using for priorities:
>
>  # qconf -ssconf | egrep "weight_urgency|weight_priority|weight_ticket"
>
> Regards,
> Andreas
>
>
> On Wed, 19 Sep 2007, Ross Dickson wrote:
>
>> Hello all.
>>
>> We've got a Red Hat cluster running N1GE 6.0u9.  We've got resource 
>> reservation turned on:
>>
>> % qconf -ssconf | grep reservation
>> max_reservation                   5
>>
>> ...and four jobs in the waiting list with "-R y".  Here's one:
>>
>> % qstat -j 3568 | grep reserv
>> reserve:                    y
>>
>> But since it went in on Sept 14, other jobs (of lower priority!) have 
>> been submitted and scheduled. Here are some highlights from qstat:
>>
>> job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
>> -----------------------------------------------------------------------------------------------------------------
>> ....
>>  3566 0.52079 rs1.90_cmc itamblyn     r     09/18/2007 11:04:59 all.q@cl026.smu.acenet.ca          4
>>  3668 0.52079 L099A      mcoates      r     09/18/2007 12:27:44 all.q@cl027.smu.acenet.ca          4
>>  3563 0.52079 rs1.90_cmc itamblyn     r     09/13/2007 15:52:23 all.q@cl028.smu.acenet.ca          4
>>  3667 0.52079 L022       mcoates      r     09/18/2007 12:27:44 all.q@cl029.smu.acenet.ca          4
>> ....
>>  3568 0.60500 Metis      kghazino     qw    09/14/2007 13:55:52                                   20
>> ....
>>
>> Note the start times on 3566, 3667, 3668.  When I set "params 
>> MONITOR=1" in qconf -msconf, I can see that 3568 is reserving cpus:
>>
>> % tail -3 /opt/n1ge6u9/default/common/schedule
>> 3568:1:RESERVING:1190217135:660:Q:all.q@cl021.smu.acenet.ca:slots:1.000000
>> 3568:1:RESERVING:1190217135:660:Q:all.q@cl034.smu.acenet.ca:slots:1.000000
>> 3568:1:RESERVING:1190217135:660:Q:all.q@cl020.smu.acenet.ca:slots:1.000000
>>
>> This looks suspiciously like a case mentioned on this mailing list in 
>> Dec 2006 by Jean-Paul Minet, but no answer to his final query appears 
>> in the archives.  Why are the smaller jobs getting in front of the 
>> reserving job? What am I missing?
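
For anyone trying to read the "schedule" file entries quoted above, here is a minimal parsing sketch. It assumes the colon-separated layout that sge_sched_conf(5) gives for MONITOR output (jobid, taskid, state, start_time, duration, level_char, object_name, resource_name, utilization) and that start_time is seconds since the Unix epoch. The class and function names are hypothetical, and the sample line is modeled on the quoted output.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ScheduleRecord:
    job_id: int
    task_id: int
    state: str            # e.g. RESERVING
    start_time: datetime  # start of the utilization, in UTC
    duration: int         # seconds
    level_char: str       # e.g. Q for a queue-level entry
    object_name: str      # e.g. the queue instance
    resource_name: str    # e.g. slots
    utilization: float


def parse_schedule_line(line):
    """Split one line of the schedule file into a ScheduleRecord.

    ASSUMPTION: exactly nine colon-separated fields, none of which
    contains a colon itself.
    """
    fields = line.strip().split(":")
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}: {line!r}")
    job, task, state, start, duration, level, obj, resource, util = fields
    return ScheduleRecord(
        job_id=int(job),
        task_id=int(task),
        state=state,
        start_time=datetime.fromtimestamp(int(start), tz=timezone.utc),
        duration=int(duration),
        level_char=level,
        object_name=obj,
        resource_name=resource,
        utilization=float(util),
    )


if __name__ == "__main__":
    sample = (
        "3568:1:RESERVING:1190217135:660:Q:"
        "all.q@cl021.smu.acenet.ca:slots:1.000000"
    )
    record = parse_schedule_line(sample)
    print(record.job_id, record.state, record.start_time.isoformat(),
          record.duration, record.object_name)
```

Converting start_time to a readable timestamp makes it easier to compare the planned reservation start with the start times of the smaller jobs in the qstat listing.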
