[GE users] Reservations and calendar problem

reuti reuti at staff.uni-marburg.de
Fri Mar 12 11:06:46 GMT 2010


On 12.03.2010 at 10:58, cru wrote:

> To try and work around this problem, I've just tried creating an  
> advance reservation for the entire cluster as an alternative to  
> setting queues to off with the calendar. I'd hoped that the  
> behaviour of combining AR and reservations would be different to  
> calendar/reservations.
>
> Unfortunately, the behaviour is exactly the same.
>
> If I don't use the calendar and don't have an advance reservation  
> in place, a 64cpu job at the front of the queue gets a reservation  
> created for Sunday 14th March (h_rt for this job is 24 hours). With  
> either the calendar in use or an advance reservation defined to  
> create a service day on Tues 13th April, the 64cpu job gets a  
> reservation for immediately after this.
>
> Is this how it's supposed to work or a bug?

No, it should behave differently. To investigate this further, can you  
switch on the following setting:

$ qconf -msconf
...
report_pjob_tickets               TRUE
...

and check the priority of all the waiting jobs. Do the serial ones  
have a high urgency attached, which pushes them up in the list of  
waiting jobs?

-- Reuti


>
> Can anyone see a different workaround such that I can get large  
> parallel jobs running with small jobs only backfilling AND empty  
> the cluster on predefined service days?
>
> Regards,
> Chris
>
> Dr Chris Rudge - Research Computing Services Manager
>
> IT Services, University of Leicester, LE1 7RH
> Tel:     +44 (0)116 2522223
> email:  chris.rudge at le.ac.uk
>
>> -----Original Message-----
>> From: cru [mailto:cmr9 at leicester.ac.uk]
>> Sent: 10 March 2010 20:31
>> To: users at gridengine.sunsource.net
>> Subject: RE: [GE users] Reservations and calendar problem
>>
>>> On 10.03.2010 at 20:41, cru wrote:
>>>
>>>> Yes, we set a default request of '-l h_rt=00:01:00' forcing all
>>>> users to set the runtime for their jobs.
>>>
>>> Then you could also make it a FORCED resource request in the complex
>>> configuration.
>>
>> I wasn't aware of that option, but I think that's a side issue that's
>> not relevant to the problem I'm trying to solve.
>>
>>>
>>> And the users are not requesting h_rt=999:99:99 or h_rt=INFINITY for
>>> simplicity?
>>
>> No, if they did this their jobs wouldn't run at all because they
>> wouldn't fit before the service period.
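
The constraint described here can be sketched in a few lines of Python. This is only an illustration of the reasoning, not Grid Engine's actual scheduler logic. The function name `fits_before_service`, the example start time, and the assumption that the service day begins at 00:00 on 13 April 2010 are all hypothetical.

```python
from datetime import datetime, timedelta

def fits_before_service(start, h_rt_seconds, service_start):
    """Return True if a job starting at `start` with the requested
    h_rt (in seconds) would finish no later than `service_start`."""
    return start + timedelta(seconds=h_rt_seconds) <= service_start

# Hypothetical values: service day assumed to begin at midnight on 13 April 2010.
service_start = datetime(2010, 4, 13, 0, 0, 0)
now = datetime(2010, 3, 10, 20, 31, 0)

# A 108-hour request (h_rt=388800) versus a request of h_rt=999:99:99.
short_request = 388800
huge_request = 999 * 3600 + 99 * 60 + 99

print(fits_before_service(now, short_request, service_start))
print(fits_before_service(now, huge_request, service_start))
```

Under these assumptions, a job whose requested runtime would overlap the service period can never be placed before it, which is why an oversized h_rt request keeps a job from starting.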
>>
>>
>>>
>>> What does `qstat -j <job_id>` usually say about such jobs' time
>>> requests?
>>
>> Similar jobs currently in the queue report things like:
>>
>> # qstat -j 59508 | grep h_rt
>> hard resource_list:         h_rt=388800
>>
>> and for finished ones I can see from the qacct -j output:
>>
>> qsub_time    Wed Mar  3 16:19:34 2010
>> start_time   Fri Mar  5 02:20:44 2010
>> end_time     Tue Mar  9 14:20:45 2010
>>
>> OK, so that's 108 hours (plus one second), matching h_rt=388800, so
>> you can see these limits are being correctly set and applied.
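
As a quick cross-check of the figures quoted above, the requested h_rt can be converted to hours and compared with the time between start_time and end_time. A minimal Python sketch, assuming both qacct timestamps are in the same time zone with no clock change in between; the variable names are illustrative only.

```python
from datetime import datetime

FMT = "%a %b %d %H:%M:%S %Y"  # layout of the qacct timestamps shown above

h_rt = 388800  # seconds, from the qstat -j output
start = datetime.strptime("Fri Mar  5 02:20:44 2010", FMT)
end = datetime.strptime("Tue Mar  9 14:20:45 2010", FMT)

elapsed = (end - start).total_seconds()

print(h_rt / 3600)      # requested limit in hours
print(elapsed / 3600)   # elapsed wallclock time in hours
print(elapsed - h_rt)   # seconds beyond the requested limit
```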
>>
>> Chris
>>
>>
>> Dr Chris Rudge - Research Computing Services Manager
>>
>> IT Services, University of Leicester, LE1 7RH
>> Tel: 0116 2522223
>>
>> Times Higher Education University of the year 2008/9
>>
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=247898
>>
>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=248144



