[GE users] Reservations and calendar problem

cru cmr9 at leicester.ac.uk
Fri Mar 12 09:58:26 GMT 2010


To try to work around this problem, I've just tried creating an advance reservation (AR) for the entire cluster as an alternative to using the calendar to switch the queues off. I'd hoped that combining an AR with job reservations would behave differently from combining the calendar with job reservations.

Unfortunately, the behaviour is exactly the same.

If I don't use the calendar and don't have an advance reservation in place, a 64-CPU job at the front of the queue gets a reservation created for Sunday 14th March (h_rt for this job is 24 hours). With either the calendar in use or an advance reservation defined to create a service day on Tuesday 13th April, the same 64-CPU job instead gets a reservation that starts immediately after the service day.
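
To make the timing in this scenario concrete, here is a minimal Python sketch that checks whether a job with a 24-hour h_rt, starting on the reservation date, would finish before the service day. The midnight start times are assumptions (only dates are given above), and fits_before is a hypothetical helper for illustration, not part of Grid Engine.

```python
from datetime import datetime, timedelta

def fits_before(start, h_rt, window_start):
    """Return True if a job starting at `start` with runtime limit `h_rt`
    ends no later than `window_start`."""
    return start + h_rt <= window_start

# Assumption: both events begin at midnight; the message gives only dates.
reserved_start = datetime(2010, 3, 14)   # Sunday 14th March
service_start = datetime(2010, 4, 13)    # Tuesday 13th April
h_rt = timedelta(hours=24)               # runtime limit of the 64-CPU job

fits = fits_before(reserved_start, h_rt, service_start)
slack = service_start - (reserved_start + h_rt)

print(fits, slack.days)
```

Under these assumptions the job would finish about four weeks before the service day, which is why a reservation placed after the service day looks unexpected.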

Is this how it's supposed to work, or is it a bug?

Can anyone see a different workaround such that I can get large parallel jobs running with small jobs only backfilling AND empty the cluster on predefined service days?

Regards,
Chris

Dr Chris Rudge - Research Computing Services Manager

IT Services, University of Leicester, LE1 7RH
Tel:     +44 (0)116 2522223
email:  chris.rudge at le.ac.uk

> -----Original Message-----
> From: cru [mailto:cmr9 at leicester.ac.uk]
> Sent: 10 March 2010 20:31
> To: users at gridengine.sunsource.net
> Subject: RE: [GE users] Reservations and calendar problem
> 
> > Am 10.03.2010 um 20:41 schrieb cru:
> >
> > > Yes, we set a default request of '-l h_rt=00:01:00' forcing all
> > > users to set the runtime for their jobs.
> >
> > Then you could also make it a FORCED resource request in the complex
> > configuration.
> 
> I wasn't aware of that option, but I think that's a side issue that's
> not relevant to the problem I'm trying to solve.
> 
> >
> > And the users are not requesting h_rt=999:99:99 or h_rt=INFINITY for
> > simplicity?
> 
> No, if they did this their jobs wouldn't run at all because they
> wouldn't fit before the service period.
> 
> 
> >
> > What does `qstat -j <job_id>` usually say about such a job's time
> > requests?
> 
> Similar jobs currently in the queue report things like:
> 
> # qstat -j 59508 | grep h_rt
> hard resource_list:         h_rt=388800
> 
> and for finished ones I can see from the qacct -j output:
> 
> qsub_time    Wed Mar  3 16:19:34 2010
> start_time   Fri Mar  5 02:20:44 2010
> end_time     Tue Mar  9 14:20:45 2010
> 
> OK, so that's 108 hours (plus one second), matching the h_rt request of
> 388800 seconds, so you can see these limits are being correctly set and
> applied.
> 
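As a quick check of the figures quoted above, the following minimal Python sketch converts the h_rt request to hours and computes the elapsed time between the start_time and end_time values. The variable names are illustrative only; the timestamps and the h_rt value are copied from the qstat and qacct output.

```python
from datetime import datetime

# h_rt as reported by `qstat -j` (seconds)
h_rt_s = 388800

# start_time and end_time as reported by `qacct -j`
start = datetime(2010, 3, 5, 2, 20, 44)   # Fri Mar  5 02:20:44 2010
end = datetime(2010, 3, 9, 14, 20, 45)    # Tue Mar  9 14:20:45 2010

elapsed_s = int((end - start).total_seconds())

print("h_rt    :", h_rt_s / 3600, "hours")
print("elapsed :", elapsed_s / 3600, "hours")
print("overrun :", elapsed_s - h_rt_s, "seconds")
```

By this calculation the job ran for 108 hours and one second, which is the h_rt limit of 388800 seconds plus one second.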
> Chris
> 
> 
> Dr Chris Rudge - Research Computing Services Manager
> 
> IT Services, University of Leicester, LE1 7RH
> Tel: 0116 2522223
> 
> Times Higher Education University of the year 2008/9
> 
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=247898

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=248140

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].