[GE users] Advance reservation strange behavior

Brady Catherman bradyc at uidaho.edu
Tue Jun 27 20:20:52 BST 2006


We don't use time limits either, because some jobs we run take seconds  
and others take months, and there is often no way to tell the  
difference between the two when submitting.

What would be really nice is something that would give me the ability  
to tell GE to only look at x jobs from the top for scheduling. That  
way I could set it to act as though there is only one job in the queue  
at a time; once that job is scheduled, it moves on to the next job.  
This is about the only way to make scheduling efficient in our  
environment. (Yes, I understand that this will slow down the job-  
starting process a lot, but that isn't an issue for us.)

Any way to make this doable?
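
(The closest workaround I can think of in the meantime would be to
script it ourselves: keep a user hold on everything except the top
pending job, and release the holds as jobs start. Only a rough sketch,
with N and the release step left to whatever wrapper or cron job runs
it:

    # keep only the top N pending jobs visible to the scheduler;
    # everything further down gets a user hold (release again later
    # with "qalter -h U <jobid>")
    N=1
    qstat -s p -u '*' | awk 'NR > 2 {print $1}' | tail -n +$((N + 1)) | \
      while read jid; do
        qalter -h u "$jid"
      done

It obviously trades scheduler throughput for the bookkeeping of the
holds, but it emulates a one-job-deep queue.)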


On Jun 27, 2006, at 12:08 PM, Reuti wrote:

> Hi,
>
> On 27.06.2006 at 20:25, Sili (wesley) Huang wrote:
>
>> Hi Andreas,
>>
>> If I recall correctly, h_rt is a hard limit for wall clock time. If
>> I add -l h_rt=0:10:0 to the $SGE_ROOT/default/common/sge_request
>> file, then jobs that run for more than 10 minutes of wall clock time
>> will be killed. Of course, I can ask users to specify h_rt with
>> qsub, but this is not the way I want to go, because there are many
>> long jobs (days to weeks) in our cluster and I do not want to add a
>> layer of complexity for our users.
>>
>> Is there any way to work around this problem without specifying a
>> hard h_rt limit? E.g. is there any way I can configure the
>> reservation so that all slots are reserved (so that no serial job
>> can fill the released slots) but only some of them are used when the
>> job is dispatched?
> You could have two queues: one only for parallel jobs, one only for
> serial jobs. This way you could a) suspend the serial jobs, or b)
> push them into the background by setting the priorities (i.e. nice
> values) to 0 and +19.
>
> Or, if you would like to drain the serial queue, this is working
> again:
>
> http://gridengine.sunsource.net/issues/show_bug.cgi?id=464
>
> -- Reuti
>
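For illustration, a rough sketch of the two-queue idea Reuti describes.
The queue names are made up, and the subordinate_list line is my own
addition rather than something Reuti suggested, so check queue_conf(5)
for the exact semantics on your version:

    # serial jobs niced down relative to parallel jobs
    qconf -mattr queue priority 19 serial.q
    qconf -mattr queue priority 0  parallel.q

    # optionally: suspend serial.q on a host while parallel.q has
    # jobs running there (queue subordination)
    qconf -mattr queue subordinate_list serial.q parallel.q
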
>> Cheers.
>>
>> Best regards,
>> Sili(wesley) Huang
>>
>> Tuesday, June 27, 2006, 5:40:43 AM, you wrote:
>>
>> Andreas> Most probably your suspicion is right. The parallel jobs'
>> Andreas> reservation sure enough becomes valueless if the sequential
>> Andreas> jobs do not finish at the time foreseen by the scheduler and
>> Andreas> default_duration is not enforced by Grid Engine. Have you
>> Andreas> considered putting
>> Andreas>
>> Andreas>     -l h_rt=:10:
>> Andreas>
>> Andreas> into the cluster-wide sge_request(5) file?
>> Andreas>
>> Andreas> Regards,
>> Andreas> Andreas
>> Andreas>
>> Andreas> On Mon, 26 Jun 2006, Sili (wesley) Huang wrote:
>>
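For reference, Andreas's suggestion amounts to one line of default
submit options in the cluster-wide defaults file; the 10-minute value
is just his example and should be overridden by any h_rt a user
requests explicitly:

    # $SGE_ROOT/default/common/sge_request
    # default options added to every qsub (see sge_request(5))
    -l h_rt=:10:
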
>> >> Hi Andreas,
>>
>> >> I tried to observe what was going on with this strange behavior.
>> >> It seems to me that the reservation is tied to the specified run
>> >> length of a job. For example, in this record of monitoring (385889
>> >> is a parallel job with reservation enabled and high priority, and
>> >> 385865 is a serial job):
>>
>> >> [root common]#  cat schedule | egrep "385865|385889|::::::::"
>> >> ::::::::
>> >> 385889:1:RESERVING:1151341235:3660:P:mpich:slots:12.000000
>> >> 385889:1:RESERVING:1151341235:3660:G:global:ncpus_agerber:12.000000
>> >> 385889:1:RESERVING:1151341235:3660:H:v60-n28:singular:2.000000
>> >> 385889:1:RESERVING:1151341235:3660:H:v60-n65:singular:2.000000
>> >> 385889:1:RESERVING:1151341235:3660:H:v60-n75:singular:1.000000
>> >> 385889:1:RESERVING:1151341235:3660:H:v60-n66:singular:2.000000
>> >> 385889:1:RESERVING:1151341235:3660:H:v60-n31:singular:2.000000
>> >> 385889:1:RESERVING:1151341235:3660:H:v60-n62:singular:1.000000
>> >> 385889:1:RESERVING:1151341235:3660:H:v60-n15:singular:2.000000
>> >> 385889:1:RESERVING:1151341235:3660:Q:all.q@v60-n28:slots:2.000000
>> >> 385889:1:RESERVING:1151341235:3660:Q:all.q@v60-n65:slots:2.000000
>> >> 385889:1:RESERVING:1151341235:3660:Q:all.q@v60-n75:slots:1.000000
>> >> 385889:1:RESERVING:1151341235:3660:Q:all.q@v60-n31:slots:2.000000
>> >> 385889:1:RESERVING:1151341235:3660:Q:all.q@v60-n66:slots:2.000000
>> >> 385889:1:RESERVING:1151341235:3660:Q:all.q@v60-n62:slots:1.000000
>> >> 385889:1:RESERVING:1151341235:3660:Q:all.q@v60-n15:slots:2.000000
>> >> ::::::::
>> >> 385889:1:RESERVING:1151341250:3660:P:mpich:slots:12.000000
>> >> 385889:1:RESERVING:1151341250:3660:G:global:ncpus_agerber:12.000000
>> >> 385889:1:RESERVING:1151341250:3660:H:v60-n28:singular:2.000000
>> >> 385889:1:RESERVING:1151341250:3660:H:v60-n65:singular:2.000000
>> >> 385889:1:RESERVING:1151341250:3660:H:v60-n75:singular:1.000000
>> >> 385889:1:RESERVING:1151341250:3660:H:v60-n62:singular:1.000000
>> >> 385889:1:RESERVING:1151341250:3660:H:v60-n73:singular:2.000000
>> >> 385889:1:RESERVING:1151341250:3660:H:v60-n52:singular:2.000000
>> >> 385889:1:RESERVING:1151341250:3660:H:v60-n66:singular:2.000000
>> >> 385889:1:RESERVING:1151341250:3660:Q:all.q@v60-n28:slots:2.000000
>> >> 385889:1:RESERVING:1151341250:3660:Q:all.q@v60-n65:slots:2.000000
>> >> 385889:1:RESERVING:1151341250:3660:Q:all.q@v60-n75:slots:1.000000
>> >> 385889:1:RESERVING:1151341250:3660:Q:all.q@v60-n62:slots:1.000000
>> >> 385889:1:RESERVING:1151341250:3660:Q:all.q@v60-n73:slots:2.000000
>> >> 385889:1:RESERVING:1151341250:3660:Q:all.q@v60-n52:slots:2.000000
>> >> 385889:1:RESERVING:1151341250:3660:Q:all.q@v60-n66:slots:2.000000
>> >> ::::::::
>> >> 385889:1:RESERVING:1151341265:3660:P:mpich:slots:12.000000
>> >> 385889:1:RESERVING:1151341265:3660:G:global:ncpus_agerber:12.000000
>> >> 385889:1:RESERVING:1151341265:3660:H:v60-n28:singular:2.000000
>> >> 385889:1:RESERVING:1151341265:3660:H:v60-n65:singular:2.000000
>> >> 385889:1:RESERVING:1151341265:3660:H:v60-n75:singular:1.000000
>> >> 385889:1:RESERVING:1151341265:3660:H:v60-n62:singular:1.000000
>> >> 385889:1:RESERVING:1151341265:3660:H:v60-n73:singular:2.000000
>> >> 385889:1:RESERVING:1151341265:3660:H:v60-n52:singular:2.000000
>> >> 385889:1:RESERVING:1151341265:3660:H:v60-n66:singular:2.000000
>> >> 385889:1:RESERVING:1151341265:3660:Q:all.q@v60-n28:slots:2.000000
>> >> 385889:1:RESERVING:1151341265:3660:Q:all.q@v60-n65:slots:2.000000
>> >> 385889:1:RESERVING:1151341265:3660:Q:all.q@v60-n75:slots:1.000000
>> >> 385889:1:RESERVING:1151341265:3660:Q:all.q@v60-n62:slots:1.000000
>> >> 385889:1:RESERVING:1151341265:3660:Q:all.q@v60-n73:slots:2.000000
>> >> 385889:1:RESERVING:1151341265:3660:Q:all.q@v60-n52:slots:2.000000
>> >> 385889:1:RESERVING:1151341265:3660:Q:all.q@v60-n66:slots:2.000000
>> >> 385865:1:STARTING:1151341250:3660:H:v60-n47:singular:1.000000
>> >> 385865:1:STARTING:1151341250:3660:Q:all.q@v60-n47:slots:1.000000
>> >> ::::::::
>> >> 385865:1:RUNNING:1151341251:3660:H:v60-n47:singular:1.000000
>> >> 385865:1:RUNNING:1151341251:3660:Q:all.q@v60-n47:slots:1.000000
>> >> ::::::::
>>
>> >> I suspect that SGE behaves this way because:
>>
>> >> It seems that SGE tries to reserve the processor resources that it
>> >> expects to be released soonest. SGE decides which CPUs to reserve
>> >> based on h_rt or s_rt, or on default_duration when neither is
>> >> given. However, in our cluster we do not require users to specify
>> >> h_rt or s_rt, so the default_duration of one hour is used for
>> >> every job. Therefore, if a serial job finishes very quickly, e.g.
>> >> after 10 minutes, SGE has not reserved that CPU for the
>> >> reservation, and another serial job fills the CPU as soon as it is
>> >> released. The same happens when a long job, e.g. 2 days, is
>> >> occupying a CPU: SGE keeps expecting that CPU to be released soon
>> >> and reserves it for the reservation.
>>
>> >> My suspicions may be wrong. It would be great if someone having
>> >> the same problem could check this on their SGE installations. If
>> >> my suspicions are correct, I think this is an odd implementation
>> >> of reservation, since the reservation should not be based only on
>> >> the specified runtime.
>>
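As a quick way to check the two values this hinges on, the scheduler
configuration can be dumped with qconf (the output format varies a bit
by SGE version):

    # show the runtime assumed for jobs without h_rt/s_rt and the
    # reservation limit
    qconf -ssconf | egrep 'default_duration|max_reservation'
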
>> >> Cheers.
>>
>> >> Best regards,
>> >> Sili(wesley) Huang
>>
>> >> Monday, June 26, 2006, 5:41:25 AM, you wrote:
>>
>> >> Andreas> Have you observed reservation behaviour via the
>> >> Andreas> 'schedule' file?
>>
>> >> Andreas> Andreas
>>
>> >> Andreas> On Fri, 23 Jun 2006, Brady Catherman wrote:
>>
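In case anyone wants to reproduce this: the schedule file Andreas
refers to is only written when scheduler monitoring is switched on; if
I remember right, that is the MONITOR flag in the params line of the
scheduler configuration:

    # turn on scheduler monitoring (sched_conf(5) "params"), then watch
    # the dispatch/reservation records
    qconf -msconf                               # set:  params  MONITOR=1
    tail -f $SGE_ROOT/default/common/schedule   # RESERVING/STARTING lines
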
>> >> >> Yes. If there is space they start fine. If they have
>> >> >> reservation enabled and a much higher priority than every other
>> >> >> single-process job, they just sit at the top of the queue as if
>> >> >> the reservation is not doing anything (max_reservation is
>> >> >> currently set to 1000).
>>
>> >> >> On Jun 23, 2006, at 2:07 PM, Reuti wrote:
>>
>> >> >>> On 23.06.2006 at 22:45, Brady Catherman wrote:
>>
>> >> >>>> I have done both of these and yet my clusters still hate
>> >> >>>> parallel jobs. Does anybody have this working? Everything I
>> >> >>>> have seen suggests that parallel jobs are always shunned by
>> >> >>>> Grid Engine. I would appreciate any solutions being passed my
>> >> >>>> way! =) I have been working on this on and off since January.
>>
>> >> >>> But if the cluster is empty, they are starting? - Reuti
>>
>> >> >>>> On Jun 23, 2006, at 11:46 AM, Reuti wrote:
>>
>> >> >>>>> Hi,
>>
>> >> >>>>> you submitted with "-R y" and adjusted the scheduler to
>> >> >>>>> "max_reservation 20" or an appropriate value?
>>
>> >> >>>>> -- Reuti
>>
>> >> >>>>> On 23.06.2006 at 18:31, Sili (wesley) Huang wrote:
>>
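For completeness, the two pieces Reuti is asking about are the submit
flag and the scheduler-wide limit; a minimal sketch (the PE name comes
from the schedule output above, the script name is made up):

    # ask for a reservation for the parallel job at submit time
    qsub -R y -pe mpich 12 job.sh

    # and allow the scheduler to hold reservations at all:
    # set max_reservation (e.g. 20) in the scheduler configuration
    qconf -msconf
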
>> >> >>>>>> Hi Jean-Paul,
>>
>> >> >>>>>> I have a similar problem in our cluster: the low-priority
>> >> >>>>>> serial jobs still get loaded into the run state while the
>> >> >>>>>> high-priority parallel jobs are waiting. Did you figure out
>> >> >>>>>> a solution to this problem? Does the upgrade help?
>>
>> >> >>>>>> Cheers.
>>
>> >> >>>>>> Best regards,
>> >> >>>>>> Sili(wesley) Huang
>>
>> --
>> mailto:shuang at unb.ca
>> Scientific Computing Support
>> Advanced Computational Research Laboratory
>> University of New Brunswick
>> Tel(office):  (506) 452-6348

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



