[GE users] Advance reservation strange behavior

Sili (wesley) Huang shuang at unb.ca
Tue Jun 27 19:25:26 BST 2006


Hi Andreas,


If I recall correctly, h_rt is a hard limit on wall-clock time. If I put -l
h_rt=0:10:0 into the $SGE_ROOT/default/common/sge_request file, then any job that
runs for more than 10 minutes of wall-clock time will be killed. Of course, I could
ask users to specify h_rt themselves when using qsub, but this is not the way I want
to go: there are many long jobs (days to weeks) on our cluster and I do not want to
add a layer of complexity for our users.
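
Just to make the consequence concrete, this is roughly what that setup would look
like (a sketch only; the 168-hour figure and the script name are made up for
illustration). The cluster-wide default would sit in
$SGE_ROOT/default/common/sge_request:

    # default wall-clock limit for every job that does not request h_rt itself
    -l h_rt=0:10:0

and each of our long-running jobs would then have to override it explicitly at
submission time:

    qsub -l h_rt=168:0:0 long_job.sh

which is exactly the extra burden on users that I would like to avoid.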


Is there any way to work around this problem without specifying a hard h_rt limit?
For example, can I configure the reservation so that all slots are reserved for it
(so that no serial job can grab the released slots), but only some of them are
actually used when the job is dispatched?


Cheers. 


Best regards,

Sili(wesley) Huang


Tuesday, June 27, 2006, 5:40:43 AM, you wrote:


Andreas> Most probably your suspicion is right. The parallel job's reservation
Andreas> sure enough becomes valueless if the sequential jobs do not finish
Andreas> at the time foreseen by the scheduler and default_duration is not
Andreas> enforced by Grid Engine. Have you considered putting a

Andreas>     -l h_rt=:10:

Andreas> into the cluster-wide sge_request(5) file?


Andreas> Regards,

Andreas> Andreas


Andreas> On Mon, 26 Jun 2006, Sili (wesley) Huang wrote:



>> Hi Andreas,



>> I tried to observe what was going on with this strange behavior. It seems to me
>> that the reservation is tied to the specified run length of a job. For example, in
>> this monitoring record (385889 is a parallel job with reservation enabled and a
>> high priority, and 385865 is a serial job):



>> [root common]#  cat schedule | egrep "385865|385889|::::::::"
>> ::::::::
>> 385889:1:RESERVING:1151341235:3660:P:mpich:slots:12.000000
>> 385889:1:RESERVING:1151341235:3660:G:global:ncpus_agerber:12.000000
>> 385889:1:RESERVING:1151341235:3660:H:v60-n28:singular:2.000000
>> 385889:1:RESERVING:1151341235:3660:H:v60-n65:singular:2.000000
>> 385889:1:RESERVING:1151341235:3660:H:v60-n75:singular:1.000000
>> 385889:1:RESERVING:1151341235:3660:H:v60-n66:singular:2.000000
>> 385889:1:RESERVING:1151341235:3660:H:v60-n31:singular:2.000000
>> 385889:1:RESERVING:1151341235:3660:H:v60-n62:singular:1.000000
>> 385889:1:RESERVING:1151341235:3660:H:v60-n15:singular:2.000000
>> 385889:1:RESERVING:1151341235:3660:Q:all.q@v60-n28:slots:2.000000
>> 385889:1:RESERVING:1151341235:3660:Q:all.q@v60-n65:slots:2.000000
>> 385889:1:RESERVING:1151341235:3660:Q:all.q@v60-n75:slots:1.000000
>> 385889:1:RESERVING:1151341235:3660:Q:all.q@v60-n31:slots:2.000000
>> 385889:1:RESERVING:1151341235:3660:Q:all.q@v60-n66:slots:2.000000
>> 385889:1:RESERVING:1151341235:3660:Q:all.q@v60-n62:slots:1.000000
>> 385889:1:RESERVING:1151341235:3660:Q:all.q@v60-n15:slots:2.000000
>> ::::::::
>> 385889:1:RESERVING:1151341250:3660:P:mpich:slots:12.000000
>> 385889:1:RESERVING:1151341250:3660:G:global:ncpus_agerber:12.000000
>> 385889:1:RESERVING:1151341250:3660:H:v60-n28:singular:2.000000
>> 385889:1:RESERVING:1151341250:3660:H:v60-n65:singular:2.000000
>> 385889:1:RESERVING:1151341250:3660:H:v60-n75:singular:1.000000
>> 385889:1:RESERVING:1151341250:3660:H:v60-n62:singular:1.000000
>> 385889:1:RESERVING:1151341250:3660:H:v60-n73:singular:2.000000
>> 385889:1:RESERVING:1151341250:3660:H:v60-n52:singular:2.000000
>> 385889:1:RESERVING:1151341250:3660:H:v60-n66:singular:2.000000
>> 385889:1:RESERVING:1151341250:3660:Q:all.q@v60-n28:slots:2.000000
>> 385889:1:RESERVING:1151341250:3660:Q:all.q@v60-n65:slots:2.000000
>> 385889:1:RESERVING:1151341250:3660:Q:all.q@v60-n75:slots:1.000000
>> 385889:1:RESERVING:1151341250:3660:Q:all.q@v60-n62:slots:1.000000
>> 385889:1:RESERVING:1151341250:3660:Q:all.q@v60-n73:slots:2.000000
>> 385889:1:RESERVING:1151341250:3660:Q:all.q@v60-n52:slots:2.000000
>> 385889:1:RESERVING:1151341250:3660:Q:all.q@v60-n66:slots:2.000000
>> ::::::::
>> 385889:1:RESERVING:1151341265:3660:P:mpich:slots:12.000000
>> 385889:1:RESERVING:1151341265:3660:G:global:ncpus_agerber:12.000000
>> 385889:1:RESERVING:1151341265:3660:H:v60-n28:singular:2.000000
>> 385889:1:RESERVING:1151341265:3660:H:v60-n65:singular:2.000000
>> 385889:1:RESERVING:1151341265:3660:H:v60-n75:singular:1.000000
>> 385889:1:RESERVING:1151341265:3660:H:v60-n62:singular:1.000000
>> 385889:1:RESERVING:1151341265:3660:H:v60-n73:singular:2.000000
>> 385889:1:RESERVING:1151341265:3660:H:v60-n52:singular:2.000000
>> 385889:1:RESERVING:1151341265:3660:H:v60-n66:singular:2.000000
>> 385889:1:RESERVING:1151341265:3660:Q:all.q@v60-n28:slots:2.000000
>> 385889:1:RESERVING:1151341265:3660:Q:all.q@v60-n65:slots:2.000000
>> 385889:1:RESERVING:1151341265:3660:Q:all.q@v60-n75:slots:1.000000
>> 385889:1:RESERVING:1151341265:3660:Q:all.q@v60-n62:slots:1.000000
>> 385889:1:RESERVING:1151341265:3660:Q:all.q@v60-n73:slots:2.000000
>> 385889:1:RESERVING:1151341265:3660:Q:all.q@v60-n52:slots:2.000000
>> 385889:1:RESERVING:1151341265:3660:Q:all.q@v60-n66:slots:2.000000
>> 385865:1:STARTING:1151341250:3660:H:v60-n47:singular:1.000000
>> 385865:1:STARTING:1151341250:3660:Q:all.q@v60-n47:slots:1.000000
>> ::::::::
>> 385865:1:RUNNING:1151341251:3660:H:v60-n47:singular:1.000000
>> 385865:1:RUNNING:1151341251:3660:Q:all.q@v60-n47:slots:1.000000
>> ::::::::
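
For anyone reading along, my interpretation of these records (this is just how I
read the output above, not an official format description) is that each line has the
form

    job_id:task_id:state:start_time:duration:level:object:resource:amount

so a line such as

    385889:1:RESERVING:1151341235:3660:Q:all.q@v60-n28:slots:2.000000

would mean that task 1 of job 385889 holds a reservation of 2 slots on the queue
instance all.q@v60-n28, starting at Unix time 1151341235 and assumed to last 3660
seconds (presumably our one-hour default_duration plus a small scheduler offset).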



>> I suspect SGE behaves this way for the following reason:

>> It seems to me that SGE tries to reserve the processor resources that it expects
>> to be released soonest. By default, SGE decides which CPUs to reserve based on
>> h_rt, s_rt, or default_duration. However, in our cluster we do not require users
>> to specify h_rt or s_rt, so the configured default_duration of one hour is used
>> for every job. Therefore, if a serial job finishes quickly, e.g. after 10 minutes,
>> SGE has not reserved that CPU for the parallel job, and other serial jobs fill it
>> the moment it is released. The same applies to the scenario where a long job, e.g.
>> one running for 2 days, occupies a CPU: SGE keeps expecting that CPU to be released
>> soon and reserves it for the parallel job, even though it will not actually become
>> free for a long time.

>> My suspicions may be wrong. It would be great if someone who has the same problem
>> could observe this in their SGE installation. If my suspicions are correct, I think
>> this is an odd implementation of reservation, since a reservation should not be
>> based only on the specified runtime.
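
If my reading is right, the relevant knob is the default_duration parameter of the
scheduler configuration. A minimal sketch of how I would inspect and change it
(assuming the standard qconf interface; the 36:0:0 value is only an example, not
what we actually run):

    # show the current scheduler configuration, including default_duration
    qconf -ssconf | grep default_duration

    # edit the scheduler configuration and set, for example:
    #   default_duration   36:0:0
    qconf -msconf

But a larger default_duration only changes the scheduler's guess; it still is not
enforced, which seems to be the core of the problem.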



>> Cheers. 



>> Best regards,


>> Sili(wesley) Huang



>> Monday, June 26, 2006, 5:41:25 AM, you wrote:



>> Andreas> Have you observed reservation behaviour via the 'schedule' file?



>> Andreas> Andreas
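
(For anyone who wants to reproduce this observation: the schedule file is only
written when scheduler monitoring is switched on. If I remember correctly, that is
done by adding MONITOR=1 to the params line of the scheduler configuration,
roughly:

    # qconf -msconf, then in the editor set:
    params    MONITOR=1

after which the scheduler records its decisions in
$SGE_ROOT/default/common/schedule.)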



>> Andreas> On Fri, 23 Jun 2006, Brady Catherman wrote:



>> >> Yes. If there is space they start fine. If they have reservation enabled, and
>> >> they have a much higher priority than every other single-process job, they
>> >> just sit at the top of the queue as if the reservation is not doing anything
>> >> (max_reservations is currently set at 1000).




>> >> On Jun 23, 2006, at 2:07 PM, Reuti wrote:



>> >>> On 23.06.2006, at 22:45, Brady Catherman wrote:



>> >>>> I have done both of these and yet my clusters still hate parallel jobs.
>> >>>> Does anybody have this working? Everything I have seen suggests that parallel
>> >>>> jobs are always shunned by Grid Engine. I would appreciate any solutions
>> >>>> being passed my way! =) I have been working on this on and off since January.



>> >>> But if the cluster is empty, they are starting? - Reuti





>> >>>> On Jun 23, 2006, at 11:46 AM, Reuti wrote:



>> >>>>> Hi,



>> >>>>> you submitted with "-R y" and adjusted the scheduler to
>> >>>>> "max_reservation 20" or an appropriate value?



>> >>>>> -- Reuti
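
For completeness, the combination being discussed is, as far as I understand it, a
reservation request on the parallel job plus a non-zero reservation limit in the
scheduler, along the lines of

    # request a reservation for the parallel job
    # (a 12-slot mpich job, matching the one in the schedule output above)
    qsub -R y -pe mpich 12 parallel_job.sh

    # and in the scheduler configuration (qconf -msconf):
    max_reservation    20

where parallel_job.sh is just a placeholder. This is the setup we already use here:
the parallel jobs do request reservations, yet they keep waiting.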




>> >>>>> On 23.06.2006, at 18:31, Sili (wesley) Huang wrote:



>> >>>>>> Hi Jean-Paul,





>> >>>>>> I have a similar problem to yours in our cluster: the low-priority
>> >>>>>> serial jobs still get dispatched into the running state while the
>> >>>>>> high-priority parallel jobs keep waiting. Did you figure out a solution to
>> >>>>>> this problem? Did the upgrade help?





>> >>>>>> Cheers.





>> >>>>>> Best regards,



>> >>>>>> Sili(wesley) Huang










--
mailto:shuang at unb.ca
Scientific Computing Support
Advanced Computational Research Laboratory
University of New Brunswick
Tel(office):  (506) 452-6348



