[GE dev] Review for 3 man CRs

mpospisil michael.pospisil at sun.com
Tue Jul 7 20:29:22 BST 2009


Hello Honza,
please see notes below.

Michael


Jan Forch wrote:

> Hi team,
> please could someone (preferably Michael ;-) ) do review of few man 
> CRs for me:
>
> 1)
> CR 6806918 Man page entry for SCHEDULER_TIMEOUT incorrectly states 
> that default is 600
> -man page sge_conf.5
>
> cvs diff sge_conf.5
> Index: sge_conf.5
> ===================================================================
> RCS file: /cvs/gridengine/doc/man/man5/sge_conf.5,v
> retrieving revision 1.87
> diff -r1.87 sge_conf.5
> 1186,1188c1186,1189
> < Setting this parameter allows the scheduler GDI event acknowledge 
> timeout to be manually configured to a
> < specific value. Currently the default value is set to 10 minutes. 
> The \fISCHEDULER_TIMEOUT\fP value is
> < specified in seconds.
> ---
>  > Setting this parameter allows the scheduler GDI event acknowledge 
> timeout to be manually configured to a
>  > specific value. Currently the default value is 10 minutes with 
> default scheduler configuration, capped
>  > between 600 and 1200 seconds. But the default value depends on 
> current scheduler configuration. The
>  > The \fISCHEDULER_TIMEOUT\fP value is specified in seconds.

I reworded a few things:

Setting this parameter allows the scheduler GDI event acknowledge
timeout to be manually configured to a specific value. Currently the
default value is 10 minutes with the default scheduler configuration
and is limited to between 600 and 1200 seconds. The default value
depends on the current scheduler configuration. The
\fISCHEDULER_TIMEOUT\fP value is specified in seconds.


But this CR is still not very clear to me. The description states
that the timeout value is capped between 600 and 1200, yet the
comments say that the timeout is not restricted to between 600
and 1200.

Which one is correct?

>
> 2)
> CR 6786258 Man page should mention reprioritize_interval is coupled to 
> scheduler_interval
> -man page sched_conf.5
>
> cvs diff sched_conf.5
> Index: sched_conf.5
> ===================================================================
> RCS file: /cvs/gridengine/doc/man/man5/sched_conf.5,v
> retrieving revision 1.38
> diff -r1.38 sched_conf.5
> 410a411,417
>  > The reprioritization tickets are calculated by the scheduler and 
> update events
>  > for running jobs are only sent after the scheduler calculated new 
> values. How often
>  > the schedule should calculate the tickets is defined by 
> reprioritize_interval.
>  > Because the scheduler is only triggered in a specific interval 
> (scheduler_interval)
>  > this means the reprioritize_interval has only a meaning if set 
> greater than scheduler_interval.
>  > For example, if the scheduler_interval is 2 minutes and 
> reprioritize_interval is set
>  > to 10 seconds, this means the jobs get re-prioritized every 2 minutes.
>
"The" is missing in a few places...

The reprioritization tickets are calculated by the scheduler, and
update events for running jobs are only sent after the scheduler has
calculated new values. How often the scheduler should calculate the
tickets is defined by the reprioritize_interval.
Because the scheduler is only triggered in a specific interval
(scheduler_interval), the reprioritize_interval is only meaningful if
it is set greater than the scheduler_interval.
For example, if the scheduler_interval is 2 minutes and the
reprioritize_interval is set to 10 seconds, the jobs get
re-prioritized every 2 minutes.
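
To make the arithmetic of that example concrete, here is a rough Python
sketch. It is only a simplified model, not Grid Engine code: it assumes
that re-prioritization can only happen on a scheduler run, and that it
happens on the first run at least reprioritize_interval after the
previous one. The function name is hypothetical.

```python
import math


def effective_reprioritize_period(reprioritize_interval: int,
                                  scheduler_interval: int) -> int:
    """Seconds between re-prioritizations under the simplified model.

    Assumption: tickets are only recalculated during a scheduler run,
    so the configured interval is rounded up to a whole number of
    scheduler runs. Both values are in seconds and must be positive.
    """
    if reprioritize_interval <= 0 or scheduler_interval <= 0:
        raise ValueError("both intervals must be positive")
    runs_needed = math.ceil(reprioritize_interval / scheduler_interval)
    return runs_needed * scheduler_interval


# The case from the man page text: 10 seconds vs. a 2 minute scheduler run.
print(effective_reprioritize_period(10, 120))
```

With these inputs the model gives 120 seconds, which matches the
"every 2 minutes" statement in the proposed text.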

> 3)
> CR 6291037 Relationship between suspend_threshold and 
> scheduler_interval needs to be documented
> -man page queue_conf.5
>
> cvs diff queue_conf.5
> Index: queue_conf.5
> ===================================================================
> RCS file: /cvs/gridengine/doc/man/man5/queue_conf.5,v
> retrieving revision 1.33
> diff -r1.33 queue_conf.5
> 151c151,156
> < jobs which are suspended.
> ---
>  > jobs which are suspended. There is an important relationship between
>  > \fsuspend_threshold\fP and \fscheduler_interval\fP. If you have for 
> example
>  > a suspend threshold on the np_load_avg, and the load exceeds the 
> threshold,
>  > this does not have immediate effect. Jobs continue running until 
> the next
>  > scheduling run, where scheduler detects the threshold has been 
> exceeded and
>  > sends an order to qmaster to suspend the job. Same for unsuspending 
> again.
>
just a few minor changes:


jobs which are suspended. There is an important relationship between the
\fIsuspend_threshold\fP and the \fIscheduler_interval\fP. If, for example,
you have a suspend threshold on the np_load_avg, and the load exceeds the
threshold, this does not have an immediate effect. Jobs continue running
until the next scheduling run, where the scheduler detects that the
threshold has been exceeded and sends an order to qmaster to suspend the
job. The same applies to unsuspending.

(Note that the font escapes need to be \fI...\fP, as in the
\fISCHEDULER_TIMEOUT\fP text above; \fs is not a valid font escape.)
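
As a rough illustration of the delay this describes, here is a small
Python sketch. It is a simplified model and not Grid Engine code: it
assumes scheduling runs happen at fixed multiples of the
scheduler_interval starting at time 0, and the function name is
hypothetical.

```python
import math


def suspend_delay(exceeded_at: int, scheduler_interval: int) -> int:
    """Seconds a job keeps running after the load threshold is exceeded.

    Assumption: scheduling runs occur at 0, scheduler_interval,
    2 * scheduler_interval, ... and the suspend order is sent on the
    first run at or after the moment the threshold was exceeded.
    """
    if exceeded_at < 0 or scheduler_interval <= 0:
        raise ValueError("exceeded_at must be >= 0, interval must be > 0")
    next_run = math.ceil(exceeded_at / scheduler_interval) * scheduler_interval
    return next_run - exceeded_at


# Threshold exceeded 130 seconds in, scheduler runs every 120 seconds.
print(suspend_delay(130, 120))
```

Under this model the job in the example keeps running for another
110 seconds, until the scheduling run at 240 seconds.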

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=39&dsMessageId=206067
