[GE users] master configuration - timeout when exec host freezes

crei crei at sun.com
Wed Nov 11 12:55:10 GMT 2009


Hi Matthias,

- Can you please tell us what values are configured for "load_report_time" and "max_unheard"?

- Can you please also describe how you trigger the "freeze" of your execd host?
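
For comparing the two values, here is a minimal sketch, assuming the settings are available as text in the "name value" layout with HH:MM:SS times (on a live cluster this text would typically come from `qconf -sconf`). The helper names `parse_hms` and `read_timeouts` are hypothetical, and the sample values are illustrative only, not taken from your cluster.

```python
def parse_hms(value):
    """Convert an HH:MM:SS time string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def read_timeouts(conf_text):
    """Extract load_report_time and max_unheard (in seconds) from config text."""
    wanted = ("load_report_time", "max_unheard")
    found = {}
    for line in conf_text.splitlines():
        fields = line.split()
        # Each relevant line is expected to look like "<name> <HH:MM:SS>".
        if len(fields) >= 2 and fields[0] in wanted:
            found[fields[0]] = parse_hms(fields[1])
    return found


# Illustrative sample only (assumption), not the poster's real configuration.
SAMPLE_CONF = """\
load_report_time             00:00:40
max_unheard                  00:05:00
reschedule_unknown           00:10:00
"""

timeouts = read_timeouts(SAMPLE_CONF)
# How many load report intervals fit into the max_unheard window.
missed_reports = timeouts["max_unheard"] // timeouts["load_report_time"]
print(timeouts)
print("load report intervals within max_unheard:", missed_reports)
```

If the computed ratio is below 1, max_unheard is shorter than the load report interval, which is the first case described in the quoted message below.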

Thanks,

Christian


On 11/11/09 11:02, madpower wrote:
> hi,
> 
>> please check the entries "reschedule_unknown" and "max_unheard" in  
>> `man sge_conf`.
> thanks for the pointer. The reschedule_unknown parameter works as expected, but max_unheard seems to be disregarded.
> In fact, I observed the following behavior:
> *) If max_unheard is set to a smaller value than load_report_time, then after about 20 minutes with this setting the master recognizes that it has no information on the state of some execution hosts; that state is updated as soon as the next load report is sent.
> *) If max_unheard is set to a larger value than load_report_time, it takes approx. 20-30 minutes until the master recognizes that an execution host is unavailable.
> 
> Does anyone have an idea what's going wrong here? Or has anyone already experienced similar behavior?
> 
> br,
> Matthias
> 
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=226133
> 
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

-- 
Sun Microsystems GmbH             Christian Reissmann
Dr.-Leo-Ritter-Str. 7             Software Engineer
D-93049 Regensburg                Phone: +49 (0)941 3075 112
Germany                           Fax:   +49 (0)941 3075 222
http://www.sun.de                 mailto: Christian.Reissmann at sun.com
                                   http://www.sun.com/gridengine
Sitz der Gesellschaft:
Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Thomas Schroeder, Wolfgang Engels, Wolf Frenkel
Vorsitzender des Aufsichtsrates: Martin Haering

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=226161
