[GE users] Qmaster stops starting jobs, nodes go "au"

Ron Chen ron_chen_123 at yahoo.com
Tue May 18 01:39:57 BST 2004


Yes, that bug was discovered by Sean. Basically, the
qmaster updated the reschedule_unknown_list very
frequently, and that caused it to be blocked by I/O.

The fix will be included in SGE 5.3p7.

For now, if other people encounter that problem, you
can either disable rescheduling unknown jobs
(reschedule_unknown_list NONE), or get the latest SGE
5.3 branch from cvs, or even try to use SGE 6.0beta.
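
Before changing anything, it may help to see what the cluster currently
reports. The following is only a sketch: it assumes qconf is on the PATH
and that the relevant entries start with "reschedule_unknown" in the
output of "qconf -sconf" (global configuration) and "qconf -se <host>"
(exec host). The helper name filter_reschedule is made up for
illustration, and the exact attribute names may differ between SGE
versions.

```shell
#!/bin/sh
# Hypothetical helper: keep only the reschedule_unknown* lines from a
# qconf dump read on stdin.
filter_reschedule() {
    grep '^reschedule_unknown'
}

# Only query the live cluster when qconf is actually installed.
if command -v qconf >/dev/null 2>&1; then
    echo "global configuration:"
    qconf -sconf | filter_reschedule || echo "  (no reschedule_unknown entry)"

    # qconf -sel lists the execution hosts known to the qmaster.
    for host in $(qconf -sel); do
        echo "exec host $host:"
        qconf -se "$host" | filter_reschedule || echo "  (no reschedule_unknown entry)"
    done
fi
```

This only reads the configuration; it does not modify any setting.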

 -Ron

--- Rayson Ho <raysonho at eseenet.com> wrote:
> There was a bug in SGE 5.3 that caused the qmaster to rewrite the spool
> files due to the reschedule_unknown_list (issue 942). The fix is done and
> will be included in the next patch release of SGE 5.3.
> 
> Rayson

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



