[GE users] sge_qmaster service quits very often

Sandeep, Patel(IE10) Sandeep.Patel2 at Honeywell.com
Wed Apr 16 12:43:10 BST 2008



   Is your qmaster running or not?







From: manju a [mailto:manju.kudu at gmail.com] 
Sent: Tuesday, April 15, 2008 7:57 PM
To: users at gridengine.sunsource.net
Subject: [GE users] sge_qmaster service quits very often


Hi all,

Is anybody aware of the following error?


>> error: commlib error: can't connect to service (Connection refused)
>> error: unable to contact qmaster using port 534 on host "masterserver.abc.com"

I checked the service and found that the qmaster service was not running. I
tried restarting it, but it gave an error along the lines of "unable to
unpack gid, unable to read the q master configuration "...
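
A "Connection refused" error usually means the host was reachable but nothing was accepting connections on that port, which fits the qmaster not running. A minimal sketch for checking this from a client machine is shown here; the helper name can_connect and the 3-second timeout are assumptions for illustration, and the host and port in the usage comment are simply the ones from the error message above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


# Example usage (hypothetical check against the host from the error message):
#   can_connect("masterserver.abc.com", 534)
```

This only shows whether something is listening on the port; it does not say why the qmaster stopped.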

In $SGE_ROOT/spool/message I can see the following messages:

04/15/2008 04:41:16|qmaster|grimmcs1vl|I|starting up SGE 6.1u2
04/15/2008 15:57:57|qmaster|grimmcs1vl|E|acknowledge timeout after 600 seconds for event client (schedd:1) on host "masterserver.abc.com"
04/15/2008 16:01:49|qmaster|grimmcs1vl|E|commlib error: got read error (closing "masterserver.abc.com/qhost/2")
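
The entries above appear to be pipe-separated: timestamp, component, host, severity letter, message. A rough sketch for pulling only the "E" (error) entries out of such a file follows. The five-field layout is inferred from these sample lines only, and the names LogEntry, parse_line and errors are made up for illustration. Lines without five fields, such as wrapped continuation text, are skipped.

```python
from typing import Iterable, List, NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: str
    component: str
    host: str
    level: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Split one log line into five fields; return None if it does not fit."""
    # maxsplit=4 keeps any '|' characters inside the message text intact.
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5:
        return None
    return LogEntry(*parts)


def errors(lines: Iterable[str]) -> List[LogEntry]:
    """Return the parsed entries whose severity field is 'E'."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry is not None and entry.level == "E"]
```

Passing an open messages file to errors() would list the error entries in order, which makes it easier to see what was logged just before the qmaster stopped.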

It all happened suddenly. Is anybody aware of this problem?

manjunath A
