[GE users] Help with error messages (better formatted)

McCalla, Mac macmccalla at hess.com
Fri May 20 01:31:20 BST 2005


Hi Viktor,

Are the sge_execd's running on your compute nodes?  Are there any
messages in their messages files?  What happens when you stop and start
one of the sge_execd's?  You could try a qping command from one of your
compute nodes back to the qmaster to see if the port assignments are
correct in your environment.  It looks like the scheduler did not start
at all this time when you restarted the qmaster.  Are there any error
messages in its messages file?
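
If qping is not at hand on a node, a plain TCP connect gives a rough
first answer to whether the qmaster port is reachable at all.  This is
only a sketch and not a replacement for qping: it does not speak the
Grid Engine protocol, and the function name is made up for
illustration.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    This only proves that something is listening on the port; it does
    not prove that the listener is sge_qmaster.
    """
    try:
        # create_connection resolves the name and connects with a timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False
```

Run on a compute node, a call would look something like
port_is_reachable("rupc-cs04b", 6444), where the host name is the one
from the log and 6444 is only a placeholder for whatever port
sge_qmaster is configured to use at your site.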

mac mccalla
 

-----Original Message-----
From: Viktor Oudovenko [mailto:udo at physics.rutgers.edu] 
Sent: 19 May 2005 17:59
To: users at gridengine.sunsource.net
Subject: [GE users] Help with error messages (better formatted)



Hi, I just retyped my previous e-mail with better formatting:

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

05/19/2005 18:40:09|qmaster|rupc-cs04b|I|read job database with 24 entries in 0 seconds

05/19/2005 18:40:09|qmaster|rupc-cs04b|W|received unkown event: 5

05/19/2005 18:40:09|qmaster|rupc-cs04b|W|received unkown event: 5

........................................................................

05/19/2005 18:40:09|qmaster|rupc-cs04b|W|received unkown event: 5

05/19/2005 18:40:09|qmaster|rupc-cs04b|W|received unkown event: 5

05/19/2005 18:40:09|qmaster|rupc-cs04b|W|received unkown event: 5

05/19/2005 18:40:09|qmaster|rupc-cs04b|I|qmaster will use max. 1004 file descriptors for communication

05/19/2005 18:40:09|qmaster|rupc-cs04b|I|qmaster will accept max. 99 dynamic event clients

05/19/2005 18:40:09|qmaster|rupc-cs04b|E|no execd known on host rupc01.rutgers.edu to send conf notification

05/19/2005 18:40:09|qmaster|rupc-cs04b|E|no execd known on host rupc02.rutgers.edu to send conf notification

05/19/2005 18:40:09|qmaster|rupc-cs04b|E|no execd known on host sub04n101 to send conf notification
...............................................

05/19/2005 18:40:09|qmaster|rupc-cs04b|E|no execd known on host sub04n91 to send conf notification

05/19/2005 18:40:09|qmaster|rupc-cs04b|E|no execd known on host rupc04.rutgers.edu to send conf notification

05/19/2005 18:40:09|qmaster|rupc-cs04b|I|starting up 6.0u3

05/19/2005 18:40:10|qmaster|rupc-cs04b|E|no event client known with id 1 to modify

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
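
With this many entries it may be easier to tally the file than to read
it.  The sketch below assumes that each entry in the messages file is a
single line of the form date time|component|host|level|message, as in
the excerpt above, and that the level is a single capital letter (I, W
and E are the ones seen here).  The pattern and function names are made
up for illustration.

```python
import re
from collections import Counter

# Field layout inferred from the excerpt, not from any specification.
ENTRY = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\|([^|]+)\|([^|]+)\|([A-Z])\|(.*)$"
)
NO_EXECD = re.compile(r"no execd known on host (\S+)")


def parse_entry(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    timestamp, component, host, level, message = match.groups()
    return {
        "timestamp": timestamp,
        "component": component,
        "host": host,
        "level": level,
        "message": message,
    }


def summarize(lines):
    """Count entries per level and collect hosts named in 'no execd known' errors."""
    levels = Counter()
    hosts_without_execd = []
    for line in lines:
        entry = parse_entry(line)
        if entry is None:
            continue  # skip blank lines and anything that is not a log entry
        levels[entry["level"]] += 1
        found = NO_EXECD.search(entry["message"])
        if found:
            hosts_without_execd.append(found.group(1))
    return levels, hosts_without_execd
```

On the qmaster host this could be called with the file object returned
by open("/opt/SGE/default/spool/qmaster/messages"); the second return
value is then the list of hosts worth checking for a running sge_execd.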

Thank you for your help,
v

> -----Original Message-----
> From: Viktor Oudovenko [mailto:udo at physics.rutgers.edu] 
> Sent: Thursday, May 19, 2005 18:52
> To: users at gridengine.sunsource.net
> Subject: [GE users] Help with error messages
> 
> 
> Hello to everybody,
> 
> Does anybody know what these errors mean and how to get rid of them?
> file: /opt/SGE/default/spool/qmaster/messages
> 
> I restarted sgemaster:
> 
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|I|read job database with 24 entries in 0 seconds
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|W|received unkown event: 5
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|W|received unkown event: 5
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|W|received unkown event: 5
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|W|received unkown event: 5
> 
> ..............................................................
> MANY MESSAGES LIKE THESE (probably as many as the number of hosts)
> ..............................................................
> 
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|W|received unkown event: 5
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|W|received unkown event: 5
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|W|removing reference to no longer existing job 19881 of user "udo"
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|W|received unkown event: 5
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|I|qmaster will use max. 1004 file descriptors for communication
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|I|qmaster will accept max. 99 dynamic event clients
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|E|no execd known on host sub04n101 to send conf notification
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|E|no execd known on host sub04n102 to send conf notification
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|E|no execd known on host sub04n103 to send conf notification
> .....................................................
> 
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|E|no execd known on host sub04n90 to send conf notification
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|E|no execd known on host sub04n91 to send conf notification
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|E|no execd known on host rupc04.rutgers.edu to send conf notification
> 05/19/2005 18:40:09|qmaster|rupc-cs04b|I|starting up 6.0u3
> 05/19/2005 18:40:10|qmaster|rupc-cs04b|E|no event client known with id 1 to modify
> 
> Thank you very much for your help, comments etc.
> Regards,
> Viktor
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net

