[GE users] commlib error problem

reuti reuti at staff.uni-marburg.de
Thu Oct 28 12:32:24 BST 2010



Hi,

On 27.10.2010 at 16:41, gqc606 wrote:

> Hi:
>  I'm running SGE6.2up4 on a Rocks 5.3 cluster. I'm having some intermittent issues where I get the following errors:
> 
> [gqc606 at cluster test]$ qstat -f
> error: commlib error: got select error (Connection refused)
> error: unable to send message to qmaster using port 536 on host "cluster.local": got send error

This looks like the `qmaster` has disappeared, i.e. the daemon is no longer running. You can start it by hand by executing:

$ /etc/init.d/sgemaster.p6444 start

(adjust the path to your installation). The more important question is why this is happening. You may be affected by a crash that occurs on some systems:

http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=275696

If so, you will have to recompile SGE yourself to fix it permanently.
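
Until then, a small check can restart the daemon only when it is actually gone, instead of rebooting the front-end node. This is a minimal sketch, not a tested recipe: it assumes the daemon's process name is `sge_qmaster` and that the init script is the one named above. The function names `qmaster_running` and `ensure_qmaster` and the variable `SGE_INIT_SCRIPT` are made up for illustration.

```shell
#!/bin/sh
# Assumption: path of the qmaster init script; adjust to your installation.
SGE_INIT_SCRIPT="${SGE_INIT_SCRIPT:-/etc/init.d/sgemaster.p6444}"

# Succeeds if a process with exactly the given name exists.
# The default name sge_qmaster is an assumption about your setup.
qmaster_running() {
    pgrep -x "${1:-sge_qmaster}" >/dev/null 2>&1
}

# Start the qmaster via the init script, but only if it is not running.
ensure_qmaster() {
    if qmaster_running "$1"; then
        echo "running"
    elif [ -x "$SGE_INIT_SCRIPT" ]; then
        echo "starting"
        "$SGE_INIT_SCRIPT" start
    else
        # Init script not found or not executable: nothing we can do here.
        echo "missing-init-script"
        return 1
    fi
}
```

Running `ensure_qmaster` as root after sourcing the file prints `running` if the daemon is up and otherwise tries to start it. This only works around the symptom; it does not explain why the `qmaster` went away.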

-- Reuti


>  When this happens, I can't use commands such as "qsub" or "qconf". It seems that my SGE doesn't work.
>  Every time, I have to restart the front-end node to resolve the problem. Is there a command that restarts SGE, or some other way to solve this? Thanks!
> 
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=290465
> 
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=290803



