[GE users] execd error on AMD64 Server

kosugi.toru at jp.fujitsu.com kosugi.toru at jp.fujitsu.com
Thu Oct 26 11:06:02 BST 2006


Hi,

Thanks for your response.

I ran "netstat -an".
As a result, port 537 is in the "LISTEN" state.
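
As a cross-check on the netstat result, a bind on the port can also be attempted directly. The following is only a rough sketch, assuming Python 3 is available on the exec host; probe_bind is a hypothetical helper name and is not part of SGE. Because 537 is a privileged port, the script would have to be run as root to get a meaningful answer for it.

```python
import errno
import socket


def probe_bind(port, host="0.0.0.0"):
    """Try to bind a TCP socket to (host, port) and report the outcome.

    Returns "free", "in use", or "permission denied".
    Any other socket error is re-raised unchanged.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # No SO_REUSEADDR here, so a port that is already bound
        # by another socket fails with EADDRINUSE.
        sock.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return "in use"
        if exc.errno in (errno.EACCES, errno.EPERM):
            return "permission denied"
        raise
    finally:
        sock.close()
    return "free"
```

Calling probe_bind(537) as root should return "in use" while something is listening on the port, which would be consistent with the "can't bind socket" message.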

Incidentally, I noticed the following.
When this problem occurred, rslyk15 (the SGE exec host) started sending and
 receiving many packets to and from the NIS+ server.

Is this related to the problem?

 -------- snoop command output --------------
     rslyk15 -> NIS+server   PORTMAP C GETPORT prog=100300 (NIS+) vers=3 proto=UDP
  NIS+server -> rslyk15      PORTMAP R GETPORT port=734

 -------- netstat -an command output -------------
NIS+Server_IP address.735     rslyk15_IP address.58476   5840      0 10136      0 TIME_WAIT
NIS+Server_IP address.735     rslyk15_IP address.58480   5840      0 10136      0 TIME_WAIT
NIS+Server_IP address.735     rslyk15_IP address.58492   5840      0 10136      0 TIME_WAIT
NIS+Server_IP address.735     rslyk15_IP address.58494   5840      0 10136      0 TIME_WAIT

  <rslyk15 is a NIS+ client.>
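
To see how many such connections pile up, saved "netstat -an" output can be tallied by state. This is a minimal sketch under assumptions: count_states is a hypothetical helper, and it relies on the heuristic that the TCP state is the last, all-uppercase field on each line, as in the output above.

```python
import re
from collections import Counter

# Heuristic: a TCP state is an all-caps token such as LISTEN or TIME_WAIT.
STATE_RE = re.compile(r"^[A-Z][A-Z0-9_]+$")


def count_states(netstat_text, needle=""):
    """Count connections per TCP state in saved netstat output.

    Only lines containing `needle` (for example ".735") are considered;
    an empty needle matches every line.
    """
    counts = Counter()
    for line in netstat_text.splitlines():
        if needle not in line:
            continue
        fields = line.split()
        # Header lines and blank lines do not end in an all-caps token.
        if fields and STATE_RE.match(fields[-1]):
            counts[fields[-1]] += 1
    return counts
```

For example, after saving the netstat output to a file, count_states(text, ".735") gives the number of connections per state that involve port 735.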

Thanks,
T.Kosugi


> You can use lsof or "netstat -an" to find out if port 537 is used or is free...
> 
> Rayson
> 
> 
> 
> On 10/26/06, kosugi.toru at jp.fujitsu.com <kosugi.toru at jp.fujitsu.com> wrote:
> > Hi All,
> >
> > I am using SGE 6.0U6.
> >
> > I added an AMD64 server (OS: Red Hat Enterprise Linux 3.0 WS) to my SGE hostgroups.
> >
> > However, this server shows the following errors.
> >
> > Please advise me on how I can fix this error.
> >
> > ---- AMD64 Server(rslyk15):/tmp/execd_messages.6105 File --------------------
> > 10/23/2006 15:55:10|execd|rslyk15|E|communication error for "rslyk15/execd/1" running on port 537: "can't bind socket"
> > 10/23/2006 15:55:11|execd|rslyk15|E|communication error for "rslyk15/execd/1" running on port 537: "can't bind socket"
> > 10/23/2006 15:55:12|execd|rslyk15|E|communication error for "rslyk15/execd/1" running on port 537: "can't bind socket"
> > 10/23/2006 15:55:13|execd|rslyk15|E|communication error for "rslyk15/execd/1" running on port 537: "can't bind socket"
> > 10/23/2006 15:55:14|execd|rslyk15|E|communication error for "rslyk15/execd/1" running on port 537: "can't bind socket"
> > 10/23/2006 15:55:15|execd|rslyk15|E|communication error for "rslyk15/execd/1" running on port 537: "can't bind socket"
> > 10/23/2006 15:55:16|execd|rslyk15|E|communication error for "rslyk15/execd/1" running on port 537: "can't bind socket"
> > 10/23/2006 15:55:17|execd|rslyk15|E|communication error for "rslyk15/execd/1" running on port 537: "can't bind socket"
> > 10/23/2006 15:55:18|execd|rslyk15|E|communication error for "rslyk15/execd/1" running on port 537: "can't bind socket"
> > 10/23/2006 15:55:19|execd|rslyk15|E|communication error for "rslyk15/execd/1" running on port 537: "can't bind socket"
> > 10/23/2006 15:55:20|execd|rslyk15|E|communication error for "rslyk15/execd/1" running on port 537: "can't bind socket"
> > 10/23/2006 15:55:21|execd|rslyk15|E|communication error for "rslyk15/execd/1" running on port 537: "can't bind socket"
> >
> > ---- $SGE_HOME/our_group/spool/rslyk15/messages File -----------------
> > 10/26/2006 10:04:31|execd|rslyk15|E|commlib error: got read error (closing "mshost/qmaster/1")
> > 10/26/2006 10:04:31|execd|rslyk15|E|commlib error: got pipe error (closing "mshost/qmaster/1")
> > 10/26/2006 10:14:31|execd|rslyk15|E|commlib error: got read error (closing "mshost/qmaster/1")
> > 10/26/2006 10:14:31|execd|rslyk15|E|commlib error: got pipe error (closing "mshost/qmaster/1")
> > 10/26/2006 10:24:31|execd|rslyk15|E|commlib error: got read error (closing "mshost/qmaster/1")
> > 10/26/2006 10:24:31|execd|rslyk15|E|commlib error: got pipe error (closing "mshost/qmaster/1")
> > 10/26/2006 10:34:31|execd|rslyk15|E|commlib error: got read error (closing "mshost/qmaster/1")
> > 10/26/2006 10:34:31|execd|rslyk15|E|commlib error: got pipe error (closing "mshost/qmaster/1")
> > 10/26/2006 10:35:11|execd|rslyk15|E|commlib error: endpoint is not unique error (endpoint "mshost/qmaster/1" is already connected)
> >
> > ---- $SGE_HOME/our_group/spool/qmaster/messages File ---------------
> > 10/26/2006 10:24:31|qmaster|mshost|E|commlib error: can't read general message size header (GMSH) (closing "rslyk15/execd/1")
> > 10/26/2006 10:34:32|qmaster|mshost|E|commlib error: can't read general message size header (GMSH) (closing "rslyk15/execd/1")
> > 10/26/2006 10:35:11|qmaster|mshost|E|commlib error: endpoint is not unique error (endpoint "mshost/qmaster/1" is already connected)
> >
> > ----------------
> > Thanks
> > T.Kosugi
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > For additional commands, e-mail: users-help at gridengine.sunsource.net
> >
> >
> 

