[GE users] SGE and TruCluster

Craig Tierney ctierney at hypermall.net
Fri May 5 19:55:36 BST 2006



James Chamberlain wrote:
> No joy.  sge_execd is still binding to *:536, which prevents it from 
> being started on any of the other nodes in the TruCluster.  The problem 
> isn't that we want the qmaster to talk to the execd through a different 
> interface or on a different IP address, but that we need the execd to 
> not bind to *:536 at all.  If one node binds to *:536 in a TruCluster 
> environment, that port is bound to that host across the entire cluster.  
> Other nodes attempting to bind to *:536 will fail with EADDRINUSE.  The 
> only way around that I know of is to bind to the specific IP address of 
> that host.
>

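For readers following along, the wildcard conflict described above can be
reproduced on an ordinary single host. The sketch below is only an
illustration of general socket behaviour, not of how sge_execd is written
and not of TruCluster's cluster-wide port semantics. The helper name
bind_tcp is made up for this example, and port 0 is used so the OS picks a
free port instead of 536.

```python
import errno
import socket


def bind_tcp(address, port):
    """Try to bind a TCP socket; return (socket, None) or (None, errno name)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((address, port))
    except OSError as exc:
        sock.close()
        return None, errno.errorcode.get(exc.errno, str(exc.errno))
    return sock, None


# "" is the wildcard address (netstat shows it as *); port 0 asks the OS
# for any free port, so the example does not need root or port 536.
first, _ = bind_tcp("", 0)
port = first.getsockname()[1]

# A second wildcard bind on the same port is refused while the first is held.
second, err = bind_tcp("", port)
print("second wildcard bind:", err)

# Binding to one specific address claims the port on that address only.
specific, _ = bind_tcp("127.0.0.1", 0)
print("specific bind:", specific.getsockname())
```

On a typical Unix host the second bind fails with EADDRINUSE, which is the
same error the other TruCluster members report. Whether SGE 5.3p7 can be
told to bind to a specific address is a separate question that this sketch
does not answer.
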
Not sure if I missed something, but don't you just change the entry
in /etc/services and pick a different port?  Make sure it is consistent
on all nodes.
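
A rough way to check that consistency, assuming you can collect each
node's /etc/services text somehow: the sketch below parses
services-style lines and compares the port per node. The entry name
sge_execd, the node names, and port 10536 are placeholders invented for
the example; use whatever entry your installation actually looks up.

```python
def service_port(services_text, name, proto="tcp"):
    """Return the port for name/proto from /etc/services-style text, or None."""
    for line in services_text.splitlines():
        # Drop trailing comments, then split into: name port/proto [aliases...]
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        port, _, line_proto = fields[1].partition("/")
        if line_proto == proto and name in [fields[0]] + fields[2:]:
            return int(port)
    return None


# Hypothetical per-node copies of /etc/services (placeholder names and ports).
nodes = {
    "node1": "sge_execd   10536/tcp   # moved off 536\n",
    "node2": "sge_execd   10536/tcp\n",
    "node3": "sge_execd   536/tcp\n",
}

ports = {host: service_port(text, "sge_execd") for host, text in nodes.items()}
reference = ports["node1"]
mismatched = sorted(host for host, port in ports.items() if port != reference)
print(ports)
print("nodes that disagree with node1:", mismatched)
```

Any node listed as mismatched would need its entry corrected before the
daemons are restarted.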

Craig



> Also, to note, the default rcsge script does not appear to work on Tru64 
> 5.1:
> 
> # /sbin/init.d/rcsge start
>    starting sge_execd
> SGE 5.3p7
> usage: sge_execd [options]
>    [-help]                                  print this help
>    [-lj log_file]                           write job logging to log file
>    [-nostart-commd]                         do not start commd
> 
> Thanks,
> 
> James
> 
> On Fri, 5 May 2006, Rayson Ho wrote:
> 
>> Like this:
>> http://gridengine.sunsource.net/servlets/ReadMsg?list=users&msgNo=9553
>>
>> See this thread for detail:
>> http://gridengine.sunsource.net/servlets/BrowseList?list=users&by=thread&from=2476 
>>
>>
>> Rayson
>>
>>
>>
>>
>> On 5/5/06, James Chamberlain <jamesc at exa.com> wrote:
>>> Hi all,
>>>
>>> Does anyone know of a way to make sge_execd bind to a specific interface?
>>> By binding to *:536, sge_execd is picking up the cluster IP address in a
>>> TruCluster (Tru64 UNIX) environment.  As a result, other nodes in the
>>> TruCluster say they can't start sge_execd because port 536 is already in
>>> use.  For reference, I'm using SGE 5.3p7.
>>>
>>> Thanks,
>>>
>>> James
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>
>>>
>>
> 




