[GE users] qstat communication problems

reuti reuti at staff.uni-marburg.de
Fri Jan 8 14:56:54 GMT 2010


On 08.01.2010 at 15:20, russray wrote:

> I think I spoke too soon.  The reboot did work right after the  
> machine booted, but this morning I'm on the machine again and  
> getting the same errors.
>
> error: commlib error: access denied (client IP resolved to host  
> name "". This is not identical to clients host name "")
> error: unable to contact qmaster using port 6444 on host  
> "us03grid1.semnet.dom"
>
> I tried telnet us03grid1 6444 and I get the following, which makes
> me think it really is listening. netstat on the qmaster also reports
> that it is listening:
>
> telnet us03grid1 6444
> Trying 192.168.201.11...

Does the qmaster also have an external interface, and is that the primary one?

http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=237342

-- Reuti


> Connected to us03grid1.semnet.dom (192.168.201.11).
> Escape character is '^]'.
> Connection closed by foreign host.
>
> On another cluster that isn't having these problems, the qmaster is
> also listed as an execution host, but not included in any of the
> queues. Does the qmaster need to also be an execution host?
>
> I'm really baffled by this behavior.
>
>
>
>
> russray <rray at semtech.com> wrote on 01/07/2010 03:48:42 PM:
>
> >
> > Well, a reboot of my qmaster server seems to have fixed the problem.
> > Still not sure what happened, but life is good again.
> >
> >
> >
> > russray <rray at semtech.com> wrote on 01/07/2010 03:06 PM
> > To: users at gridengine.sunsource.net
> > Subject: Re: [GE users] qstat communication problems
> >
> > Finally getting back to this after the holidays.  No iptables on
> > either the qmaster, execution, or submission machines.
> >
> > reuti <reuti at staff.uni-marburg.de> wrote on 12/18/2009 06:48:47 PM:
> >
> > > On 18.12.2009 at 23:04, russray wrote:
> > >
> > > >
> > > > I've had a small farm running for several months now, but after
> > > > what I think was a series of yum updates, my submission nodes can
> > > > no longer talk to the qmaster.  When I type qstat, I get the
> > > > following:
> > > >
> > > > error: commlib error: access denied (client IP resolved to host
> > > > name "". This is not identical to clients host name "")
> > > > error: unable to contact qmaster using port 6444 on host
> > > > "us03grid1.semnet.dom"
> > >
> > > Any firewall suddenly in place and/or changing the allowed ports?
> > >
> > > -- Reuti
> > >
> > >
> > > > I can ping and ssh to us03grid1, so that communication seems OK.  If
> > > > I use gethostbyname, gethostbyaddr, gethostname, I get the
> > > > following from the machines in question (using one of them as an
> > > > example):
> > > >
> > > > $SGE_ROOT/utilbin/lx24-x86/gethostbyname us03linux1
> > > > Hostname: us03linux1.semnet.dom
> > > > Aliases:  us03linux1
> > > > Host Address(es): 192.168.201.61
> > > >
> > > > $SGE_ROOT/utilbin/lx24-x86/gethostbyaddr 192.168.201.61
> > > > Hostname: us03linux1.semnet.dom
> > > > Aliases:  us03linux1
> > > > Host Address(es): 192.168.201.61
> > > >
> > > > $SGE_ROOT/utilbin/lx24-x86/gethostname
> > > > Hostname: us03linux1.semnet.dom
> > > > Aliases:  us03linux1
> > > > Host Address(es): 192.168.201.61
> > > >
> > > > $SGE_ROOT/utilbin/lx24-x86/gethostbyname us03grid1
> > > > Hostname: us03grid1.semnet.dom
> > > > Aliases:
> > > > Host Address(es): 192.168.201.11
> > > >
> > > > Any clues on what changed to cause this and how to fix it?
> > > >
> > > >
> > > > Russell Ray
> > > > rray at semtech.com
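
The lookups quoted above were run by hand, and the thread shows no reverse
lookup for the qmaster address. The commlib error complains that the name
resolved from the client IP does not match the client's own host name, so it
may help to test the forward and reverse round trip for each host in one go.
What follows is only a minimal sketch, not part of Grid Engine: it assumes
getent(1) is available (glibc systems), and the function names names_match
and check_host are made up for illustration. Empty names are treated as a
mismatch, since the error above quotes empty host names.

```shell
#!/bin/sh
# Sketch: check that forward and reverse lookups of a host agree.
# Assumption: getent(1) exists; names_match/check_host are hypothetical helpers.

# Compare two host names case-insensitively.
# Succeeds only if both are non-empty and equal.
names_match() {
    a=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    b=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Resolve name -> address -> name and report whether the round trip agrees.
check_host() {
    fwd=$(getent hosts "$1") || { echo "no forward entry for $1"; return 2; }
    addr=$(printf '%s\n' "$fwd" | awk 'NR==1 {print $1}')
    fwd_name=$(printf '%s\n' "$fwd" | awk 'NR==1 {print $2}')
    rev_name=$(getent hosts "$addr" | awk 'NR==1 {print $2}')
    if names_match "$fwd_name" "$rev_name"; then
        echo "OK: $1 -> $addr -> $rev_name"
    else
        echo "MISMATCH: $1 -> $addr ($fwd_name) but $addr -> '$rev_name'"
        return 1
    fi
}

# Check every host name given on the command line (none given: do nothing).
for h in "$@"; do
    check_host "$h"
done
```

Saved under a hypothetical name such as check_resolve.sh, it could be run on
both the submission host and the qmaster, for example
"sh check_resolve.sh us03grid1 us03linux1", and the two outputs compared.
Note that getent does not read Grid Engine's own host_aliases file, so its
answers may differ from those of the utilbin tools shown above.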

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=237384
