[GE users] access denied (client IP resolved to host name "". This is not identical to clients host name "")

Christian.Reissmann at Sun.COM Christian.Reissmann at Sun.COM
Mon Jun 19 11:03:39 BST 2006


Hi John,

Thanks for your update. I'm pleased to hear that the problem is gone now ;-)

Christian

John Saalwaechter wrote on 06/15/06 15:58:
> I know this is a bit of an old thread, but we
> finally found the culprit that was causing this
> problem in our environment, so I wanted to share.
> 
> Basically we were being impacted by the bug noted in
> http://gridengine.sunsource.net/issues/show_bug.cgi?id=1661
> (we're running 6.0u4).  But we couldn't figure out where
> SGE commands were being run using IP addresses instead of
> hostnames.  The answer turned out to be Nagios.  We were
> using qping in a Nagios check to verify that the qmaster
> was up.  Our Nagios command originally looked like this:
> 
>    /usr/lib/nagios/plugins/check_qmaster -H $HOSTADDRESS$
> 
> It was just arbitrary that we were checking using the
> IP address instead of the hostname.  Once we connected
> the SGE problems with this Nagios check, we changed the
> Nagios command to:
> 
>    /usr/lib/nagios/plugins/check_qmaster -H $HOSTNAME$
> 
> Now the problem is gone for us.
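The change above amounts to never handing an IP literal to the SGE client tools. A minimal Python sketch of that idea, for a wrapper around such a check, is shown here. The names `normalize_target` and `reverse_lookup` are hypothetical and not part of Nagios or Grid Engine, and the sketch assumes reverse DNS for the qmaster returns the name SGE expects.

```python
import ipaddress
import socket


def _reverse_dns(ip):
    """Default reverse lookup: return the primary host name for an IP."""
    return socket.gethostbyaddr(ip)[0]


def normalize_target(target, reverse_lookup=_reverse_dns):
    """Return a host name for `target`.

    If `target` is an IP literal it is reverse-resolved first; anything
    else is assumed (not verified) to already be a host name.
    """
    try:
        ipaddress.ip_address(target)
    except ValueError:
        # Not an IP literal, so pass it through unchanged.
        return target
    return reverse_lookup(target)
```

Passing `$HOSTNAME$` directly, as in the command above, remains the simpler fix; a wrapper like this only guards against an address slipping in again.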
> 
> John
> 
> --- John Saalwaechter <johnsaalwaechter at yahoo.com> wrote:
> 
> 
>>For what it's worth, my qmaster system has this same problem
>>all the time.  I'd say that more than 50% of the time any
>>SGE command run from the qmaster results in the error message.
>>This problem only happens on our qmaster, so I've worked around
>>it by always using another host to do SGE admin work.
>>
>>Of note is the fact that this is a SPARC V880 running Solaris 9
>>and N1GE 6.0u4.  Like Chris, I've checked and rechecked all
>>DNS and /etc/hosts entries, but I cannot find any problems there.
>>
>>Chris -- can you explain in more detail your comments below
>>about /etc/hosts?  My system is not behind any private network,
>>but we do have a private link on this host for NFS connectivity
>>to $SGE_ROOT.
>>
>>When I get the error, it's also accompanied by this:
>>ERROR: unable to contact qmaster using port 537 on host "xxx"
>>
>>John
>>
>>--- Chris Dagdigian <dag at sonsorol.org> wrote:
>>
>>
>>>I got lucky today.
>>>
>>>For the first time ever on a non-Apple OS X system I was able to  
>>>recreate the mysterious
>>>
>>>  access denied (client IP resolved to host name "". This is not  
>>>identical to clients host name "")
>>>
>>>... error
>>>
>>>To make things more fun, the error condition also produces
>>>another bug-worthy case of non-compliant XML output: the empty "<>"
>>>tags break automated XML parsers.
>>>
>>>Check this out:
>>>
>>>
>>>>[dag@test xmlqstat]$ qstat -f -xml -j 1
>>>>error: commlib error: access denied (client IP resolved to host name "". This is not identical to clients host name "")
>>>><?xml version='1.0'?>
>>>><comunication_error xmlns:xsd="http://www.w3.org/2001/XMLSchema">
>>>>  <>
>>>>    <AN_status>11</AN_status>
>>>>    <AN_text>unable to contact qmaster using port 701 on host "test.gridengine.info"</AN_text>
>>>>    <AN_quality>0</AN_quality>
>>>>  </>
>>>></comunication_error>
>>>>*** glibc detected *** double free or corruption (fasttop): 0x0000000040254440 ***
>>>>Aborted
>>>>[dag@test xmlqstat]$
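To illustrate why the empty "<>" tags are a problem, here is a small Python sketch that feeds a shortened copy of the output above to a standard XML parser. The helper `is_well_formed` and the constant `BROKEN` are hypothetical names for this example, and the `item` tag used in the usage note is an arbitrary placeholder, not what SGE is supposed to emit.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the qstat output quoted above, empty tags included.
BROKEN = """<?xml version='1.0'?>
<comunication_error xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <>
    <AN_status>11</AN_status>
    <AN_quality>0</AN_quality>
  </>
</comunication_error>"""


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False on a parse error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True
```

With these definitions `is_well_formed(BROKEN)` is false, while the same text with the empty tags given any name (for instance `BROKEN.replace("</>", "</item>").replace("<>", "<item>")`) parses cleanly.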
>>>
>>>This was in the qmaster messages spool file:
>>>
>>>>05/04/2006 17:39:43|qmaster|test|E|commlib error: local host name error (can't resolve client IP address)
>>>
>>>This is on a single-CPU Opteron system running CentOS 4 and SGE
>>>courtesy binaries downloaded about 30 minutes ago (SGE 6.0u7).
>>>
>>>This system has good DNS and working utilbin/ binaries, but it did not
>>>have an entry in /etc/hosts with the public IP and fully qualified
>>>hostname.
>>>
>>>Shortly after making the /etc/hosts entry the problem went away.
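The error text compares the name obtained from the client's IP with the client's own host name, so one quick sanity check is whether a name survives a forward-then-reverse lookup. Below is a minimal Python sketch of that check. It assumes only the primary name returned by the reverse lookup matters (how commlib actually compares names is not shown in this thread), and `resolution_mismatch` with its `forward` and `reverse` parameters is a hypothetical helper.

```python
import socket


def resolution_mismatch(hostname,
                        forward=socket.gethostbyname,
                        reverse=socket.gethostbyaddr):
    """Return None if hostname -> IP -> hostname round-trips, else a description.

    Assumption: only the primary name from the reverse lookup is compared,
    case-insensitively; aliases are ignored.
    """
    ip = forward(hostname)
    resolved = reverse(ip)[0]
    if resolved.lower() == hostname.lower():
        return None
    return f"{hostname} -> {ip} -> {resolved}"
```

On an affected host this could be called as `resolution_mismatch(socket.getfqdn())`; a result other than `None` suggests that /etc/hosts or DNS disagree about the host's name.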
>>>
>>>In my experience with this error in the past, it's always been a
>>>transient "comes and goes" issue. I'm hoping the /etc/hosts addition
>>>resolved the problem, but it would also be nice if it did not, since
>>>this is a testing box that I can use for further tracing and
>>>debugging if needed. I'm also going to see if I can find the bits of
>>>source code that may be producing the bad XML output for this error
>>>condition.
>>>
>>>-Chris
>>>
>>>---------------------------------------------------------------------
>>>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>
>>
>>--
>>johnsaalwaechter at yahoo.com
>>
>>__________________________________________________
>>Do You Yahoo!?
>>Tired of spam?  Yahoo! Mail has the best spam protection around 
>>http://mail.yahoo.com 
>>

-- 
Christian Reissmann    Tel: +49 (0)941 3075 112  mailto:crei at sun.com
Software Engineer      Fax: +49 (0)941 3075 222  http://www.sun.com/gridengine
Sun Microsystems GmbH, Dr.-Leo-Ritter-Str. 7,
D-93049 Regensburg,    Tel: +49 (0)941 3075 0




