[GE users] reresolve hostname failed: can't resolve hostname with SGE 6.1

Reuti reuti at staff.uni-marburg.de
Mon Jun 2 14:08:47 BST 2008


On 02.06.2008 at 15:06, Sam Skipsey wrote:

> Reuti wrote:
>> Hi,
>> On 02.06.2008 at 14:50, Sam Skipsey wrote:
>>> I have the following problem:
>>>
>>> We have a cluster, most of which is running on RHEL4 64bit,  
>>> including the qmaster, and have lx26-amd64 SGE installed.
>>> We also have a couple of nodes which have to run RHEL3 32bit for  
>>> compatibility reasons. These, of course, run the relevant release  
>>> of SGE, but for lx24_x86. They are configured as submit hosts.
>>>
>>> This setup worked perfectly with SGE 6.0.
>>>
>>> Recently, we upgraded to SGE 6.1, and also introduced a shadow  
>>> host for more stability - the new qmaster and shadow are on  
>>> different machines to the old 6.0 qmaster (which is now removed).
>>>
>>> On the lx24_x86 nodes, we now get:
>>>
>>> # qstat -f
>>> reresolve hostname failed: can't resolve host name
>>>
>>> and the same for other SGE commands.
>>>
>>> Interestingly,
>>>
>>> $SGE_ROOT/default/common/act_qmaster contains the hostname of the  
>>> qmaster (we checked)
>>> and
>>> $SGE_ROOT/utilbin/lx24_x86/gethostbyname <hostname of qmaster>
>>> returns the correct IP
>>> similarly,
>>> $SGE_ROOT/utilbin/lx24_x86/gethostbyaddr <IP of qmaster>
>>> returns the correct hostname.
>>>
>>> Does anyone have any ideas, since we appear to have ruled out the  
>>> obvious network issues?
>> just out of curiosity: what is the entry in /etc/nsswitch.conf? Ping  
>> and all the other stuff is working?
>
> Ping and all that stuff was working, yes.
>
>> Any orphaned entry in $SGE_ROOT/default/common/host_aliases?
>
> No - actually, someone's random fiddling fixed matters just after I  
> sent this mail.
>
> Our RHEL3 boxes have domain names of the form  
> name1.name2.domainname (that is, their "hostname" is actually in a  
> subdomain of the cluster domain). Originally, our /etc/hosts had a  
> line
>
> <routable IP>  name1.name2.domainname
>
> which worked for SGE 6.0
>
> For 6.1, it seems we need
>
> <routable IP>  name1.name2.domainname name1
>
> We have no idea why this works, but it does. (Thanks for the  
> suggestions, though.)

During installation of SGE you can configure SGE to use the FQDN or  
only the hostname. Maybe it's related to this.
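
To illustrate the /etc/hosts change quoted above, here is a minimal Python sketch that checks whether the line carrying a given FQDN also lists the short hostname as an alias. It is not how SGE itself resolves names; the function names and the 192.0.2.10 address are placeholders made up for the example, and the host name is the anonymized one from the mail.

```python
def host_names(hosts_text, fqdn):
    """Return every name found on the hosts-file lines that mention fqdn."""
    names = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0]   # drop trailing comments
        fields = line.split()          # fields[0] is the IP, the rest are names
        if len(fields) >= 2 and fqdn in fields[1:]:
            names.extend(fields[1:])
    return names


def has_short_alias(hosts_text, fqdn):
    """True if the short hostname (first label of fqdn) is listed with fqdn."""
    short = fqdn.split(".", 1)[0]
    return short in host_names(hosts_text, fqdn)


# Placeholder entries modelled on the two variants from the mail.
# 192.0.2.10 is a documentation address, not the real routable IP.
old_entry = "192.0.2.10  name1.name2.domainname"
new_entry = "192.0.2.10  name1.name2.domainname name1"

print(has_short_alias(old_entry, "name1.name2.domainname"))
print(has_short_alias(new_entry, "name1.name2.domainname"))
```

Running the same check against the real /etc/hosts content on each affected node would show which hosts still lack the short alias.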

-- Reuti
