[GE users] How to clear internal hostname cache?

Kim Leng Goh kimleng.goh at gmail.com
Tue Mar 21 11:55:10 GMT 2006


Hi Andy,

The output is as follows:

[root@compute-0-7 lx24-amd64]# ./gethostbyname compute-0-7.local
Hostname: compute-0-7.local
Aliases:  compute-0-7
Host Address(es): 10.255.255.247

[root@compute-0-7 lx24-amd64]# ./gethostbyname compute-0-7
Hostname: compute-0-7.local
Aliases:  compute-0-7
Host Address(es): 10.255.255.247

[root@compute-0-7 lx24-amd64]# ./gethostbyaddr 10.255.255.247
Hostname: compute-0-7.local
Aliases:  compute-0-7
Host Address(es): 10.255.255.247
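
The three lookups above can also be scripted. What follows is a minimal
Python sketch, not part of Grid Engine: it goes through the system resolver
via the standard socket module, which may not match exactly what the SGE
utilbin binaries do. The name check_host and its injectable forward/reverse
parameters are hypothetical, added only so the logic can be exercised
without a live network.

```python
import socket


def check_host(name, forward=socket.gethostbyname_ex, reverse=socket.gethostbyaddr):
    """Resolve `name` forward, reverse each address, and collect mismatches.

    Returns (canonical_name, addresses, problems), where problems is a list
    of (address, reverse_name) pairs whose reverse lookup does not give the
    canonical name back.
    """
    canonical, _aliases, addresses = forward(name)
    problems = []
    for addr in addresses:
        reverse_name, _rev_aliases, _rev_addrs = reverse(addr)
        if reverse_name != canonical:
            problems.append((addr, reverse_name))
    return canonical, addresses, problems
```

Running check_host("compute-0-7.local") on both the qmaster host and the
execution host and comparing the results would show whether the two
machines resolve the name differently.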


In case you are wondering how "network-0-0" came into the picture:
https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2006-March/017441.html

Thread: https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2006-March/thread.html#17441


I guess "network-0-0" is defined somewhere, but I'm at my wits' end.
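
One way to hunt for a stray name like this is to search the resolver-related
files directly. A minimal sketch, assuming the name sits in a plain-text
file: find_name is a hypothetical helper, and it will not find names served
by DNS or another name service.

```python
from pathlib import Path


def find_name(name, paths):
    """Return (path, line_number, line) for every line that contains `name`."""
    hits = []
    for path in map(Path, paths):
        try:
            lines = path.read_text(errors="replace").splitlines()
        except OSError:
            continue  # missing or unreadable file: skip it
        for number, line in enumerate(lines, 1):
            if name in line:
                hits.append((str(path), number, line.strip()))
    return hits
```

For example, find_name("network-0-0", ["/etc/hosts", "/etc/nsswitch.conf"])
covers the two files the installer mentions; any further paths are guesses
that depend on the distribution.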


Thanks,
KL


On 3/21/06, Andy Schwierskott <andy.schwierskott at sun.com> wrote:
> Kim Leng,
>
> qmaster sees the execution host 10.255.255.247 as host "network-0-0.local".
>
> The reason can be errors in hostname resolution, as Chris wrote, or that the
> execution host has several network interfaces.
>
> What's the output on qmaster host when you enter:
>
>     <sge-root>/utilbin/<arch>/gethostbyname compute-0-7.local
>     <sge-root>/utilbin/<arch>/gethostbyname compute-0-7
>     <sge-root>/utilbin/<arch>/gethostbyaddr 10.255.255.247
>
> Andy
>
>
>
> On Tue, 21 Mar 2006, Kim Leng Goh wrote:
>
> > Hi Andy,
> >  I do not have "127.0.0.1   localhost  compute-0-7.local" but:
> >
> > [root@compute-0-7 root]# head -5 /etc/hosts
> > # Do not remove the following line, or various programs
> > # that require network functionality will fail.
> > 127.0.0.1 localhost.localdomain localhost
> > 172.18.36.248 frontend.foo.com
> > 10.255.255.247  compute-0-7.local  compute-0-7
> >
> >
> > I changed the last line above to "10.255.255.247  compute-0-7" and
> > "qstat -f" still returns:
> >
> > [root@compute-0-7 root]# qstat -f
> > denied: host "network-0-0.local" is neither submit nor admin host
> >
> >
> > Thanks,
> > KL
> >
> > On 3/21/06, Andy Schwierskott <andy.schwierskott at sun.com> wrote:
> >> Hi,
> >>
> >> the message
> >>
> >>>> This host has the local hostname >compute-0-7.local<.
> >>
> >> indicates that /etc/hosts lists the actual hostname as an alias on the
> >> loopback line, for example
> >>
> >>    127.0.0.1   localhost  compute-0-7.local
> >>
> >> as happens in some Linux distributions.
> >>
> >> Delete "compute-0-7.local" and any other names except "localhost" from
> >> that line and you'll be fine regarding this error.
> >>
> >> Andy
> >>
> >>
> >>
> >> On Tue, 21 Mar 2006, Chris Dagdigian wrote:
> >>
> >>>
> >>> I'm willing to bet that this hostname is defined somewhere on your system.
> >>> I've wrestled with SGE hostname resolution issues on many clusters and in
> >>> many complicated network, hostname and DNS resolving environments and the
> >>> root cause for name issues was *always* external and not within SGE.
> >>>
> >>> I've also not seen caching activity do anything significant when making
> >>> changes -- when I've fixed DNS or nameservice mistakes they are quickly
> >>> picked up by SGE.
> >>>
> >>> You did not mention testing with the "gethostname" and "gethostbyaddr" and
> >>> the other utility binaries that should be in /opt/gridengine/utilbin/<arch>
> >>> on your system. Try running those directly to see what SGE sees. After that,
> >>> carefully make sure that what is in /etc/hosts matches what is being returned
> >>> by forward and reverse DNS. Depending on your operating system there can also
> >>> be other files and locations where hardcoded hostnames may be lying around.
> >>>
> >>>
> >>> -Chris
> >>>
> >>>
> >>>
> >>>
> >>> On Mar 21, 2006, at 4:46 AM, Kim Leng Goh wrote:
> >>>
> >>>> Hi Christian,
> >>>>   Thanks for the speedy reply.
> >>>>
> >>>> On 3/21/06, christian reissmann <Christian.Reissmann at sun.com> wrote:
> >>>> [...]
> >>>>>
> >>>>> The cl_commlib.c module was developed for 6.0! The 5.3p6 version uses
> >>>>> sge_commd to resolve hostnames and has no cache at all.
> >>>>> So I don't understand the question.
> >>>> [...]
> >>>>
> >>>> My problem is that SGE seems to think that my compute-0-7 node has the
> >>>> hostname "network-0-0.local" when in fact it doesn't (which prompted me
> >>>> to think that this was in some cache somewhere or stored somewhere
> >>>> else):
> >>>>
> >>>> [root@compute-0-7 root]# qstat -f
> >>>> denied: host "network-0-0.local" is neither submit nor admin host
> >>>>
> >>>>
> >>>> Reinstalling sge on the compute node or reinstalling the compute node
> >>>> doesn't seem to help:
> >>>>
> >>>>
> >>>> [root@compute-0-7 gridengine]# ./install_execd -auto
> >>>>
> >>>> Confirm Grid Engine default installation settings
> >>>> -------------------------------------------------
> >>>>
> >>>> The following default settings can be used for an accelerated
> >>>> installation procedure:
> >>>>
> >>>>       $SGE_ROOT          = /opt/gridengine
> >>>>       service            = sge_commd
> >>>>       admin user account = sge
> >>>>
> >>>> Do you want to use these configuration parameters (y/n) [y] >>
> >>>> denied: host "network-0-0.local" is neither submit nor admin host
> >>>>
> >>>>
> >>>>
> >>>> Checking hostname resolving
> >>>> ---------------------------
> >>>> denied: host "network-0-0.local" is neither submit nor admin host
> >>>>
> >>>> denied: host "network-0-0.local" is neither submit nor admin host
> >>>>
> >>>>
> >>>> This host has the local hostname >compute-0-7.local<.
> >>>>
> >>>> This host is unknown on the qmaster host.
> >>>>
> >>>> Please make sure that you added this host as administrative host!
> >>>> If you did not, please add this host now with the command
> >>>>
> >>>>    # qconf -ah HOSTNAME
> >>>>
> >>>> on your qmaster host.
> >>>>
> >>>> Check again (y/n) [y] >>
> >>>>
> >>>> Checking hostname resolving
> >>>> ---------------------------
> >>>> denied: host "network-0-0.local" is neither submit nor admin host
> >>>>
> >>>> denied: host "network-0-0.local" is neither submit nor admin host
> >>>>
> >>>>
> >>>> This host has the local hostname >compute-0-7.local<.
> >>>>
> >>>> This host is unknown on the qmaster host.
> >>>>
> >>>> Please make sure that you added this host as administrative host!
> >>>> If you did not, please add this host now with the command
> >>>>
> >>>>    # qconf -ah HOSTNAME
> >>>>
> >>>> on your qmaster host.
> >>>>
> >>>> If this host is already added as administrative host on your qmaster host
> >>>> there may be a hostname resolving problem on this machine.
> >>>>
> >>>> Please check your >/etc/hosts< file and >/etc/nsswitch.conf< file.
> >>>>
> >>>> Hostname resolving problems will cause the problem that the
> >>>> execution host will not be accepted by qmaster. Qmaster will
> >>>> receive no load report values and show a load value
> >>>> (>load_avg<) of 99.99 for this host.
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> >>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
> >>>>
> >>>
> >>>
> >>>
> >>
> >
> >
> >
>
>
>
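
Andy's check of the loopback line, quoted above, can be sketched as follows.
This is only an illustration: loopback_extras is a hypothetical helper, and
treating "localhost.localdomain" as acceptable alongside "localhost" is an
assumption based on the stock /etc/hosts shown in the thread.

```python
def loopback_extras(hosts_text):
    """Return names on 127.0.0.1 lines other than the localhost variants."""
    # Assumption: these two names are the only ones that belong on loopback.
    allowed = {"localhost", "localhost.localdomain"}
    extras = []
    for raw in hosts_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, split on whitespace
        if fields and fields[0] == "127.0.0.1":
            extras.extend(name for name in fields[1:] if name not in allowed)
    return extras
```

Feeding it the contents of /etc/hosts on the execution host gives an empty
list when the loopback line carries no extra names.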




