[GE users] problems with reaching hosts

Reuti reuti at staff.uni-marburg.de
Fri Oct 17 16:35:47 BST 2008


Hi,

On 17.10.2008 at 11:14, k.radacki wrote:

> Dear All,
> after some 8 faithful years of service, the time has come to replace
> our queue server (the old dual P3-550 board is being pensioned off :-)
> The compute nodes stayed more or less the same. I've installed SGE 6.2.
> Unfortunately, the execution hosts were unavailable for computations.
>
> qstat -f
> all.q@hall13.khazad.dum        BIP   0/0/4          -NA-     -NA-          a
> all.q@hall14.khazad.dum        BIP   0/0/4          -NA-     -NA-          a
>
> In .../spool/../messages I've found:
> local configuration localhost.localdomain not defined - using global configuration
> main|hall13|I|starting up SGE 6.2 (lx24-amd64)
> main|hall13|E|can't connect to service
> main|hall13|E|can't get configuration from qmaster -- backgrounding
>
> The hostname command on all nodes gives the "proper" name:
> [root@hall13 ~]# hostname
> hall13.khazad.dum
>
> Now I commented out the following line in /etc/hosts
> # 127.0.0.1     localhost.localdomain   localhost
> and the queue works:
> all.q@hall13.khazad.dum        BIP   0/0/4          0.10     lx24-amd64
> all.q@hall14.khazad.dum        BIP   0/0/4          0.00     lx24-amd64
>
> Can somebody explain to me why SGE uses the "wrong" host name, and what
> I should do to correct this behaviour?
> I'm not that happy with commenting out localhost in the /etc/hosts file.
> Who knows what other problems I will get with network services.

The loopback device is often used, and removing it might lead to weird
behavior, I fear. Did you check beforehand with the utility programs in
$SGE_ROOT/utilbin/$ARC, like gethostbyaddr et al.?
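
A rough sketch of such a check, to be run on one of the nodes. It assumes
the resolver utilities gethostname, gethostbyname and gethostbyaddr are
present under $SGE_ROOT/utilbin/<arch>, and that $SGE_ROOT/util/arch prints
the architecture string. The function name sge_name_check and its argument
are made up for illustration:

```shell
# Hypothetical helper: show what the OS and the SGE resolver utilities
# report for this host. Assumes a standard layout under $SGE_ROOT.
sge_name_check() {
    ip="$1"    # IP address of the node's real (non-loopback) interface

    # Bail out if the SGE installation cannot be located.
    if [ -z "$SGE_ROOT" ] || [ ! -x "$SGE_ROOT/util/arch" ]; then
        echo "SGE_ROOT not set or util/arch missing; skipping" >&2
        return 2
    fi

    bindir="$SGE_ROOT/utilbin/$("$SGE_ROOT/util/arch")"

    echo "== hostname (OS) =="
    hostname
    echo "== gethostname (SGE) =="
    "$bindir/gethostname"
    echo "== gethostbyname (SGE) =="
    "$bindir/gethostbyname" "$(hostname)"
    echo "== gethostbyaddr (SGE) =="
    "$bindir/gethostbyaddr" "$ip"
}
```

Called as "sge_name_check <node-ip>", all four answers should name the same
host. If one of the SGE utilities reports localhost.localdomain, the name
resolution on that node is the place to look.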

The question is rather: why does SGE think that the name of the
machine is localhost.localdomain at all? Were the nodes newly
installed? Maybe SGE is started before the network is up, and as no NIS
answer is available, it falls back to localhost. I always put the SGE
startup at the end of the boot sequence. What is the order in
/etc/nsswitch.conf, and where do local files come in it?
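
To read off that order quickly, something along these lines should do. It
is only a sketch: the function name hosts_lookup_order is made up, and it
assumes the usual "hosts: <source> <source> ..." layout with whitespace
after the colon:

```shell
# Hypothetical helper: print the lookup sources configured for "hosts"
# in an nsswitch.conf-style file (defaults to /etc/nsswitch.conf).
hosts_lookup_order() {
    # Take the first line whose first field is "hosts:", drop that field,
    # and print the remaining sources in their configured order.
    awk '$1 == "hosts:" { $1 = ""; sub(/^ +/, ""); print; exit }' \
        "${1:-/etc/nsswitch.conf}"
}
```

If "files" does not come first in the output, /etc/hosts is only consulted
after the other sources have been tried.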

-- Reuti


> have a nice weekend
> Kris
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>

