[GE users] au state on new machine

RRay at semtech.com
Thu Jun 5 18:32:37 BST 2008


Reuti <reuti at staff.uni-marburg.de> wrote on 06/05/2008 11:44:06 AM:

> Hi,
> 
> Am 05.06.2008 um 17:21 schrieb RRay at semtech.com:
> 
> > We just installed three machines using the same install procedure 
> > for all three.  Only one of them comes up ok, the other two have au 
> > in the status for qstat -f.  All three machines are identical 
> > (CentOS 4.6 64bit, all updates applied).  I can't seem to get rid 
> > of the au state.
> >
> > qstat -explain a returns
> > all.q@us02farm6.semnet.dom     BIP   0/2       -NA-     -NA-          au
> >        error: no value for "np_load_avg" because execd is in unknown state
> >
> >
> > 1.  I have stopped and restarted sgeexecd several times.
> > 2.  I have rebooted the machine and no luck
> > 3.  I've checked /etc/services to make sure the correct ports are 
> > listed
> > 4.  I've scratched my head several times in puzzlement
> >
> > Anyone know what is wrong and how to fix it?  What haven't I done?
> 
> a) Can you ping the execd machines from the qmaster machine, i.e. are 
> the TCP/IP addresses consistent on all machines?

ping works to all machines

> 
> b) no firewall is blocking SGE's ports (new distributions use 
> 6444/6445)

No firewall and ports available
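
For anyone repeating check b), here is a minimal sketch of a TCP reachability test. It assumes the ports Reuti mentions (6444/6445); the function name and the host name in the usage comment are placeholders, not anything from our setup. It only tells you whether a TCP connection can be opened, not whether the daemon behind the port is healthy.

```python
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Hypothetical usage from an execd host ("qmaster.example.com" is a placeholder):
#   port_reachable("qmaster.example.com", 6444)
```

In our case this kind of check passed, so the cause had to be elsewhere.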
> 
> c) one machine has the qmaster and execd, the other two only the execd?

Only one qmaster, all others execd.


We found the problem in the /etc/hosts file.  The localhost line also
listed the machine's own hostname, so us02farm6 appeared twice:
127.0.0.1       localhost       us02farm6
192.168.170.31  us02farm6

After changing the first line to
127.0.0.1       localhost

everything is working now.
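
A hosts file with this kind of entry can be spotted with a short script. This is only a sketch: the function name and the set of names treated as legitimate on a loopback line are my own assumptions, so adjust them for your distribution.

```python
import ipaddress

# Names assumed to be legitimate on a loopback line (an assumption; extend as needed).
LOOPBACK_NAMES = {
    "localhost", "localhost.localdomain",
    "localhost4", "localhost4.localdomain4",
    "localhost6", "localhost6.localdomain6",
    "ip6-localhost", "ip6-loopback",
}


def loopback_aliases(hosts_text):
    """Return (line_number, address, name) for each non-localhost name on a loopback line."""
    findings = []
    for lineno, raw in enumerate(hosts_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not an IP address; skip the line
        if not addr.is_loopback:
            continue
        for name in fields[1:]:
            if name.lower() not in LOOPBACK_NAMES:
                findings.append((lineno, fields[0], name))
    return findings


# The two lines from the broken /etc/hosts quoted above.
sample = "127.0.0.1       localhost       us02farm6\n192.168.170.31  us02farm6\n"
for lineno, addr, name in loopback_aliases(sample):
    print(f"line {lineno}: {name} is mapped to loopback address {addr}")
```

To check a real machine, pass the contents of /etc/hosts to loopback_aliases() instead of the sample string. An empty result means no hostname other than the localhost variants sits on a loopback line.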


> 
> -- Reuti
> 


