[GE users] act_qmaster contents changing

robhorton r.horton at qmul.ac.uk
Fri Nov 27 09:50:35 GMT 2009


Thanks for your comments.

On Thu, 2009-11-26 at 17:27 +0100, petrik wrote:
> > After installation, act_qmaster contains "taurus.local" (which is
> > the /etc/hosts entry for the private interface). 
> You later say that /usr/local/sge/utilbin/lx24-amd64/gethostname does 
> not return this value. Correct? What does the gethostname -aname return? 
> Attach also output of gethostname -all.

Yes. I get the following (<fqdn> stands in for the real fqdn):

[root@taurus ~]# /usr/local/sge/utilbin/lx24-amd64/gethostname -aname

[root@taurus ~]# /usr/local/sge/utilbin/lx24-amd64/gethostname -all
Hostname: taurus.<fqdn>
SGE name: taurus.<fqdn>
Host Address(es): <public ip>

> > =============================================================
> > [root@taurus ~]# /usr/local/sge/default/common/sgemaster start
> >
> > sge_qmaster didn't start!
> > This is not a qmaster host!
> > Please, check your act_qmaster file!
> >   
> This happens when gethostname -aname and act_qmaster do not match.

I understand that, but I'm not sure why the contents of act_qmaster
revert each time the qmaster is restarted.
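
For anyone wanting to reproduce the comparison by hand, here is a minimal
sketch of such a check. It only approximates the test described above: it
compares the two strings after stripping whitespace, which may not be exactly
what sge_qmaster does internally. The function names (normalize, names_match,
check_qmaster_name) are made up for illustration and are not part of SGE.

```shell
#!/bin/sh
# Sketch only: approximates the "act_qmaster vs. gethostname -aname" check.
# The real comparison inside sge_qmaster may differ (assumption).

# Strip all whitespace so trailing newlines or spaces do not cause
# false mismatches.
normalize() {
    printf '%s' "$1" | tr -d '[:space:]'
}

# Succeeds only when both names are non-empty and identical after
# normalizing.
names_match() {
    a=$(normalize "$1")
    b=$(normalize "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# $1: path to the act_qmaster file
# $2: path to the gethostname utility shipped with SGE
check_qmaster_name() {
    recorded=$(cat "$1") || return 2
    resolved=$("$2" -aname) || return 2
    if names_match "$recorded" "$resolved"; then
        echo "match: $(normalize "$recorded")"
    else
        echo "mismatch: act_qmaster='$(normalize "$recorded")' gethostname='$(normalize "$resolved")'"
        return 1
    fi
}
```

With the paths from this thread, it would be called as:

    check_qmaster_name /usr/local/sge/default/common/act_qmaster \
        /usr/local/sge/utilbin/lx24-amd64/gethostname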

> Does the public and local name differ only in the domain? There've been 
> some bugfixes in this area for the upcoming 6.2u5.


Reuti's suggestion of adding a $SGE_ROOT/default/common/host_aliases
entry seems to make it behave as expected (thanks), although this
doesn't seem to have been necessary on other clusters with a similar
setup, so I'd still be interested to know what's happening.
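
For reference, a sketch of what such an entry might look like. In a
host_aliases file each line is a whitespace-separated list of names, and the
first name on the line is the one all the others are mapped to. Which name
should come first on this cluster is an assumption in the example below, and
<fqdn> remains a placeholder for the real domain. alias_line is a hypothetical
helper, not an SGE tool.

```shell
#!/bin/sh
# Hypothetical helper: print one host_aliases line.
# The first argument is the name everything else maps to;
# the remaining arguments are its aliases.
alias_line() {
    primary=$1
    shift
    printf '%s' "$primary"
    for name in "$@"; do
        printf ' %s' "$name"
    done
    printf '\n'
}

# Assumption: the public name is the primary one and the private
# /etc/hosts name is the alias. <fqdn> is left as a placeholder.
alias_line 'taurus.<fqdn>' 'taurus.local'
```

The printed line is what would go into $SGE_ROOT/default/common/host_aliases.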


