[GE users] SGE 6.0 host_aliases question

John Marshall John.Marshall at ec.gc.ca
Thu Aug 12 16:09:17 BST 2004


It seems that the behavior of SGE 6.0 is
different from SGE 5.3 with regard to
host_aliases and act_qmaster.

Currently, we support a failover system
by having a dual-connected disk with the
cell information stored on the disk. All
other gridengine files are local to each
machine. If the qmaster host fails, the
second host takes over the files (mount)
and the IP alias.

To do this, we put the hostname associated
with the IP alias in the act_qmaster file,
and the host_aliases file contained something
like this:
	qmaster.domain  hostA.domain hostB.domain

This worked well under SGE 5.3. But under
6.0, it seems that the qmaster wants to
update the act_qmaster file with the
actual/real hostname (e.g., hostA.domain)
rather than leave it as qmaster.domain.
If act_qmaster is allowed to be modified,
then this causes a problem for the schedd
because it reads hostA.domain as the
qmaster hostname and cannot connect.
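
To make the intended mapping concrete, here is a
minimal sketch in Python of how I read the
host_aliases format: the first name on a line is
the unique name, and every other name on that line
maps to it. The function names are hypothetical and
this is not Grid Engine's own code, only an
illustration of the lookup I expect.

```python
def parse_host_aliases(text):
    """Map every name on a line to the first name on that line."""
    mapping = {}
    for line in text.splitlines():
        names = line.split()
        if not names:
            continue  # skip blank lines
        unique = names[0]
        for name in names:
            mapping[name] = unique
    return mapping


def resolve(mapping, host):
    """Return the unique name for host, or host itself if unlisted."""
    return mapping.get(host, host)


# The single line from our host_aliases file.
aliases = parse_host_aliases("qmaster.domain  hostA.domain hostB.domain\n")

# Under this reading, either physical host resolves to the alias name.
print(resolve(aliases, "hostA.domain"))
```

Under this reading, hostA.domain in act_qmaster
should still be treated as qmaster.domain.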

Does anyone see anything suspect with what
I am doing under 6.0?

I have tried removing write access to the
act_qmaster file so that its contents do not
change, but qmaster does not accept this and quits.

Note the output from 'gethostname -aname' is
something like:

