[GE users] Odd behavior with act_qmaster - file contents change
dag at sonsorol.org
Wed Oct 22 16:37:39 BST 2008
The act_qmaster file lists the hostname of the currently running
qmaster system. The SGE clients read this file at startup to learn
what host to connect to. The only reason the hostname would change
automatically would be if you had configured shadow masters - in that
case if the qmaster can't be contacted within a timeout period, a new
qmaster starts up, reads the spool and then writes its hostname to the
act_qmaster file.
If you are looking for docs on this I'd look for "shadow master".
Sounds like you did not intentionally set up or expect to see failover.
If shadow master is not the culprit then the only other reason would
be that someone manually tried to start the qmaster on a different
host -- you'd see the same symptoms then (new hostname in act_qmaster).
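If it helps, here is a rough Python sketch of the check I would do first: read
act_qmaster, compare it with the host you expect to be the master, and list
whatever is in the shadow_masters file. Both files normally live under
$SGE_ROOT/$SGE_CELL/common. The fallback root "/opt/sge", the hostname
"master.example.com" and the function names are just placeholders I made up
for illustration, so adjust them for your site.

```python
import os
from pathlib import Path


def read_hosts(path):
    """Return the non-empty, non-comment lines of a file ([] if it is unreadable)."""
    try:
        with open(path) as fh:
            lines = [line.strip() for line in fh]
    except OSError:
        return []
    return [line for line in lines if line and not line.startswith("#")]


def check_act_qmaster(common_dir, expected_master):
    """Compare act_qmaster with the expected master and report shadow_masters."""
    common = Path(common_dir)
    hosts = read_hosts(common / "act_qmaster")
    act = hosts[0] if hosts else None
    # Plain case-insensitive string comparison; a short name and an FQDN for
    # the same machine will NOT match here.
    matches = act is not None and act.lower() == expected_master.lower()
    return {
        "act_qmaster": act,
        "matches_expected": matches,
        # This list only matters if sge_shadowd is actually running on the hosts.
        "shadow_masters": read_hosts(common / "shadow_masters"),
    }


if __name__ == "__main__":
    sge_root = os.environ.get("SGE_ROOT", "/opt/sge")  # assumed fallback path
    sge_cell = os.environ.get("SGE_CELL", "default")
    common_dir = Path(sge_root) / sge_cell / "common"
    # "master.example.com" is a hypothetical hostname.
    print(check_act_qmaster(common_dir, "master.example.com"))
```

If matches_expected comes back False and the shadow_masters list is empty,
failover is an unlikely explanation and a manual qmaster start on the other
host becomes the more probable one.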
On Oct 22, 2008, at 11:27 AM, Brett W Grant wrote:
> I am running 6.1 on a cluster of macs. All but two of the macs are
> 10.4 Tiger OS, two are the 10.5.5 Leopard. At 7:16 local this
> morning, the contents of the act_qmaster file changed from the qmaster to
> one of these Leopard macs. In the spool/qmaster/messages file at
> 7:17 there is a message about a corrupted database detected, and
> then a DB_RUNRECOVERY message and then a number of messages where
> gethostbyname fails.
> If I look at the messages file on the host that was named in the
> self-modified act_qmaster file, it simply says at 7:20 that it
> couldn't connect to service.
> There was no longer a sgemaster process running on the original
> qmaster host.
> This system has been running just fine for over a year, however, I
> did add the two leopard clients about 1 month ago, but they have
> been working fine since then.
> I guess that I don't really understand what the act_qmaster file is
> for. I didn't see an entry in the Manual section. How could it
> change by itself? What should I do to prevent this from happening
> in the future? Where else can I look to see what happened? I
> didn't see anything at all in the system logs.
> Brett Grant