[GE users] Shadow master

heywood heywood at cshl.edu
Fri May 7 14:37:30 BST 2010



No, we're using classic spooling. Failover to a shadow master worked with
classic spooling on an earlier 6.2 (maybe 6.1) version.
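
For anyone who wants to confirm which spooling method a cell is actually using, here is a minimal sketch. It assumes the cell's bootstrap file (on our layout that would be /opt/sge/default/common/bootstrap) holds whitespace-separated "key value" lines including a spooling_method entry; the function name is just something I made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the spooling method recorded in a Grid Engine
# bootstrap file. Assumes "key value" lines, one of which looks like
# "spooling_method classic" or "spooling_method berkeleydb".
spooling_method() {
    bootstrap_file=$1
    # Print the value of the first spooling_method line; exit non-zero
    # if the file has no such line.
    awk '$1 == "spooling_method" { print $2; found = 1; exit }
         END { if (!found) exit 1 }' "$bootstrap_file"
}
```

Calling it as spooling_method /opt/sge/default/common/bootstrap should print the method name if the file has the assumed layout.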

I did see "For using a shadow master it is recommended to set up a
Berkeley DB Spooling Server", but that doesn't tell me that BDB is
*required*.

Are you sure it is required? When did that change?

Architecture is the same and the settings.sh is sourced.
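
In case it helps others repeat the check, this is roughly what I mean, as a sketch only. It assumes the daemons live under $SGE_ROOT/bin/<arch>/ and that $SGE_ROOT/util/arch prints the architecture string for the current host; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a shadow daemon binary exists for the
# given architecture string under the given Grid Engine root directory.
# Assumes the layout <sge_root>/bin/<arch>/sge_shadowd.
arch_installed() {
    sge_root=$1
    arch=$2
    # An empty arch string would match the bin directory itself, so reject it.
    [ -n "$arch" ] && [ -x "$sge_root/bin/$arch/sge_shadowd" ]
}
```

On the shadow host I would source /opt/sge/default/common/settings.sh first and then run arch_installed "$SGE_ROOT" "$("$SGE_ROOT/util/arch")", checking the exit status.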

Thanks,

Todd



On 5/7/10 6:30 AM, "tmacmd" <tmacmd at gmail.com> wrote:

> Are you using an RPC-based spooling DB server or is your BDB over NFSv4?
> Only those two methods allow for the use of a shadow.
> Did you source your settings.sh file?
> Is this a different arch than what you have installed?
> (look at uname -a and then look in the util dir to verify)
> 
> --tmac
>          Tim McCarthy
>      Principal Consultant
> 
>   RedHat Certified Engineer
>    804006984323821 (RHEL4)
>    805007643429572 (RHEL5)
> 
> 
> On Thu, May 6, 2010 at 2:39 PM, heywood <heywood at cshl.edu> wrote:
>> It used to be that installing the shadow master just involved putting the
>> hostname of the machine that will run the shadow master in
>> /opt/sge/default/common/shadow_masters, and then starting the shadow master
>> on that node with "/opt/sge/default/common/sgemaster -shadowd". A few SGE
>> versions ago I tested failover and it was fine. sge_qmaster runs on the main
>> head node and sge_shadowd runs on the shadow/spare head node.
>> 
>> Earlier this week the main head node was rebooted, and it appeared that
>> failover worked since the act_qmaster file was updated to hold the
>> shadow/spare node name. But SGE commands got the error that they couldn't
>> find the connection for the qmaster port. So did qping.
>> 
>> So I looked at the 6.2u5 docs, and they now say to "install" the shadow
>> master with "./inst_sge -sm". OK, maybe something changed since the shadow
>> master failover worked for us in an earlier version. But trying that, I get:
>> 
>> Creating local configuration
>> ----------------------------
>> value == NULL for attribute "mailer" in configuration list of "bhmnode1"
>> 
>> ./util/install_modules/inst_common.sh: line 261: Translate: command not found
>> 
>> ./util/install_modules/inst_common.sh: line 263: Translate: command not found
>> ./util/install_modules/inst_common.sh: line 264: Translate: command not found
>> 
>> 
>> So... How do I get shadow master failover working again?
>> 
>> Todd
>> 
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=256429
>> 
>> To unsubscribe from this discussion, e-mail:
>> [users-unsubscribe at gridengine.sunsource.net].
> 
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=256535

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


