[GE users] shadow master problems

rayson rayrayson at gmail.com
Fri Jul 16 14:16:31 BST 2010


Has the "act_qmaster" file been updated to contain the new qmaster's host name?
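
A minimal sketch of that check in Python, run from an exec node. It assumes
the file is at $SGE_ROOT/$SGE_CELL/common/act_qmaster and that the cell name
falls back to "default" when SGE_CELL is unset. The function names are made
up for illustration, and "login1" is simply the new master from your mail.

```python
import os
from pathlib import Path


def read_act_qmaster(sge_root, sge_cell="default"):
    """Return the host name stored in <sge_root>/<sge_cell>/common/act_qmaster."""
    path = Path(sge_root) / sge_cell / "common" / "act_qmaster"
    return path.read_text().strip()


def qmaster_is(sge_root, sge_cell, expected_host):
    """True if act_qmaster names expected_host (short names, case-insensitive)."""
    actual = read_act_qmaster(sge_root, sge_cell)
    # Compare only the part before the first dot, so that "login1" and
    # "login1.example.org" are treated as the same host.
    return actual.split(".")[0].lower() == expected_host.split(".")[0].lower()


if __name__ == "__main__":
    root = os.environ.get("SGE_ROOT")
    cell = os.environ.get("SGE_CELL", "default")  # assumed default cell name
    if root is None:
        print("SGE_ROOT is not set")
    else:
        try:
            print("act_qmaster says:", read_act_qmaster(root, cell))
            print("points at login1:", qmaster_is(root, cell, "login1"))
        except OSError as err:
            print("could not read act_qmaster:", err)
```

If this still reports login2 on node001 after the failover, the exec nodes
are reading a stale file, which would match the error you quoted.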

You should also consult the "Migration of Qmaster to Another Machine"
HOWTO and the "Setting Up A Shadow Master In Grid Engine" HOWTO:

http://gridengine.sunsource.net/howto/sge_migrate.html
http://gridengine.sunsource.net/howto/shadow.html

Rayson



On Fri, Jul 16, 2010 at 8:50 AM, murple <andreas.kuntzagk at mdc-berlin.de> wrote:
> Hi,
>
> Today we tested the shadow master functionality on our cluster. After making
> $SGE_ROOT/$SGE_CELL/spool shared between the two master nodes (login1 and
> login2), stopping the qmaster on login2 (the old master) and removing the lock
> file seemed to work at first: login1 noticed the stale heartbeat and started its
> own qmaster. Unfortunately, the exec nodes did not notice the change and still
> tried to contact login2.
> Error messages:
>
> main|node001|W|can't register at "qmaster": unable to contact qmaster using port
> 6444 on host "login2"
>
> Our setup is 6.2 with a shared $SGE_ROOT/$SGE_CELL/common.
> $SGE_ROOT/$SGE_CELL/spool is shared only between the login/head nodes.
>
> regards, Andreas
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=268357
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=268363
