[GE users] shadow master problems

rayson rayrayson at gmail.com
Fri Jul 16 17:05:25 BST 2010


As a test, can you restart the execd on login2 and see whether it
picks up the new qmaster?

Does qstat work at all? I'm wondering whether this is a name resolution
problem...
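
One way to run that check from an exec node is sketched below. This is
only an illustration, not part of Grid Engine: the helper names
read_act_qmaster and check_qmaster are made up for this example, and it
assumes that the qmaster listens on port 6444, as in the error message
quoted further down.

```python
import os
import socket


def read_act_qmaster(path):
    """Return the host name stored on the first line of an act_qmaster file."""
    with open(path) as fh:
        return fh.readline().strip()


def check_qmaster(host, port=6444, timeout=3.0):
    """Resolve `host` and attempt a TCP connection to `port`.

    Returns (address, reachable). `address` is None when the name does not
    resolve. Port 6444 is an assumption taken from the error message in
    this thread; adjust it if your qmaster uses a different port.
    """
    try:
        address = socket.gethostbyname(host)
    except OSError:
        # Name resolution failed (socket.gaierror is a subclass of OSError).
        return None, False
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return address, True
    except OSError:
        # The name resolved, but nothing accepted the connection in time.
        return address, False
```

Calling read_act_qmaster() on the act_qmaster file under
$SGE_ROOT/$SGE_CELL/common and passing the result to check_qmaster()
separates the two failure modes: a None address points at name
resolution, while a resolved address with reachable == False points at
the network or at the qmaster itself.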

Rayson



On Fri, Jul 16, 2010 at 9:28 AM, murple <andreas.kuntzagk at mdc-berlin.de> wrote:
> Yes, the act_qmaster is updated to contain "login1".
>
> regards, Andreas
>
> rayson wrote:
>> Is the "act_qmaster" file updated to contain the new qmaster name?
>>
>> You should also consult the "Migration of Qmaster to Another Machine"
>> HOWTO and the "Setting Up A Shadow Master In Grid Engine" HOWTO:
>>
>> http://gridengine.sunsource.net/howto/sge_migrate.html
>> http://gridengine.sunsource.net/howto/shadow.html
>>
>> Rayson
>>
>>
>>
>> On Fri, Jul 16, 2010 at 8:50 AM, murple <andreas.kuntzagk at mdc-berlin.de> wrote:
>>> Hi,
>>>
>>> today we tested the shadow master functionality on our cluster. We made
>>> $SGE_ROOT/$SGE_CELL/spool shared between the two master nodes (login1 and
>>> login2), then stopped the qmaster on login2 (the old master) and removed the
>>> lock file. At first this seemed to work: login1 noticed the stale heartbeat
>>> and started its own qmaster. Unfortunately, the exec nodes did not notice the
>>> change and still tried to contact login2.
>>> Error messages:
>>>
>>> main|node001|W|can't register at "qmaster": unable to contact qmaster using port 6444 on host "login2"
>>>
>>> Our setup is 6.2 with a shared $SGE_ROOT/$SGE_CELL/common;
>>> $SGE_ROOT/$SGE_CELL/spool is shared only between the login/head nodes.
>>>
>>> regards, Andreas
>>>
>>> ------------------------------------------------------
>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=268357
>>>
>>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>>>
>>
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=268363
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=268367
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=268386
