[GE users] shadow master problems

dmikh Dmitry.Mikhailichenko at Sun.COM
Fri Jul 16 17:20:53 BST 2010


I remember a similar bug, but it was fixed in 6.2 update 1. Which 
update of SGE are you using?
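
It may also help to check which host the exec nodes believe is the master. 
As far as I know, the daemons locate the qmaster by reading 
$SGE_ROOT/$SGE_CELL/common/act_qmaster, which the shadow daemon rewrites on 
takeover. Below is a minimal sketch for reading that file on a node, assuming 
Python is available there. The function name and the fallback cell name 
"default" are only illustrative and are not part of SGE.

```python
import os


def read_act_qmaster(sge_root, sge_cell="default"):
    """Return the host name stored in <sge_root>/<sge_cell>/common/act_qmaster."""
    path = os.path.join(sge_root, sge_cell, "common", "act_qmaster")
    with open(path) as fh:
        # The file is expected to hold a single host name; strip the newline.
        return fh.read().strip()


if __name__ == "__main__":
    root = os.environ.get("SGE_ROOT")
    cell = os.environ.get("SGE_CELL", "default")  # assumed fallback cell name
    if root:
        print("act_qmaster says:", read_act_qmaster(root, cell))
    else:
        print("SGE_ROOT is not set")
```

If this still reports login2 on node001 after login1 has taken over, the node 
is probably looking at a stale copy of the file.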


Thanks,
Dmitry

16.07.10 16:50, murple wrote:
> Hi,
>
> today we tested the shadow master functionality on our cluster. We shared 
> $SGE_ROOT/$SGE_CELL/spool between the two master nodes (login1 and 
> login2), then stopped the qmaster on login2 (the old master) and removed the 
> lock file. At first this seemed to work: login1 noticed the stale heartbeat 
> and started its own qmaster. Unfortunately, the exec nodes did not notice the 
> change and still tried to contact login2.
> Error messages:
>
> main|node001|W|can't register at "qmaster": unable to contact qmaster using port 6444 on host "login2"
>
> Our setup is 6.2 with a shared $SGE_ROOT/$SGE_CELL/common;
> $SGE_ROOT/$SGE_CELL/spool is shared only between the login/head nodes.
>
> regards, Andreas
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=268357
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=268389
