[GE users] execd behaviour in case of qmaster crash

ah_sunsource ahaupt at ifh.de
Thu Jun 11 07:59:39 BST 2009

Hi Rayson,

thanks for your reply.

On Thu, 2009-06-11 at 01:33 -0500, rayson wrote:
> You can try to manually migrate the master to another host, but the
> shadow master should automatically handle everything for you.

Well, I think the shadow master did everything correctly. It noticed the
breakdown of the real qmaster, started the qmaster process and modified

> http://gridengine.sunsource.net/howto/sge_migrate.html
> Is your $SGE_ROOT shared??

Yes. All the execd processes now see the shadow master host name in
$SGE_ROOT/$SGE_CELL/common/act_qmaster. But they don't care about it

I tested manual migration some time ago and it worked perfectly. It looks
like the execd processes don't react correctly to the partially crashed
qmaster. But I'm not sure about this theory...
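
For anyone who wants to verify the same thing on their own cluster, here is a minimal sketch of how the master host recorded in $SGE_ROOT/$SGE_CELL/common/act_qmaster could be read from any node. The function names are made up for illustration, and I am assuming the file holds the host name on its first non-empty line and that the cell is called "default" when $SGE_CELL is not set. It only shows what the file says; it does not tell you which master a running execd is actually talking to.

```python
from pathlib import Path


def act_qmaster_path(sge_root, sge_cell="default"):
    """Build the path $SGE_ROOT/$SGE_CELL/common/act_qmaster.

    The "default" cell name is an assumption for the case where
    SGE_CELL is not set.
    """
    return Path(sge_root) / sge_cell / "common" / "act_qmaster"


def read_act_qmaster(sge_root, sge_cell="default"):
    """Return the host name stored in act_qmaster.

    Assumes the host name is on the first non-empty line of the file.
    Raises ValueError if the file contains no host name at all.
    """
    text = act_qmaster_path(sge_root, sge_cell).read_text()
    for line in text.splitlines():
        host = line.strip()
        if host:
            return host
    raise ValueError("act_qmaster contains no host name")
```

Calling read_act_qmaster() with the values of $SGE_ROOT and $SGE_CELL on an execution host should return the shadow master's host name after a failover, provided $SGE_ROOT is shared.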


| Andreas Haupt             | E-Mail: andreas.haupt at desy.de
|  DESY Zeuthen             | WWW:    http://www-zeuthen.desy.de/~ahaupt
|  Platanenallee 6          | Phone:  +49/33762/7-7359
|  D-15738 Zeuthen          | Fax:    +49/33762/7-7216

