[GE users] execd behaviour in case of qmaster crash

ah_sunsource ahaupt at ifh.de
Thu Jun 11 08:57:50 BST 2009


Hi Rayson,

On Thu, 2009-06-11 at 02:10 -0500, rayson wrote:
> Hi,
> 
> So what is the exact state of the master when this happens?? Is the
> machine up but the qmaster process dead??

The host is still up, but I can no longer log in. The qmaster process
still seems to be running, and its TCP socket is still reachable:

[hpbl1] ~ # telnet lolek-vm1 sge_qmaster
Trying 141.34.32.95...
Connected to lolek-vm1.
Escape character is '^]'.
^]
telnet> quit
[hpbl1] ~ # getent services sge_qmaster
sge_qmaster           538/tcp
[hpbl1] ~ # qping -info lolek-vm1 538 qmaster 1
endpoint lolek-vm1.ifh.de/qmaster/1 at port 538: can't find connection

Any communication simply hangs.
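
A possible explanation for why telnet connects while qping fails: the
kernel completes the TCP handshake for any listening socket, even when
the owning process never gets around to calling accept(). The following
is only a minimal Python sketch of that effect and assumes nothing about
SGE internals; the helper name tcp_connect_ok and the 3-second timeout
are made up for illustration, and a local listener that never accepts
stands in for a hung daemon.

```python
import socket


def tcp_connect_ok(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    This only shows that the kernel on the remote side completed the
    handshake. It says nothing about whether the application behind
    the socket is still servicing requests.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Stand-in for a hung daemon: a socket that listens but never accepts.
hung = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
hung.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
hung.listen(1)
hung_port = hung.getsockname()[1]

# The connect still succeeds because the kernel queues it in the backlog.
print(tcp_connect_ok("127.0.0.1", hung_port))

hung.close()
```

So a successful connect, as with telnet above, tells you the port is
bound and the host's network stack is alive. Only an application-level
check such as qping shows whether the daemon itself still answers.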

Cheers,
Andreas

-- 
| Andreas Haupt             | E-Mail: andreas.haupt at desy.de
|  DESY Zeuthen             | WWW:    http://www-zeuthen.desy.de/~ahaupt
|  Platanenallee 6          | Phone:  +49/33762/7-7359
|  D-15738 Zeuthen          | Fax:    +49/33762/7-7216

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=201516
