[GE users] Qmaster hangs frequently (v6.2_u2)

parimi Venkateswara.Rao.Parimi at deshaw.com
Wed Jun 24 23:13:38 BST 2009


Qping output is convoluted; I think this is a known issue in v6.2, and
there is already a thread and a bug report about it.

The monitoring time is configured as follows:

$ qconf -sconf | grep -A1 qmaster_params
qmaster_params               ENABLE_FORCED_QDEL=true MONITOR_TIME=00:00:30 \
                             LOG_MONITOR_MESSAGE=false MAX_DYN_EC=200
$
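
To check these settings from a script, the output above can be split into
key/value pairs. A minimal sketch in Python, assuming the `qconf -sconf` text
has already been captured into a string. The function name
`parse_qmaster_params` is hypothetical and not part of Grid Engine, and the
sample string simply repeats the values shown above:

```python
def parse_qmaster_params(conf_text):
    """Return the qmaster_params entries from `qconf -sconf` text as a dict."""
    # Join any line ending in a backslash with the line that follows it.
    joined = []
    pending = ""
    for line in conf_text.splitlines():
        stripped = line.rstrip()
        if stripped.endswith("\\"):
            pending += stripped[:-1] + " "
        else:
            joined.append(pending + stripped)
            pending = ""
    if pending:
        joined.append(pending)

    # Find the qmaster_params line and split each NAME=value token.
    for line in joined:
        fields = line.split()
        if fields and fields[0] == "qmaster_params":
            return dict(f.split("=", 1) for f in fields[1:] if "=" in f)
    return {}


# Sample input copied from the configuration shown above.
sample = (
    "qmaster_params               ENABLE_FORCED_QDEL=true MONITOR_TIME=00:00:30 \\\n"
    "                             LOG_MONITOR_MESSAGE=false MAX_DYN_EC=200\n"
)
params = parse_qmaster_params(sample)
print(params["MONITOR_TIME"])
```

A dict makes it easy to confirm that the monitoring interval really is the
30 seconds intended before looking further at the hang.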

We have an 8-core box with 32 GB of RAM, so resources on the qmaster
aren't a problem. Qmaster was busy doing something, but was not loaded at
all, when it became unresponsive to any client requests.

Any ideas why cleaning a few jobs out of the job spool helps the qmaster
recover? Restarting the qmaster or failing over to a shadow master doesn't
help, though.

Thanks, Parimi V.

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=203403
