[GE users] qmaster dying again....

Andreas.Haas at Sun.COM
Wed Jul 18 14:27:59 BST 2007


Hi Iwona,

On Tue, 17 Jul 2007, Iwona Sakrejda wrote:

> Hi,
>
> A few days ago I upgraded from 6.0u4 to 6.0u11 and this morning my qmaster 
> started dying.

Did you follow the patch installation procedure described here?

    http://gridengine.sunsource.net/install60patch.txt


> When I look at the logs I see messages:
>
> 07/17/2007 10:37:24|qmaster|pc2533|I|qmaster hard descriptor limit is set to 8192
> 07/17/2007 10:37:24|qmaster|pc2533|I|qmaster soft descriptor limit is set to 8192
> 07/17/2007 10:37:24|qmaster|pc2533|I|qmaster will use max. 8172 file descriptors for communication
> 07/17/2007 10:37:24|qmaster|pc2533|I|qmaster will accept max. 99 dynamic event clients

That is fine. It says qmaster has enough file descriptors available.
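
Since these are all informational entries, it may help to look only at 
the entries that are not. Below is a minimal Python sketch for that. It 
assumes every entry has the five '|'-separated fields seen in the lines 
quoted above (timestamp, daemon, host, level, message) and that "I" marks 
an informational entry; that layout is inferred from the quoted lines 
only. The names LogEntry, parse_line and non_info_entries are 
hypothetical, and the last sample entry is made up purely to show the 
filter.

```python
from typing import Iterable, List, NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: str
    daemon: str
    host: str
    level: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    # Split into at most five fields so a '|' inside the message survives.
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5:
        return None  # not an entry in the assumed five-field layout
    return LogEntry(*parts)


def non_info_entries(lines: Iterable[str]) -> List[LogEntry]:
    # Keep every parsed entry whose level is anything other than "I".
    entries = (parse_line(line) for line in lines)
    return [e for e in entries if e is not None and e.level != "I"]


sample = [
    "07/17/2007 10:37:24|qmaster|pc2533|I|qmaster hard descriptor limit is set to 8192",
    "07/17/2007 10:37:24|qmaster|pc2533|I|qmaster will accept max. 99 dynamic event clients",
    # Hypothetical entry, invented only to show the filter; not from a real log.
    "07/17/2007 10:40:00|qmaster|pc2533|E|hypothetical error entry for illustration",
]

for entry in non_info_entries(sample):
    print(entry.timestamp, entry.level, entry.message)
```

To run it against a real messages file, pass an open file object to 
non_info_entries() instead of the sample list.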

> Other than that nothing special.
>
> Also when I restart the qmaster I get messages:
> [root@pc2533 qmaster]# /etc/rc.d/init.d/sgemaster start
>  starting sge_qmaster
>  starting sge_schedd
> daemonize error: timeout while waiting for daemonize state

That means the scheduler is having a problem during start-up. The 
message alone does not say what is causing it, but it could be 
because qmaster itself is having problems.

>  starting sge_shadowd
> error: getting configuration: failed receiving gdi request

That is another indication of a crashed or unhealthy qmaster.

>  starting up GE 6.0u11 (lx24-x86)
>
> How bad is any of that, could crashes be related to it?

Very likely.

> I am running on RHEL3 .

Have you tried some other OS?

Regards,
Andreas
