[GE users] sge stopped: error: getting configuration:

Patrice Hamelin phamelin at clumeq.mcgill.ca
Tue May 3 14:43:55 BST 2005

It seems that an out-of-memory problem occurred yesterday.
I rebooted the server this morning, but that did not help.

05/02/2005 16:52:13|qmaster|stokes|E|not enough memory to allocate 888602 bytes in init_packbuffer
05/02/2005 16:52:13|qmaster|stokes|E|not enough memory for packing report: 888594 bytes
05/02/2005 16:52:16|qmaster|stokes|E|acknowledge timeout after 600 seconds for event client (schedd:1) on host "stokes.clumeq.mcgill.ca"

The last entry in the messages file is:

05/03/2005 09:14:40|qmaster|stokes|I|read job database with 15 entries in 0 seconds
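
For anyone wanting to pull only the error entries out of a messages file
like the one quoted above, here is a minimal sketch in Python. It assumes
the "timestamp|daemon|host|level|text" layout seen in these lines and
treats any line without that layout as the continuation of a wrapped
entry. The names Entry, parse_messages and errors_only are made up for
this example and are not part of Grid Engine.

```python
from typing import Iterable, List, NamedTuple


class Entry(NamedTuple):
    timestamp: str
    daemon: str
    host: str
    level: str   # e.g. "E" or "I", as in the lines quoted above
    message: str


def parse_messages(lines: Iterable[str]) -> List[Entry]:
    """Parse 'timestamp|daemon|host|level|text' lines (assumed layout)."""
    entries: List[Entry] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        parts = line.split("|", 4)
        if len(parts) == 5:
            entries.append(Entry(*parts))
        elif entries:
            # No field separators: treat as the wrapped tail of the
            # previous entry and join it back on with a single space.
            last = entries[-1]
            joined = last.message.rstrip() + " " + line.strip()
            entries[-1] = last._replace(message=joined)
    return entries


def errors_only(entries: List[Entry]) -> List[Entry]:
    """Keep only the entries whose level field is 'E'."""
    return [e for e in entries if e.level == "E"]
```

To use it, open the messages file from your own qmaster spool directory,
pass the file object to parse_messages(), and print the result of
errors_only().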

McCalla, Mac wrote:
> Anything in your ....spool/qmaster/messages file?
> mac mccalla 
> -----Original Message-----
> From: Patrice Hamelin [mailto:phamelin at clumeq.mcgill.ca] 
> Sent: Tuesday, May 03, 2005 8:27 AM
> To: users at gridengine.sunsource.net
> Subject: [GE users] sge stopped: error: getting configuration:
> Hi,
>    My qmaster stopped a couple of hours ago and I cannot restart it.
> I always get:
> [root@stokes common]# /etc/init.d/sgemaster start
>     starting sge_qmaster
>     starting sge_schedd
> error: getting configuration: unable to contact qmaster using port 536
> on host "stokes.clumeq.mcgill.ca"
> can't get configuration from qmaster -- waiting ...
>    thanks for help!
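
A quick way to see whether anything is listening on the qmaster port
named in the error above is a plain TCP connect test. The sketch below is
only a first check under that assumption: a successful connect shows that
some process accepts connections on the port, not that qmaster is
healthy. The function name can_connect is made up for this example.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False
```

With the host and port from the error message, the call would be
can_connect("stokes.clumeq.mcgill.ca", 536). If it returns False right
after the init script reports "starting sge_qmaster", the daemon has
most likely exited again, and the messages file is the next place to
look.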

Patrice Hamelin ing, M.Sc.A, CCNA
Systems Administrator
CLUMEQ Supercomputer Centre
McGill University
688 Sherbrooke Street West, Suite 710
Montreal, QC, Canada H3A 2S6
Tel: 514-398-3344
Fax: 514-398-2203
