[GE users] sge stopped: error: getting configuration:

Joachim Gabler Joachim.Gabler at Sun.COM
Tue May 3 15:02:51 BST 2005


Patrice,

what spooling method are you using (classic / berkeleydb)?
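
One way to check this, as a minimal sketch: in SGE 6.x the spooling method is normally recorded in the cell's bootstrap file. The path $SGE_ROOT/$SGE_CELL/common/bootstrap, the fallback cell name "default", and the helper name spooling_method are assumptions here, not something confirmed in this thread.

```shell
# Print the spooling method (e.g. classic or berkeleydb) found in a
# Grid Engine bootstrap file. The helper name is hypothetical.
# Usage: spooling_method [path-to-bootstrap]
spooling_method() {
    # Assumed default location; pass an explicit path if yours differs.
    bootstrap="${1:-${SGE_ROOT:?SGE_ROOT is not set}/${SGE_CELL:-default}/common/bootstrap}"
    # Print the value of the spooling_method key; fail if the key is absent.
    awk '$1 == "spooling_method" { print $2; found = 1 }
         END { exit !found }' "$bootstrap"
}
```

Calling spooling_method with no argument reads the assumed default path; a non-zero exit status means the key was not found in that file.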

Please try to start the qmaster in debug mode.
In a shell, as user root:
source $SGE_ROOT/util/dl.sh     (or dl.csh if your shell is csh/tcsh)
dl 1
$SGE_ROOT/bin/<arch>/sge_qmaster

This might show some error messages, e.g. when reading jobs from disk.
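
If the debug output scrolls by too quickly, the qmaster messages file may hold the same errors. A minimal sketch, assuming the usual pipe-separated line layout (date|daemon|host|level|text) with E and C marking errors and critical errors; the helper name qmaster_errors is hypothetical, and the location of the messages file depends on your qmaster spool directory.

```shell
# Show the most recent error (E) and critical (C) lines from a qmaster
# messages file. Assumes pipe-separated fields with the level in field 4.
# Usage: qmaster_errors <messages-file> [count]
qmaster_errors() {
    messages="${1:?usage: qmaster_errors <messages-file> [count]}"
    count="${2:-20}"
    # Keep only lines whose level field is E or C, then the last $count of them.
    awk -F'|' '$4 == "E" || $4 == "C"' "$messages" | tail -n "$count"
}
```

For example, qmaster_errors /path/to/qmaster/spool/messages 50 would print up to the last 50 such lines (the path shown is a placeholder).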

   Joachim

Daniel Templeton wrote:

> Looks to me like running out of memory caused the qmaster to leave the 
> cluster in a broken state.  You'll need to clean up whatever the 
> qmaster left behind.  That may involve using utilbin/spooledit or 
> deleting jobs from the spool directory.  Unfortunately, you'll need 
> the advice of someone who actually recovers broken clusters instead of 
> just reinstalling them like I do.  Joachim?  Stephan?  Omar?
>
> Daniel
>
> Patrice Hamelin wrote:
>
>> After running sgemaster, qmaster is NOT running.  I run SGE 6.0u1 on
>> RedHat Linux 7.3.  See my other message: I had a memory problem, which
>> I think caused the problem yesterday.
>>
>> Thanks for the help, guys!
>>
>> Daniel Templeton wrote:
>>
>>> After running sgemaster, is your qmaster running?  What platform and 
>>> SGE version?  Was there an event which caused the qmaster to stop?
>>>
>>> Daniel
>>>
>>> Patrice Hamelin wrote:
>>>
>>>> Hi,
>>>>
>>>>   My qmaster stopped a couple of hours ago and I cannot restart it.
>>>> I always get:
>>>>
>>>> [root@stokes common]#  /etc/init.d/sgemaster start
>>>>    starting sge_qmaster
>>>>    starting sge_schedd
>>>> error: getting configuration: unable to contact qmaster using port 536
>>>> on host "stokes.clumeq.mcgill.ca"
>>>> can't get configuration from qmaster -- waiting ...
>>>>
>>>>
>>>>   Thanks for your help!
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>
>>
>
