[GE users] Memory Leak in schedd 6.1u4?

Andreas.Haas at Sun.COM Andreas.Haas at Sun.COM
Mon Apr 14 09:19:28 BST 2008


Have you tried using coreadm(1)

    http://docs.sun.com/app/docs/doc/805-7229/6j6q8svhr?l=en&a=view&q=setuid

to ensure you get a core file?

Note that, as under Linux, Solaris does not write core files for processes
like sge_schedd that change their uid. So core dumps for such processes
have to be enabled manually.
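
For the Linux side of that comparison, here is a minimal sketch of how the
two usual reasons for a missing core file could be checked. It is only an
illustration: the function names are made up for this example, the prctl
call is Linux-specific, and it inspects the process that runs it, not an
already running sge_schedd.

```python
import ctypes
import resource

PR_GET_DUMPABLE = 3  # constant from <linux/prctl.h>


def core_limit():
    """Return the soft RLIMIT_CORE value (0 means no core file is written)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return soft


def dumpable_flag():
    """Return the calling process's Linux "dumpable" flag via prctl(2).

    The kernel resets this flag to the value of fs.suid_dumpable
    (0 by default) when a process changes its effective uid.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    value = libc.prctl(PR_GET_DUMPABLE, 0, 0, 0, 0)
    if value < 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_GET_DUMPABLE) failed")
    return value


def core_dump_blockers(soft_limit, dumpable):
    """List the reasons (possibly none) why no core file would be written."""
    reasons = []
    if soft_limit == 0:
        reasons.append("RLIMIT_CORE is 0")
    if dumpable == 0:
        reasons.append("process is not dumpable (uid change)")
    return reasons
```

Calling core_dump_blockers(core_limit(), dumpable_flag()) from within a
process returns an empty list when neither condition stands in the way of a
core file.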

Regards,
Andreas

On Sun, 13 Apr 2008, Mulley, Nikhil wrote:

> I encounter the schedd crash on the sol-amd64 platform.
>
> -----Original Message-----
> From: Andreas.Haas at Sun.COM [mailto:Andreas.Haas at Sun.COM]
> Sent: Friday, April 11, 2008 9:03 PM
> To: users at gridengine.sunsource.net
> Subject: RE: [GE users] Memory Leak in schedd 6.1u4?
>
> Hi Mulley,
>
> since you encounter schedd crashes under Linux, I would assume that
> applying libcore.so will lead you to a result more quickly
>
>    http://gridengine.sunsource.net/issues/show_bug.cgi?id=2552
>
> Besides, I second Roland's advice to switch schedd_job_info off.
>
> Best regards,
> Andreas
>
> On Wed, 9 Apr 2008, Mulley, Nikhil wrote:
>
>> Thanks Roland. I have changed the scheduler parameter 'schedd_job_info'
>> now and will observe if the scheduler crashes again.
>>
>> Nikhil
>>
>> -----Original Message-----
>> From: Roland.Dittel at Sun.COM [mailto:Roland.Dittel at Sun.COM]
>> Sent: Wednesday, April 09, 2008 9:10 PM
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] Memory Leak in schedd 6.1u4?
>>
>> Hi Nikhil,
>>
>> Mulley, Nikhil wrote:
>>> Roland, I am wondering whether this parameter change in the scheduler
>>> configuration requires a scheduler (qmaster+scheduler) restart?
>>
>> No, a restart is not necessary. The change gets propagated to the
>> scheduler and has immediate effect.
>>
>> Best regards
>> Roland
>>
>>>
>>> -----Original Message-----
>>> From: Roland.Dittel at Sun.COM [mailto:Roland.Dittel at Sun.COM]
>>> Sent: Wednesday, April 09, 2008 7:26 PM
>>> To: users at gridengine.sunsource.net
>>> Subject: Re: [GE users] Memory Leak in schedd 6.1u4?
>>>
>>> Nikhil,
>>>
>>> Andreas and I believe the culprit is schedd_job_info in the scheduler
>>> config. Can you please set it to false and give it a try?
>>>
>>> Best regards
>>> Roland
>>>
>>> Mulley, Nikhil wrote:
>>>> Andreas, while you are working on the Linux binary, could you please
>>>> provide me with a sol-amd64 binary with mallinfo (or whatever) logging
>>>> for the Solaris platform?
>>>> I can chip in to provide any other necessary information.
>>>>
>>>> Thanks,
>>>> Nikhil
>>>>
>>>> -----Original Message-----
>>>> From: Andreas.Haas at Sun.COM [mailto:Andreas.Haas at Sun.COM]
>>>> Sent: Monday, April 07, 2008 12:51 PM
>>>> To: users at gridengine.sunsource.net
>>>> Subject: Re: [GE users] Memory Leak in schedd 6.1u4?
>>>>
>>>> Hi Brian,
>>>>
>>>> from your mention of
>>>>
>>>>     http://linux-mm.org/OOM_Killer
>>>>
>>>> I conclude you ran into this under a Linux distribution. If it is
>>>> RHEL4/amd64, I have a ready-built binary that comes with mallinfo(3)
>>>> logging, and I'm confident this will help us find the evildoer.
>>>>
>>>> Regards,
>>>> Andreas
>>>>
>>>>
>>>> On Sun, 6 Apr 2008, Brian Smith wrote:
>>>>
>>>>> Just restarted schedd with valgrind... I should have output the next
>>>>> time the oom_killer kicks in. Are the "retail" binaries built with
>>>>> debugging symbols, or will I need to build my own schedd?
>>>>>
>>>>> -Brian
>>>>>
>>>>> Chris Dagdigian wrote:
>>>>>> Hi Brian,
>>>>>>
>>>>>> People have been reporting schedd leaks in the 6.1u series - some
>>>>>> problems were found and fixed but there are indications on the user
>>>>>> list that the problem still remains. Multiple people have been trying
>>>>>> to track down the cause -- any additional eyeballs and details will
>>>>>> undoubtedly be welcome, especially if you can run under valgrind or
>>>>>> other tools that may help track down the offending code.
>>>>>>
>>>>>> -Chris
>>>>>>
>>>>>>
>>>>>> On Apr 6, 2008, at 1:11 PM, Brian Smith wrote:
>>>>>>> Has anyone else noticed a memory leak with 6.1u4? oom-killer is
>>>>>>> stopping my sge_schedd because it's wolfing down gobs of RAM. The box
>>>>>>> I'm running on has 4GB and is used only for NIS, qmaster/schedd, and
>>>>>>> managing and provisioning the cluster. I'll post more details if
>>>>>>> desired.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> -Brian
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>>>
>>>> http://gridengine.info/
>>>>
>>>> Registered office: Sun Microsystems GmbH, Sonnenallee 1, D-85551
>>>> Kirchheim-Heimstetten
>>>> Local Court of Munich (commercial register): HRB 161028
>>>> Managing Directors: Thomas Schroeder, Wolfgang Engels, Dr. Roland Boemer
>>>> Chairman of the Supervisory Board: Martin Haering
>>
>>
>> --
>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> Roland Dittel               Tel: +49 (0)941 3075-275 (x60275)
>> Software Engineering        Fax: +49 (0)941 3075-222 (x60222)
>> Sun Microsystems GmbH
>> Dr.-Leo-Ritter-Str. 7       mailto:roland.dittel at sun.com
>> D-93049 Regensburg          http://www.sun.com/gridware
>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> Registered Office / Sitz der Gesellschaft:
>>   Sun Microsystems GmbH
>>   Sonnenallee 1
>>   D-85551 Kirchheim-Heimstetten
>>   Germany
>> Commercial register of the Local Court of Munich /
>> Handelsregistereintrag Amtsgericht Muenchen:
>>   HRB 161028
>> Managing Directors / Geschaeftsfuehrer:
>>   Thomas Schroeder, Wolfgang Engels, Dr. Roland Boemer
>> Chairman of the Supervisory Board / Vorsitzender des Aufsichtsrates
>>   Martin Haering
>>
>
