[GE users] Memory leak in 6.1u2 ? (plus weekly build binaries)

Andreas.Haas at Sun.COM Andreas.Haas at Sun.COM
Wed Jan 30 09:04:28 GMT 2008


Hi Henk,

On Tue, 29 Jan 2008, SLIM H.A. wrote:

> Hi Andreas
>
> Last Sunday night the scheduler consumed all available memory within a couple of 
> hours and we had to restart it. We run 6.1u2 and it seems similar to the problem 
> reported here.
> Is this supposed to be fixed in 6.1u3, issue
> 2187     6562190   memory leak in sge_schedd in
>
> http://gridengine.sunsource.net/project/gridengine/61patches.txt
>
> Is it worthwhile to upgrade?

Upgrading from 6.1u2 to 6.1u3 won't bring any improvement, since u2
already contains the first part of the fix. For the second part
you will need 6.1u4, which is not yet released.

If you want the issue resolved quickly, you can either build a
scheduler binary from the latest V61_BRANCH sources, or join forces with
other community members and petition for the weekly builds of the most
recent V61_BRANCH sources to be made available for download to everyone.
The binaries of such a build would not go through any significant QA,
but check-ins into V61_BRANCH are handpicked anyway and are watched by
many critical eyes after peer review.
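
Until a fixed binary is in place, it may help to watch the scheduler's
resident memory so that it can be restarted before the host runs out. The
following is only a minimal sketch, assuming a Linux host with /proc; the
function names and the limit are illustrative and not part of Grid Engine:

```python
import re


def parse_rss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kib(pid):
    """Read the resident set size of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_rss_kib(handle.read())


def exceeds_limit(rss_kib, limit_kib):
    """True if a known RSS value is above the chosen limit."""
    return rss_kib is not None and rss_kib > limit_kib
```

Called periodically with the PID of sge_schedd, exceeds_limit(read_rss_kib(pid),
limit_kib) tells you when to step in; how to restart the scheduler is left
to the administrator.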

Regards,
Andreas

>
> Thanks
>
> Henk
>
>> -----Original Message-----
>> From: Andreas.Haas at Sun.COM [mailto:Andreas.Haas at Sun.COM]
>> Sent: 15 January 2008 18:10
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] Memory leak in 6.1u2 ?
>>
>> Hi Richard,
>>
>> actually when I tested valgrind (version 3.2.3) with schedd I
>> could stop it with a simple qconf -ks.
>>
>> Could you file an issue to
>>
>>     http://gridengine.sunsource.net/servlets/ProjectIssues
>>
>> and describe in detail the setup and the case where you
>> observe the leak?
>> Please add sample configurations of queues, hosts, PE,
>> scheduler configuration and also the accounting.
>>
>> Regards,
>> Andreas
>>
>> On Tue, 15 Jan 2008, Richard Ems wrote:
>>
>>> Hi list,
>>>
>>> the memory leak is there again, and now I'm trying to use valgrind,
>>> but without success. 8( The problem seems to be that I cannot make
>>> sge_schedd end without killing it with SIGKILL (kill -9 ...);
>>> SIGTERM and qconf -ks don't stop the scheduler.
>>> It is eating up over 3 GB of memory on this 4 GB system and
>>> it does not end.
>>>
>>> After killing it with "kill -9", I get no more output than
>>>
>>> =============================================================
>>> # cat valgrind-sge_schedd-debug.out
>>> ==2897== Memcheck, a memory error detector.
>>> ==2897== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
>>> ==2897== Using LibVEX rev 1732, a library for dynamic binary translation.
>>> ==2897== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
>>> ==2897== Using valgrind-3.2.3, a dynamic binary instrumentation framework.
>>> ==2897== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
>>> ==2897== For more details, rerun with: -v
>>> ==2897== starting up GE 6.1u2 (lx24-amd64)
>>> =============================================================
>>>
>>> in the valgrind output file, since I am killing valgrind!
>>> Is there another way to stop the scheduler without having to
>>> kill valgrind?
>>>
>>> If I don't kill it, at some point valgrind ends itself with
>>>
>>> =============================================================
>>> Valgrind's memory management: out of memory:
>>>   newSuperblock's request for 1048576 bytes failed.
>>>   5026193408 bytes have already been allocated.
>>> Valgrind cannot continue.  Sorry.
>>> =============================================================
>>>
>>> See appended files from 2 valgrind runs.
>>>
>>> regards, Richard
>>>
>>>
>>> --
>>> Richard Ems       mail: Richard.Ems at Cape-Horn-Eng.com
>>>
>>> Cape Horn Engineering S.L.
>>> C/ Dr. J.J. Dómine 1, 5º piso
>>> 46011 Valencia
>>> Tel : +34 96 3242923 / Fax 924
>>>
>>
>> <°)))><
>>
>> http://gridengine.info/
>>
>> Sitz der Gesellschaft: Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
>> Amtsgericht Muenchen: HRB 161028
>> Geschaeftsfuehrer: Thomas Schroeder, Wolfgang Engels, Dr. Roland Boemer
>> Vorsitzender des Aufsichtsrates: Martin Haering
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>







