[SGE-discuss] Ballooning qmaster memory usage followed by a crash.

Adam Tygart mozes at ksu.edu
Fri May 16 15:04:09 BST 2014


That seems to be a workaround. Thanks.

--
Adam

On Fri, May 16, 2014 at 8:56 AM, John Foley <jfoley at motorola.com> wrote:
> Hi Adam - I had the same thing happen to me yesterday (on UGE v8.1.7, but
> I'm guessing it might be the same issue).
>
> After you restart the qmaster, and before it locks up, run qconf -msconf
> and check the line "schedd_job_info". If it's set to "true", change it to
> "false" and restart qmaster.
>
> Give that a try and see if it helps.
>
>    John
>
> On Fri, May 16, 2014 at 8:50 AM, Adam Tygart <mozes at ksu.edu> wrote:
>>
>> Hello all,
>>
>> I'm running SoGE 8.1.6 compiled from source. This has been running
>> fine for a few months; however, overnight the memory usage of the
>> qmaster ballooned from about 1GB to ~60GB before the qmaster died. I
>> can restart it, but within 2-3 minutes it happens again.
>>
>> I've placed debug output from the scheduler, qstat output, and a
>> sample from perf top here:
>> https://people.beocat.cis.ksu.edu/~mozes/sge/
>>
>> We routinely have more jobs in the queue than we currently do, so
>> I am not sure what could be causing the issue.
>>
>> Anyone have any thoughts on what is happening? Is there any other
>> information you would need?
>>
>> Thanks,
>> Adam
>> _______________________________________________
>> SGE-discuss mailing list
>> SGE-discuss at liv.ac.uk
>> https://arc.liv.ac.uk/mailman/listinfo/sge-discuss
>
>
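
For anyone finding this thread later, here is a minimal sketch of checking
the setting John mentions without opening an editor. It assumes qconf is on
your PATH and that "qconf -ssconf" prints the scheduler configuration as
"name value" lines. The helper name get_schedd_job_info is made up for
illustration and is not part of Grid Engine.

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid Engine): reads scheduler
# configuration text on stdin and prints the value of schedd_job_info.
get_schedd_job_info() {
    awk '$1 == "schedd_job_info" { print $2 }'
}

# Only query a live cluster when qconf is actually available.
if command -v qconf >/dev/null 2>&1; then
    value=$(qconf -ssconf | get_schedd_job_info)
    echo "schedd_job_info is currently: ${value:-unknown}"
    if [ "$value" = "true" ]; then
        # The change itself is still made interactively, as in John's reply.
        echo "Consider setting it to false with 'qconf -msconf', then restart qmaster."
    fi
fi
```

The script only reads the configuration. The change itself still goes
through "qconf -msconf", followed by a qmaster restart as described above.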

