[SGE-discuss] Ballooning qmaster memory usage followed by a crash.

John Foley jfoley at motorola.com
Fri May 16 14:56:24 BST 2014


Hi Adam - I had the same thing happen to me yesterday (on UGE v8.1.7, but
I'm guessing it might be the same issue).

After you restart the qmaster, and before it locks up, run qconf -msconf
and check the line "schedd_job_info". If it's set to "true", change it to
"false" and restart the qmaster.

Give that a try and see if it helps.

   John





On Fri, May 16, 2014 at 8:50 AM, Adam Tygart <mozes at ksu.edu> wrote:

> Hello all,
>
> I'm running SoGE 8.1.6 compiled from source. This has been running
> fine for a few months; however, overnight the memory usage of the
> qmaster ballooned from about 1GB to ~60GB before the qmaster died. I
> can restart it, but within 2-3 minutes it happens again.
>
> I've placed debug output from the scheduler and a qstat output, and a
> sample from perf top here:
> https://people.beocat.cis.ksu.edu/~mozes/sge/
>
> We routinely have more jobs in the queue than we currently do, so
> I am not sure what could be causing the issue.
>
> Anyone have any thoughts on what is happening? Is there any other
> information you would need?
>
> Thanks,
> Adam