[SGE-discuss] Ballooning qmaster memory usage followed by a crash.

Adam Tygart mozes at ksu.edu
Fri May 16 14:50:17 BST 2014

Hello all,

I'm running SoGE 8.1.6 compiled from source. This has been running
fine for a few months; however, overnight the memory usage of the
qmaster ballooned from about 1GB to ~60GB before the qmaster died. I
can restart it, but within 2-3 minutes it happens again.

I've placed debug output from the scheduler, qstat output, and a
sample from perf top here:

We routinely have more jobs in the queue than we currently do, so
I am not sure what could be causing the issue.

Anyone have any thoughts on what is happening? Is there any other
information you would need?

