[SGE-discuss] Ballooning qmaster memory usage followed by a crash.

Adam Tygart mozes at ksu.edu
Fri May 16 14:50:17 BST 2014


Hello all,

I'm running SoGE 8.1.6 compiled from source. It had been running
fine for a few months, but overnight the memory usage of the
qmaster ballooned from about 1GB to ~60GB before the qmaster died. I
can restart it, but within 2-3 minutes it happens again.

I've placed debug output from the scheduler, qstat output, and a
sample from perf top here:
https://people.beocat.cis.ksu.edu/~mozes/sge/

We routinely have more jobs in the queue than we currently do, so
I am not sure what could be causing the issue.

Does anyone have any thoughts on what is happening? Is there any other
information you would need?

Thanks,
Adam

