[SGE-discuss] Ballooning qmaster memory usage followed by a crash.
mozes at ksu.edu
Fri May 16 14:50:17 BST 2014
I'm running SoGE 8.1.6 compiled from source. This has been running
fine for a few months; however, overnight the memory usage of the
qmaster ballooned from about 1GB to ~60GB before the qmaster died. I
can restart it, but within 2-3 minutes it happens again.
I've placed debug output from the scheduler and a qstat output, and a
sample from perf top here:
We routinely have more jobs in the queue than we currently do, so
I am not sure what could be causing the issue.
Anyone have any thoughts on what is happening? Is there any other
information you would need?