[GE users] high CPU load for sge_qmaster
agrajag at dragaera.net
Fri Apr 29 17:26:42 BST 2005
Stephan Grell - Sun Germany - SSG - Software Engineer wrote:
> You are right, there are no jobs in the system. Could you monitor the
> qping output? Is the MT: always that low?
> If there is nothing to do, I would expect higher times than 0.4.
> When the system is idle, as yours is, the numbers should be similar to:
> EDT:R(x) ~0.9
> TET:R(x) > 1
> MT:R(x) > 1
> Do you know what triggers this behavior?
> What operating system are you using?
I ran qping with '-i 10 -f' for a while. EDT seemed to bounce around,
always > 0.00 and < 1.00. TET bounced around, just as likely to be
above 1 as below it. And MT stayed at 0.04 the whole time. This system
is running CentOS 3, which is essentially RHEL3.
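A minimal sketch for summarizing readings like these from saved qping output, so the two clusters can be compared on ranges rather than by eye. It assumes the thread entries look roughly like `EDT: R (0.25) | TET: R (1.25) | MT: R (0.04)`. That layout, the `summarize` helper, and the sample lines are illustrative assumptions, not output captured from either cluster.

```python
import re
from statistics import mean

# Assumed layout of one thread entry, e.g. "MT: R (0.04)"; spacing is optional.
ENTRY = re.compile(r"\b(EDT|TET|MT)\s*:\s*R\s*\(\s*(\d+(?:\.\d+)?)\s*\)")


def summarize(text):
    """Return {thread: (min, max, mean)} for every EDT/TET/MT value in text."""
    samples = {}
    for name, value in ENTRY.findall(text):
        samples.setdefault(name, []).append(float(value))
    return {name: (min(vals), max(vals), mean(vals))
            for name, vals in samples.items()}


if __name__ == "__main__":
    # Made-up sample lines for illustration only.
    sample = (
        "info: EDT: R (0.25) | TET: R (1.25) | MT: R (0.04) | ok\n"
        "info: EDT: R (0.75) | TET: R (0.50) | MT: R (0.04) | ok\n"
    )
    for name, (lo, hi, avg) in summarize(sample).items():
        print(f"{name}: min={lo:.2f} max={hi:.2f} mean={avg:.2f}")
```

An MT whose minimum and maximum are identical over a long capture would correspond to the "stayed at 0.04 the whole time" observation, while a healthy idle system would be expected to show values above 1.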
I have a much smaller test cluster running the same OS and the same SGE
binaries. I spent a good amount of time trying to reproduce the problem
there, but never managed to. I've tried all the configuration options I
could think of. The same qping command on that box showed an EDT similar
to my big cluster's. The TET bounced around a bit, but was almost always
above 1. The MT bounced around as well, but mostly stayed under 1.
It looks like some jobs did exit on my big cluster while I was doing
this. I know for certain that no jobs were submitted or even running
on my test cluster during this.
I really have no idea what triggers this. My big cluster has been in
this state for most of a month. I tried to restart sge_qmaster a couple
of times to see if it would go away, but that never worked.