[GE users] problem with job distributions

mhanby mhanby at uab.edu
Tue Mar 10 19:54:14 GMT 2009

What does this show:

qstat -g c

-----Original Message-----
From: mad [mailto:margaret_Doll at brown.edu] 
Sent: Tuesday, March 10, 2009 2:36 PM
To: users at gridengine.sunsource.net
Subject: [GE users] problem with job distributions

I have compute nodes, each of which has eight processors.  I have
assigned eight compute nodes to one of my queues.  The compute nodes
are listed as groupa, which is on the hostlist of the group-a queue.  In
the General Configuration for the group-a queue, I have slots listed as
8.  When I look at the Cluster Queues, queue group-a has 64 total
slots.  Currently 52 slots are shown as being used in qmon.

However, when I execute "qstat -f | grep group-a", I get

group-a at compute-0-0.local       BIP   8/8       8.08     lx26-amd64
group-a at compute-0-1.local       BIP   8/8       8.06     lx26-amd64
group-a at compute-0-10.local      BIP   8/8       11.12    lx26-amd64
group-a at compute-0-11.local      BIP   2/8       10.22    lx26-amd64
group-a at compute-0-12.local      BIP   6/8       7.16     lx26-amd64
group-a at compute-0-13.local      BIP   4/8       4.78     lx26-amd64
group-a at compute-0-2.local       BIP   8/8       8.12     lx26-amd64
group-a at compute-0-3.local       BIP   8/8       15.13    lx26-amd64    a

The total number of slots being used is 52, which agrees with qmon.
However, the load shows 59 jobs.
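
The totals above can be re-checked with a short script. This is only a
minimal sketch, not part of Grid Engine: the function name summarize and
the SAMPLE list are made up for illustration, and it assumes the column
layout shown above (a used/total slots field followed directly by
load_avg). On the eight lines pasted above it gives 52 used slots out of
64 and a summed load_avg of about 72.67, which is higher than the 59
quoted, so that figure may have come from a different snapshot.

```python
import re

# Matches the "used/total" slots column, e.g. "8/8" or "2/8".
SLOTS = re.compile(r"^(\d+)/(\d+)$")

# The queue-instance lines exactly as pasted in this message
# (the list archive has rewritten "@" as " at ").
SAMPLE = [
    "group-a at compute-0-0.local       BIP   8/8       8.08     lx26-amd64",
    "group-a at compute-0-1.local       BIP   8/8       8.06     lx26-amd64",
    "group-a at compute-0-10.local      BIP   8/8       11.12    lx26-amd64",
    "group-a at compute-0-11.local      BIP   2/8       10.22    lx26-amd64",
    "group-a at compute-0-12.local      BIP   6/8       7.16     lx26-amd64",
    "group-a at compute-0-13.local      BIP   4/8       4.78     lx26-amd64",
    "group-a at compute-0-2.local       BIP   8/8       8.12     lx26-amd64",
    "group-a at compute-0-3.local       BIP   8/8       15.13    lx26-amd64    a",
]


def summarize(lines):
    """Return (used_slots, total_slots, summed_load) for queue-instance lines."""
    used = total = 0
    load = 0.0
    for line in lines:
        fields = line.split()
        for i, field in enumerate(fields):
            match = SLOTS.match(field)
            if match is None:
                continue
            used += int(match.group(1))
            total += int(match.group(2))
            # Assumption: load_avg is the field right after the slots column.
            if i + 1 < len(fields):
                try:
                    load += float(fields[i + 1])
                except ValueError:
                    pass  # non-numeric load (e.g. "-NA-") is skipped
            break  # only the first slots-like field on a line counts
    return used, total, round(load, 2)


print(summarize(SAMPLE))  # prints (52, 64, 72.67)
```

A load well above the number of used slots on a node (15.13 against 8 on
compute-0-3) would be consistent with more processes running there than
slots were granted.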

If I ssh into compute-0-3, I see 15 jobs being run by one user.
All of them except one are using 50% of a CPU.

My users say they are using variations of

qsub -pe queue-a 20 scriptp

Why would the distribution of jobs be so out of whack?  I have been
running this cluster with this version of the system for about six
months now.  The only time the distribution was uneven was before
one of my users learned to use qsub properly.

Running ROCKS 5.3 with Red Hat kernel 2.6.18-53.1.14.el5


To unsubscribe from this discussion, e-mail:
[users-unsubscribe at gridengine.sunsource.net].


