[GE users] sge queues stomping on each other
emorris at pmc.ucsc.edu
Fri Oct 1 07:46:24 BST 2010
I tried to set up a couple of new queues besides the default 'all.q' on my small cluster. I cloned the default queue in qmon and then made one of the queues subordinate to the other, so that jobs in the lower-priority queue are suspended in favor of jobs in the higher-priority queue when the cluster does not have enough processors to run all the submitted jobs. It's a small group and we just need a simple scheme.
Here's the problem: jobs from one queue now try to run on the same nodes / processors as jobs from the other queue. One compute node ends up loaded with 16 processors' worth of jobs even though it only has 8 processors, while other nodes go totally unused. It looks like one queue isn't 'aware' of the other, so both try to use the same processors, instead of one queue recognizing that the other already has jobs on node X and using node Y.
Does anyone know how to deal with this? This is my first exposure to scheduling, and SGE is a beast to try to understand at first. I'd appreciate any help.
Some people on the Rocks Cluster mailing list recommended I try this:
qconf -mattr exechost complex_values slots=8 compute-0-0
for each node. I did that, but I'm still getting the same problem.
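For reference, a minimal sketch of how that per-node command could be repeated over several hosts. The helper name set_slots and the second host name compute-0-1 are assumptions for illustration only; the sketch just prints the qconf commands so they can be reviewed first, and it only automates the step described above rather than changing what that step does.

```shell
#!/bin/sh
# Hypothetical helper (not part of SGE): print one qconf command per host.
# Usage: set_slots SLOTS HOST [HOST ...]
set_slots() {
    slots=$1
    shift
    for host in "$@"; do
        # Only prints the command; nothing is changed on the cluster here.
        printf '%s\n' "qconf -mattr exechost complex_values slots=$slots $host"
    done
}

# Example with assumed host names; substitute the real node names.
set_slots 8 compute-0-0 compute-0-1
```

To actually apply the settings, the printed lines can be piped to sh on a host with admin rights, for example `set_slots 8 compute-0-0 compute-0-1 | sh`.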
Thanks very much,