[GE users] exechost over-subscription

shawn174 shawn.stephens at gmail.com
Fri Jan 8 16:02:34 GMT 2010


We have a problem running multiple queues on our compute nodes: the
machines are being over-subscribed in terms of slots.  I found the
complex_values slots=X setting and applied it on all of our compute
nodes, but we still see over-subscription.

qconf -aattr exechost complex_values slots=4 <hostname>
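
To make it clear how the command above was rolled out across nodes, here is a minimal sketch. It only prints the qconf command for each host (a dry run) so the list can be reviewed before anything is changed. The function name print_slot_caps and the host names are hypothetical, not taken from our actual setup.

```shell
#!/bin/sh
# Dry-run helper: print (do not execute) the qconf command that would set
# the host-level slots cap, one line per host.
# Usage: print_slot_caps SLOTS HOST [HOST...]
print_slot_caps() {
    slots=$1
    shift
    for host in "$@"; do
        echo "qconf -aattr exechost complex_values slots=${slots} ${host}"
    done
}

# Hypothetical host names, for illustration only.
print_slot_caps 4 compute-0-0 compute-0-1
```

After running the printed commands, "qconf -se <hostname>" should show the slots entry under complex_values for that host.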

On each exec host, all.q has 4 slots; in addition, harpertown.q has 4
slots on the Harpertown nodes and nehalem.q has 6 slots on the Nehalem
nodes.  What we're occasionally seeing is 7 or more slots in use on an
exec host, even though complex_values is set to slots=4 or slots=6 as
shown above.
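
The arithmetic behind those numbers can be sketched as follows: the per-queue slot counts on a host add up to more than the host-level cap, so whenever the cap is not being enforced the queues together can hand out more slots than intended. The function names queue_slot_total and check_host are hypothetical helpers for illustration; they do not query Grid Engine.

```shell
#!/bin/sh
# Sum the per-queue slot counts configured on a single host.
# Usage: queue_slot_total SLOTS [SLOTS...]
queue_slot_total() {
    total=0
    for n in "$@"; do
        total=$((total + n))
    done
    echo "$total"
}

# Compare the host-level cap against the combined queue slots.
# Usage: check_host CAP SLOTS [SLOTS...]
check_host() {
    cap=$1
    shift
    total=$(queue_slot_total "$@")
    if [ "$total" -gt "$cap" ]; then
        echo "cap=${cap} queue_total=${total} over-subscription possible"
    else
        echo "cap=${cap} queue_total=${total} within cap"
    fi
}

check_host 4 4 4   # Harpertown node: all.q (4) + harpertown.q (4)
check_host 6 4 6   # Nehalem node:    all.q (4) + nehalem.q (6)
```

With the figures above, the queues offer 8 slots on a Harpertown node and 10 on a Nehalem node, so the 7 slots we have observed fall between the intended cap and what the queues allow in total.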

Is there any other way we can limit the number of slots on a node
without reducing the number of slots assigned by the queues?  We like
the functionality of being able to have a catch-all all.q, while being
able to specifically use nodes assigned to a queue (Harpertown vs.
Nehalem).  We have all of the nodes in all.q, while Harpertown nodes
are in harpertown.q and Nehalem nodes are in nehalem.q.  We're running
SGE 6.2u1 on Rocks 4.3 x86_64 nodes.

Thanks in advance,
Shawn


-- 
Shawn Stephens
