[GE users] Queue sorting considerations

brs brs at usf.edu
Wed Sep 23 21:24:04 BST 2009


I'm building out a multi-tiered fat-tree InfiniBand (IB) topology in which a number of 24-port switches connect to a larger core switch.  Each 24-port switch will have 18 nodes attached and 6 inter-switch links (ISLs) going to the core.

I'd like to schedule jobs so that they favor contiguous blocks of nodes on the same switch, to reduce the effects of blocking and inter-switch latency on our jobs.

A simple example follows: Assume that q1 and q2 each have 10 nodes, each node with 8 slots (80 slots per queue).

1. If I submit a 60-slot job to an empty cluster, it should run only on nodes associated with q1.
2. Afterward, I submit a 40-slot job.  q1 has only 20 slots free, so it should run only on nodes associated with q2.
3. If I submit another 60-slot job, neither queue can hold it alone, so it should run on both q1 (20 slots) and q2 (40 slots), filling them both to capacity.

In this scenario, the job prefers queues that can fully satisfy its resource requests.  If no queues are found that can fully satisfy the request, we can then span the job across multiple queues.
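
To make the intended behavior concrete, here is a minimal Python sketch of the placement rule described above. It is not Grid Engine code: the function names place_job and run_scenario are made up for illustration, and choosing the first queue that fits (rather than, say, the tightest fit) is my assumption.

```python
def place_job(free, req_slots):
    """Return {queue: slots} for a job, or None if it cannot start now.

    free: dict mapping queue name -> free slots, in preference order
    (hypothetical structure, not a Grid Engine API).
    """
    # First choice: a single queue that can hold the whole job.
    for queue, avail in free.items():
        if avail >= req_slots:
            return {queue: req_slots}
    # Fallback: span queues, but only if enough slots are free overall.
    if sum(free.values()) < req_slots:
        return None
    placement, remaining = {}, req_slots
    for queue, avail in free.items():
        take = min(avail, remaining)
        if take > 0:
            placement[queue] = take
            remaining -= take
    return placement


def run_scenario():
    """Replay the three submissions from the example on an empty cluster."""
    free = {"q1": 80, "q2": 80}  # 10 nodes x 8 slots per queue
    results = []
    for req in (60, 40, 60):
        placement = place_job(free, req)
        for queue, used in placement.items():
            free[queue] -= used
        results.append(placement)
    return results, free
```

Calling run_scenario() gives the placements {q1: 60}, {q2: 40}, and {q1: 20, q2: 40}, with no free slots left on either queue, which matches steps 1 through 3.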

A load_formula that looks like this should be able to do it:

((queue_free_slots/req_slots)*<big_weight>) + ((other load considerations)*<little_weight>)

Now, would these two values actually be available for such a purpose (or can they be made available)?

queue_free_slots: total free slots for a given queue (not queue instance)
req_slots: slots requested by a job
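
As a rough illustration of how the first term of that formula behaves, the Python sketch below evaluates it per queue. The names slot_fit_term and fit_table and the default weight of 1000 are placeholders of mine, and the sketch assumes both values are somehow known; it says nothing about whether the scheduler can supply them or in which direction it would sort the result.

```python
def slot_fit_term(queue_free_slots, req_slots, big_weight=1000.0):
    """First term of the proposed formula: (queue_free_slots/req_slots)*big_weight."""
    return (queue_free_slots / req_slots) * big_weight


def fit_table(free, req_slots, big_weight=1000.0):
    """Evaluate the term for every queue in a {queue: free slots} dict."""
    return {q: slot_fit_term(n, req_slots, big_weight) for q, n in free.items()}
```

For the 40-slot job in step 2, fit_table({"q1": 20, "q2": 80}, 40) yields 500.0 for q1 and 2000.0 for q2.  A value at or above big_weight means the queue can hold the whole job on its own.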

Is this even possible?  Is there a better way?

Thanks,

Brian Smith
Sr. HPC Systems Administrator
IT Research Computing, University of South Florida
4202 E. Fowler Ave. ENB308 
Office Phone: +1 813 974-1467
Organization URL: http://rc.usf.edu

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=218734
