[GE users] Sort by sequence number question
erik.lonroth at scania.com
Tue Jul 17 08:37:02 BST 2007
Just when I started to get over this and return to my life, this issue
arises again. I didn't manage to solve this and I experience almost the
same as you do.
It seems the scheduler also sorts queue instances alphabetically/numerically
within a cluster queue.
No matter how I modify the sequence number, my node: ts101-1-0 is
selected before ts101-1-1. The only way I could stop it was to remove
the PE resources from the specific nodes.
My goal is to get it like this. [<seqno>,<node>]
master.q - [99,ts101-1-0]
short.q - [1,ts101-1-0 0,ts101-1-1]
This way the node ts101-1-1 would be filled up before ts101-1-0. That
never happens: ts101-1-0 always fills up first (unless I use
$round_robin, which is wrong for my application).
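
To make the mismatch concrete, here is a minimal Python sketch of the short.q example above. It is only a toy model, not scheduler code: the tuple layout and the function names by_seqno and by_hostname are made up for illustration, and the hostname sort is just my guess at what the scheduler appears to do.

```python
# Toy model of the short.q example; not SGE code.
# Each queue instance is (seqno, host), as in the [<seqno>,<node>] notation.
short_q = [(1, "ts101-1-0"), (0, "ts101-1-1")]


def by_seqno(instances):
    """Order expected from a seqno-based sort: lowest seqno first."""
    return [host for _seqno, host in sorted(instances)]


def by_hostname(instances):
    """Order that seems to be applied instead: plain hostname sort (assumption)."""
    return sorted(host for _seqno, host in instances)


print(by_seqno(short_q))     # ['ts101-1-1', 'ts101-1-0']
print(by_hostname(short_q))  # ['ts101-1-0', 'ts101-1-1']
```

The first list is the fill order I am after; the second is what I actually see.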
I have banged my head into this so much and I KNOW there has to be
something wrong, somehow, somewhere...
I hope someone with more experience and knowledge will crack the nut.
On Mon, 2007-07-16 at 14:31 -0300, Paul MacInnis wrote:
> On Wed, 11 Jul 2007, Paul MacInnis wrote:
> > On Wed, 11 Jul 2007, Lönroth Erik wrote:
> > > yes, it is set. Still no luck on this.
> > >
> > > The only way I can force the damn slaves off the MASTER node is to remove the requested "PE" explicitly from the nodes. This is not what I want, but I just can't make it happen any other way. It simply ignores my sequence number altogether. I have recreated all queues and restarted the qmaster and scheduler, but no luck whatsoever.
> > >
> > > Is there something else that influences how the "sequence number" is applied, outside of the general queue configuration and the cluster config?
> > I would like to add our experience to this discussion.
> > We recently switched from SGE5.3 to 6.1. We have 1G, 2G and 4G nodes in
> > our cluster. If a job doesn't specify special memory requirements
> > we want it scheduled to the smallest memory machine available.
> > For 4 years with SGE5.3 this worked well. We assigned sequence number
> > 1965 to the 1G nodes, 2965 to the 2G nodes and 4965 to the 4G nodes.
> > With SGE6.1 we defined our queues with
> > seq_no 1965,[@2g.hg=2965],[@4g.hg=4965]
> > @2g.hg being the 2G host group nodes and @4g.hg being the 4G nodes.
> > With qconf -msconf we defined:
> > queue_sort_method seqno
> > qstat -F presents the queues correctly ordered by this seqno. However
> > jobs are being scheduled to 2G and 4G nodes when there are 1G nodes
> > available!
> > This never happened in SGE5.3!
> > It seems that in 6.1 either
> > 1. "queue_sort_method seqno" isn't working for queue selection or
> > 2. there is some other queue selection criterion that overrides
> > "queue_sort_method seqno"
> > Any thoughts?
> > Paul
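
For readers not used to the 6.x host-group syntax, the seq_no line quoted above can be read as "a default value, then per-host-group overrides". The Python sketch below illustrates that reading. It is a simplification under stated assumptions: resolve_seq_no is a hypothetical helper, it only handles host-group entries of the form shown in the quote, and it assumes the first matching override wins.

```python
import re


def resolve_seq_no(spec, host_groups):
    """Return the seq_no for a host, given the set of host groups it belongs to.

    spec has the form "<default>,[<group>=<value>],..." as in the quoted
    queue definition. Assumption: the first matching override wins.
    """
    default, *overrides = spec.split(",")
    for item in overrides:
        match = re.fullmatch(r"\[(.+)=(\d+)\]", item.strip())
        if match and match.group(1) in host_groups:
            return int(match.group(2))
    return int(default)


SPEC = "1965,[@2g.hg=2965],[@4g.hg=4965]"
print(resolve_seq_no(SPEC, set()))        # 1965 (1G nodes, default)
print(resolve_seq_no(SPEC, {"@2g.hg"}))   # 2965
print(resolve_seq_no(SPEC, {"@4g.hg"}))   # 4965
```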
> Here's what seems to be happening.
> For serial jobs we have 2 cluster queues: ser.q bg.q
> ser.q is the main serial queue; bg.q (priority 19) is meant to be used
> only when the load on a node (load_avg and mem_used) is unexpectedly
> light. Generally the same nodes are assigned to each cluster queue.
> seq.q uses seqno 1965, 2965 and 4965 for its 1G, 2G and 4G nodes.
> bg.q uses seqno 2969 and 4969 for its 2G and 4G nodes (no 1G nodes).
> The intention is that when a serial job appears nodes would be considered
> in this order:
> 1G seq.q, 2G seq.q, 2G bg.q, 4G seq.q 4G bg.q
> However what's happening seems to be this order:
> 1G seq.q, 2G bg.q, 4G bg.q, 2G seq.q, 4G seq.q
> It seems that, for scheduling, cluster queues are considered first in
> alphabetical order, and only then, within each cluster queue, are queue
> instances considered in seqno order!
> qstat however presents queue instances as intended - strictly by seqno.
> Is the solution to name our cluster queues to alphabetically match the
> order we wish them considered by the scheduler? Or is there some other
> setting that we've missed?
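
The two orderings Paul describes can be compared with a small Python sketch. It models only the hypothesis stated in the text (cluster queue name first, seqno second), not the actual scheduler; the tuple layout and the function names are made up for illustration, and the queue name seq.q is taken as written in the ordering lists above.

```python
# Toy model of the seq.q / bg.q example; not SGE code.
# Each queue instance is (cluster queue, node class, seqno).
instances = [
    ("seq.q", "1G", 1965),
    ("seq.q", "2G", 2965),
    ("seq.q", "4G", 4965),
    ("bg.q", "2G", 2969),
    ("bg.q", "4G", 4969),
]


def label(inst):
    """Format an instance as '<node class> <cluster queue>'."""
    return f"{inst[1]} {inst[0]}"


def intended_order(insts):
    """Strictly by seqno across all cluster queues."""
    return [label(i) for i in sorted(insts, key=lambda i: i[2])]


def hypothesis_order(insts):
    """Cluster queue name first (alphabetical), then seqno within the queue."""
    return [label(i) for i in sorted(insts, key=lambda i: (i[0], i[2]))]


print(intended_order(instances))
# ['1G seq.q', '2G seq.q', '2G bg.q', '4G seq.q', '4G bg.q']
print(hypothesis_order(instances))
# ['2G bg.q', '4G bg.q', '1G seq.q', '2G seq.q', '4G seq.q']
```

The first result matches the intended order. The second puts every bg.q instance ahead of seq.q, which resembles the reported order but does not reproduce it exactly: in the observation above, 1G seq.q was still considered first, so the hypothesis may be incomplete.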
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net