[GE users] Sort by sequence number question

Paul MacInnis macinnis at dal.ca
Tue Jul 17 16:04:29 BST 2007


Hi Erik,

On Tue, 17 Jul 2007, Erik Lönroth wrote:

> 
> 
> On tis, 2007-07-17 at 16:12 +0200, Andreas.Haas at Sun.COM wrote:
> > Hi Paul,
> > 
> > On Mon, 16 Jul 2007, Paul MacInnis wrote:
> > 
> > > On Wed, 11 Jul 2007, Paul MacInnis wrote:
> > >
> > >> On Wed, 11 Jul 2007, Lönroth Erik wrote:
> > >>
> > >>> yes, it is set. Still no luck on this.
> > >>>
> > >>> The only way I can force the damn slaves off the MASTER node is to remove the requested "PE" explicitly from those nodes. This is not what I want, but I just can't make it work any other way. It simply ignores my sequence number altogether. I have recreated all queues and restarted the qmaster and scheduler, but no luck whatsoever.
> > >>>
> > >>> Is there something else, outside of the general queue configuration and the cluster config, that affects the "sequence number" behaviour?
> > >>
> > >> I would like to add our experience to this discussion.
> > >>
> > >> We recently switched from SGE5.3 to 6.1.  We have 1G, 2G and 4G nodes in
> > >> our cluster.  If a job doesn't specify special memory requirements,
> > >> we want it scheduled to the smallest-memory machine available.
> > >>
> > >> For 4 years with SGE5.3 this worked well.  We assigned sequence number
> > >> 1965 to the 1G nodes, 2965 to the 2G nodes and 4965 to the 4G nodes.
> > >>
> > >> With SGE6.1 we defined our queues with
> > >> seq_no  1965,[@2g.hg=2965],[@4g.hg=4965]
> > >>
> > >> @2g.hg being the 2G host group nodes and @4g.hg being the 4G nodes.
> > >>
> > >> With qconf -msconf we defined:
> > >> queue_sort_method     seqno
> > >>
> > >> qstat -F presents the queues correctly ordered by this seqno.  However
> > >> jobs are being scheduled to 2G and 4G nodes when there are 1G nodes
> > >> available!
> > >>
> > >> This never happened in SGE5.3!
> > >>
> > >> It seems that in 6.1 either
> > >> 1. "queue_sort_method  seqno" isn't working for queue selection or
> > >> 2. there is some other queue-selection criterion that overrides
> > >>    "queue_sort_method  seqno"
> > >>
> > >> Any thoughts?
> > >>
> > >> Paul
> > >
> > > Here's what seems to be happening.
> > >
> > > For serial jobs we have 2 cluster queues:  ser.q    bg.q
> > >
> > > ser.q is the main serial queue; bg.q (priority 19) is meant to be used
> > > only when the load on a node (load_avg and mem_used) is unexpectedly
> > > light.  Generally the same nodes are assigned to each cluster queue.
> > >
> > > ser.q uses seqno 1965, 2965 and 4965 for its 1G, 2G and 4G nodes.
> > >
> > > bg.q uses seqno 2969 and 4969 for its 2G and 4G nodes (no 1G nodes).
> > >
> > > The intention is that when a serial job appears, nodes would be considered
> > > in this order:
> > >
> > > 1G ser.q,  2G ser.q,  2G bg.q,  4G ser.q,  4G bg.q
> > >
> > > However, what's happening seems to be this order:
> > >
> > > 1G ser.q,  2G bg.q,  4G bg.q,  2G ser.q,  4G ser.q
> > >
> > > It seems that for scheduling, cluster queues are considered first in
> > > alphabetical order, and only then, within each cluster queue, are queue
> > > instances considered in seqno order!
> > >
> > > qstat, however, presents queue instances as intended - strictly by
> > > seqno.
> > >
> > > Is the solution to name our cluster queues so that their alphabetical
> > > order matches the order in which we want the scheduler to consider them?
> > > Or is there some other setting that we've missed?
> > 
> > I cannot reproduce this. Here is my queue set-up:
> > 
> >     > qconf -ssconf | grep sort
> >     queue_sort_method                 seqno
> > 
> >     > qconf -shgrp @oneG
> >     group_name @oneG
> >     hostlist angbor
> > 
> >     > qconf -shgrp @twoG
> >     group_name @twoG
> >     hostlist es-ergb01-01
> > 
> >     > qconf -shgrp @fourG
> >     group_name @fourG
> >     hostlist baumbart
> > 
> >     > qconf -sq test_ser.q | egrep "hostlist|seq|load_thre|slots"
> >     hostlist              @oneG @twoG @fourG
> >     seq_no                0,[@oneG=1965],[@twoG=2965],[@fourG=4965]
> >     load_thresholds       NONE
> >     slots                 1
> > 
> >     > qconf -sq test_bg.q | egrep "hostlist|seq|load_thre|slots"
> >     hostlist              @twoG @fourG
> >     seq_no                0,[@twoG=2969],[@fourG=4969]
> >     load_thresholds       NONE
> >     slots                 1
> > 
> > when I submit
> > 
> >     > qsub -t 1-5 -q 'test_*' -b y /bin/sleep 5
> >     Your job-array 528.1-5:1 ("sleep") has been submitted
> > 
> > I get queues filled in the order of the array task indices
> > 
> >     > qstat -f -q 'test_*'
> >     queuename                      qtype used/tot. load_avg arch          states
> >     ----------------------------------------------------------------------------
> >     test_ser.q at angbor              BIP   1/1       0.04     lx24-x86
> >         528 0.55500 sleep      ah114088     t     07/17/2007 16:02:19     1 1
> >     ----------------------------------------------------------------------------
> >     test_ser.q at es-ergb01-01        BIP   1/1       0.42     sol-sparc64
> >         528 0.55500 sleep      ah114088     t     07/17/2007 16:02:19     1 2
> >     ----------------------------------------------------------------------------
> >     test_bg.q at es-ergb01-01         BIP   1/1       0.42     sol-sparc64
> >         528 0.55500 sleep      ah114088     t     07/17/2007 16:02:19     1 3
> >     ----------------------------------------------------------------------------
> >     test_ser.q at baumbart            BIP   1/1       0.19     irix65
> >         528 0.55500 sleep      ah114088     t     07/17/2007 16:02:19     1 4
> >     ----------------------------------------------------------------------------
> >     test_bg.q at baumbart             BIP   1/1       0.19     irix65
> >         528 0.55500 sleep      ah114088     t     07/17/2007 16:02:19     1 5
> > 
> > and the same is true with plain sequential jobs
> > 
> >     > ntimes 5 qsub -q 'test_*' -b y /bin/sleep 5
> >     Your job 534 ("sleep") has been submitted
> >     Your job 535 ("sleep") has been submitted
> >     Your job 536 ("sleep") has been submitted
> >     Your job 537 ("sleep") has been submitted
> >     Your job 538 ("sleep") has been submitted
> > 
> >     > qstat -f -q 'test_*'
> >     queuename                      qtype used/tot. load_avg arch          states
> >     ----------------------------------------------------------------------------
> >     test_ser.q at angbor              BIP   1/1       0.10     lx24-x86
> >         534 0.55500 sleep      ah114088     r     07/17/2007 16:07:09     1
> >     ----------------------------------------------------------------------------
> >     test_ser.q at es-ergb01-01        BIP   1/1       0.34     sol-sparc64
> >         535 0.55500 sleep      ah114088     r     07/17/2007 16:07:09     1
> >     ----------------------------------------------------------------------------
> >     test_bg.q at es-ergb01-01         BIP   1/1       0.34     sol-sparc64
> >         536 0.55500 sleep      ah114088     t     07/17/2007 16:07:09     1
> >     ----------------------------------------------------------------------------
> >     test_ser.q at baumbart            BIP   1/1       0.19     irix65
> >         537 0.55500 sleep      ah114088     t     07/17/2007 16:07:09     1
> >     ----------------------------------------------------------------------------
> >     test_bg.q at baumbart             BIP   1/1       0.19     irix65
> >         538 0.55500 sleep      ah114088     t     07/17/2007 16:07:09     1
> > 
> > I did this with N1GE 6.1
> > 
> > Could it be that jobs are submitted with the -soft option so as to specify
> > some preference? Or are you using some over-sensitive load thresholds?
> > 
> > Regards,
> > Andreas
> > 
> 
> 
> My corresponding outputs (as far as they apply; I don't have that many host groups):
> 
> 
> # sort method
> qconf -ssconf | grep sort
> queue_sort_method                 seqno
> 
> 
> # my only HG here
> qconf -shgrp @ts101_X_hg
> group_name @ts101_X_hg
> hostlist ts101-1-0.sss.se.scania.com ts101-1-1.sss.se.scania.com
> 
> # show the only master queue 
> qconf -sq master.101.q | egrep "hostlist|seq|load_thre|slots"
> hostlist              ts101-1-0.sss.se.scania.com
> seq_no                1019
> load_thresholds       np_load_avg=1.75
> slots                 1
> 
> # show the "slave"/"short" queue.
> qconf -sq short.101.q | egrep "hostlist|seq|load_thre|slots"
> hostlist              @ts101_X_hg
> seq_no                101,[ts101-1-1.sss.se.scania.com=0], \
> load_thresholds       np_load_avg=1.75
> slots                 4
> 
> (See the "\" character at the end of the "seq_no" line? I don't know if
> that might be a problem.)
> 
> Now, this is the complete job (slot-allocation.job):
> 
> #!/bin/bash
> #$ -S /bin/bash
> #$ -N slot-allocation
> #$ -cwd 
> #$ -o output.$JOB_ID
> #$ -e errors.$JOB_ID
> #$ -pe powerflow_*_pe 5
> #$ -masterq master.*.q
> echo "Starting on: ${HOSTNAME}"
> echo "$PE_HOSTFILE contains:"
> cat $PE_HOSTFILE
> sleep 30
> 
> This is the submit:
> 
> qsub slot-allocation.job
> 
> 
> And - the tragic output from qstat:
> 
> qstat -t
> 
>     272 0.55500 slot-alloc sssler       r     07/17/2007 16:39:01
>         short.101.q at ts101-1-0.sss.se.s SLAVE
>         short.101.q at ts101-1-0.sss.se.s SLAVE
>         short.101.q at ts101-1-0.sss.se.s SLAVE
>         short.101.q at ts101-1-0.sss.se.s SLAVE
>     272 0.55500 slot-alloc sssler       r     07/17/2007 16:39:01
>         master.101.q at ts101-1-0.sss.se. MASTER
> 
> 
> Arghh!
> 
> /Erik

I would claim that this happens because you still call your slave cluster
queue "slave".  Change this cluster queue name to "aslave" and see
what happens. 
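
For example, roughly (untested, and assuming the cluster queue you want to
rename is currently called slave.q - adjust the names to your setup), you can
dump the queue, re-add it under the new name and then remove the old one:

    # dump the existing cluster queue configuration to a file
    qconf -sq slave.q > /tmp/aslave.q

    # edit /tmp/aslave.q and change the "qname" entry to "aslave.q"

    # add the queue under its new name, then delete the old queue
    qconf -Aq /tmp/aslave.q
    qconf -dq slave.q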

Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



