[GE users] Sort by sequence number question

Reuti reuti at staff.uni-marburg.de
Tue Jul 17 16:41:22 BST 2007



On 17.07.2007, at 17:04, Paul MacInnis wrote:

> Hi Erik,
>
> On Tue, 17 Jul 2007, Erik Lönroth wrote:
>
>>
>>
>> On tis, 2007-07-17 at 16:12 +0200, Andreas.Haas at Sun.COM wrote:
>>> Hi Paul,
>>>
>>> On Mon, 16 Jul 2007, Paul MacInnis wrote:
>>>
>>>> On Wed, 11 Jul 2007, Paul MacInnis wrote:
>>>>
>>>>> On Wed, 11 Jul 2007, Lönroth Erik wrote:
>>>>>
>>>>>> yes, it is set. Still no luck on this.
>>>>>>
>>>>>> The only way I can force the damn slaves off the MASTER node is
>>>>>> to remove the requested "PE" explicitly from the nodes. This is
>>>>>> not what I want, but I just can't make it happen. It simply
>>>>>> ignores my sequence number altogether. I have recreated all
>>>>>> queues and restarted the qmaster and scheduler, but no luck
>>>>>> whatsoever.
>>>>>>
>>>>>> Is there something else, outside of the general queue
>>>>>> configuration and the cluster config, that affects the
>>>>>> "sequence number"?
>>>>>
>>>>> I would like to add our experience to this discussion.
>>>>>
>>>>> We recently switched from SGE 5.3 to 6.1. We have 1G, 2G and 4G
>>>>> nodes in our cluster. If a job doesn't specify special memory
>>>>> requirements, we want it scheduled to the smallest-memory machine
>>>>> available.
>>>>>
>>>>> For 4 years with SGE 5.3 this worked well. We assigned sequence
>>>>> number 1965 to the 1G nodes, 2965 to the 2G nodes and 4965 to the
>>>>> 4G nodes.
>>>>>
>>>>> With SGE 6.1 we defined our queues with
>>>>> seq_no  1965,[@2g.hg=2965],[@4g.hg=4965]
>>>>>
>>>>> @2g.hg being the 2G host group and @4g.hg the 4G host group.
>>>>>
>>>>> With qconf -msconf we defined:
>>>>> queue_sort_method     seqno
>>>>>
>>>>> qstat -F presents the queues correctly ordered by this seqno.
>>>>> However, jobs are being scheduled to 2G and 4G nodes when there
>>>>> are 1G nodes available!
>>>>>
>>>>> This never happened in SGE5.3!
>>>>>
>>>>> It seems that in 6.1 either
>>>>> 1. "queue_sort_method  seqno" isn't working for queue selection, or
>>>>> 2. there is some other queue selection criterion that overrides
>>>>>    "queue_sort_method  seqno"
>>>>>
>>>>> Any thoughts?
>>>>>
>>>>> Paul
>>>>
>>>> Here's what seems to be happening.
>>>>
>>>> For serial jobs we have 2 cluster queues: ser.q and bg.q.
>>>>
>>>> ser.q is the main serial queue; bg.q (priority 19) is meant to be
>>>> used only when the load on a node (load_avg and mem_used) is
>>>> unexpectedly light. Generally the same nodes are assigned to each
>>>> cluster queue.
>>>>
>>>> ser.q uses seqno 1965, 2965 and 4965 for its 1G, 2G and 4G nodes.
>>>>
>>>> bg.q uses seqno 2969 and 4969 for its 2G and 4G nodes (no 1G
>>>> nodes).
>>>>
>>>> The intention is that when a serial job appears, nodes would be
>>>> considered in this order:
>>>>
>>>> 1G ser.q,  2G ser.q,  2G bg.q,  4G ser.q,  4G bg.q
>>>>
>>>> However, what's actually happening seems to be this order:
>>>>
>>>> 1G ser.q,  2G bg.q,  4G bg.q,  2G ser.q,  4G ser.q
>>>>
>>>> It seems that for scheduling, cluster queues are considered first
>>>> in alphabetical order, and only within a cluster queue are queue
>>>> instances considered in seqno order!
>>>>
>>>> qstat however presents queue instances as intended - strictly by
>>>> seqno.
>>>>
>>>> Is the solution to name our cluster queues so that they
>>>> alphabetically match the order in which we wish the scheduler to
>>>> consider them? Or is there some other setting that we've missed?
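>>>>
>>>> (If renaming turns out to be the answer: a cluster queue can't be
>>>> renamed in place, so a sketch of the usual workaround, with a
>>>> hypothetical temp file, would be:
>>>>
>>>> qconf -sq bg.q > /tmp/abg.q   # dump the existing queue config
>>>> # edit the "qname" line in /tmp/abg.q to the new name, e.g. abg.q
>>>> qconf -Aq /tmp/abg.q          # add the queue under the new name
>>>> qconf -dq bg.q                # delete the old one once it's drained
>>>> )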
>>>
>>> I cannot reproduce this. Here is my queue set-up:
>>>
>>>> qconf -ssconf | grep sort
>>>     queue_sort_method                 seqno
>>>
>>>> qconf -shgrp @oneG
>>>     group_name @oneG
>>>     hostlist angbor
>>>
>>>> qconf -shgrp @twoG
>>>     group_name @twoG
>>>     hostlist es-ergb01-01
>>>
>>>> qconf -shgrp @fourG
>>>     group_name @fourG
>>>     hostlist baumbart
>>>
>>>> qconf -sq test_ser.q | egrep "hostlist|seq|load_thre|slots"
>>>     hostlist              @oneG @twoG @fourG
>>>     seq_no                0,[@oneG=1965],[@twoG=2965],[@fourG=4965]
>>>     load_thresholds       NONE
>>>     slots                 1
>>>
>>>> qconf -sq test_bg.q | egrep "hostlist|seq|load_thre|slots"
>>>     hostlist              @twoG @fourG
>>>     seq_no                0,[@twoG=2969],[@fourG=4969]
>>>     load_thresholds       NONE
>>>     slots                 1
>>>
>>> when I submit
>>>
>>>> qsub -t 1-5 -q 'test_*' -b y /bin/sleep 5
>>>     Your job-array 528.1-5:1 ("sleep") has been submitted
>>>
>>> I get queues filled in the order of the array task indices
>>>
>>>> qstat -f -q 'test_*'
>>>     queuename                      qtype used/tot. load_avg arch        states
>>>     ----------------------------------------------------------------------------
>>>     test_ser.q at angbor              BIP   1/1       0.04     lx24-x86
>>>         528 0.55500 sleep      ah114088     t     07/17/2007 16:02:19     1 1
>>>     ----------------------------------------------------------------------------
>>>     test_ser.q at es-ergb01-01        BIP   1/1       0.42     sol-sparc64
>>>         528 0.55500 sleep      ah114088     t     07/17/2007 16:02:19     1 2
>>>     ----------------------------------------------------------------------------
>>>     test_bg.q at es-ergb01-01         BIP   1/1       0.42     sol-sparc64
>>>         528 0.55500 sleep      ah114088     t     07/17/2007 16:02:19     1 3
>>>     ----------------------------------------------------------------------------
>>>     test_ser.q at baumbart            BIP   1/1       0.19     irix65
>>>         528 0.55500 sleep      ah114088     t     07/17/2007 16:02:19     1 4
>>>     ----------------------------------------------------------------------------
>>>     test_bg.q at baumbart             BIP   1/1       0.19     irix65
>>>         528 0.55500 sleep      ah114088     t     07/17/2007 16:02:19     1 5
>>>
>>> and the same is true with plain sequential jobs
>>>
>>>> ntimes 5 qsub -q 'test_*' -b y /bin/sleep 5
>>>     Your job 534 ("sleep") has been submitted
>>>     Your job 535 ("sleep") has been submitted
>>>     Your job 536 ("sleep") has been submitted
>>>     Your job 537 ("sleep") has been submitted
>>>     Your job 538 ("sleep") has been submitted
>>>
>>>> qstat -f -q 'test_*'
>>>     queuename                      qtype used/tot. load_avg arch        states
>>>     ----------------------------------------------------------------------------
>>>     test_ser.q at angbor              BIP   1/1       0.10     lx24-x86
>>>         534 0.55500 sleep      ah114088     r     07/17/2007 16:07:09     1
>>>     ----------------------------------------------------------------------------
>>>     test_ser.q at es-ergb01-01        BIP   1/1       0.34     sol-sparc64
>>>         535 0.55500 sleep      ah114088     r     07/17/2007 16:07:09     1
>>>     ----------------------------------------------------------------------------
>>>     test_bg.q at es-ergb01-01         BIP   1/1       0.34     sol-sparc64
>>>         536 0.55500 sleep      ah114088     t     07/17/2007 16:07:09     1
>>>     ----------------------------------------------------------------------------
>>>     test_ser.q at baumbart            BIP   1/1       0.19     irix65
>>>         537 0.55500 sleep      ah114088     t     07/17/2007 16:07:09     1
>>>     ----------------------------------------------------------------------------
>>>     test_bg.q at baumbart             BIP   1/1       0.19     irix65
>>>         538 0.55500 sleep      ah114088     t     07/17/2007 16:07:09     1
>>>
>>> I did this with N1GE 6.1
>>>
>>> Could it be that the jobs are submitted with the -soft option to
>>> specify some preference? Or are you using some over-sensitive load
>>> thresholds?
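>>>
>>> (A quick way to rule both out, as a sketch:
>>>
>>> qstat -j <jobid>                         # shows any -soft/-hard requests
>>> qconf -sq ser.q | grep load_thresholds   # per-queue alarm thresholds
>>>
>>> A queue instance whose load exceeds its load_thresholds goes into
>>> alarm state and is skipped regardless of its sequence number.)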
>>>
>>> Regards,
>>> Andreas
>>>
>>
>>
>> Here are my outputs from the same commands (as far as they apply; I
>> don't have that many host groups):
>>
>>
>> # sort method
>> qconf -ssconf | grep sort
>> queue_sort_method                 seqno
>>
>>
>> # my only HG here
>> qconf -shgrp @ts101_X_hg
>> group_name @ts101_X_hg
>> hostlist ts101-1-0.sss.se.scania.com ts101-1-1.sss.se.scania.com
>>
>> # show the only master queue
>> qconf -sq master.101.q | egrep "hostlist|seq|load_thre|slots"
>> hostlist              ts101-1-0.sss.se.scania.com
>> seq_no                1019
>> load_thresholds       np_load_avg=1.75
>> slots                 1
>>
>> # show the "slave"/"short" queue.
>> qconf -sq short.101.q | egrep "hostlist|seq|load_thre|slots"
>> hostlist              @ts101_X_hg
>> seq_no                101,[ts101-1-1.sss.se.scania.com=0], \
>> load_thresholds       np_load_avg=1.75
>> slots                 4
>>
>> (See the "\" character at the end of the "seq_no" line? I don't
>> know if that might be a problem.)
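>>
>> (If so, whatever followed that continuation may have been lost;
>> with the stray ", \" removed the line would at least be valid:
>>
>> seq_no                101,[ts101-1-1.sss.se.scania.com=0]
>> )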
>>
>> Now, this is the complete job (slot-allocation.job):
>>
>> #!/bin/bash
>> #$ -S /bin/bash                # shell to interpret the job script
>> #$ -N slot-allocation          # job name
>> #$ -cwd                        # run in the submit directory
>> #$ -o output.$JOB_ID
>> #$ -e errors.$JOB_ID
>> #$ -pe powerflow_*_pe 5        # any matching PE, 5 slots
>> #$ -masterq master.*.q         # force the MASTER task into a master queue
>> echo "Starting on: ${HOSTNAME}"
>> echo "$PE_HOSTFILE contains:"
>> cat $PE_HOSTFILE               # show the granted host/slot list
>> sleep 30
>>
>> This is the submit:
>>
>> qsub slot-allocation.job
>>
>>
>> And - the tragic output from qstat:
>>
>> qstat -t
>>
>>     272 0.55500 slot-alloc sssler       r     07/17/2007 16:39:01
>> short.101.q at ts101-1-0.sss.se.s SLAVE
>> short.101.q at ts101-1-0.sss.se.s SLAVE
>> short.101.q at ts101-1-0.sss.se.s SLAVE
>> short.101.q at ts101-1-0.sss.se.s SLAVE
>>     272 0.55500 slot-alloc sssler       r     07/17/2007 16:39:01
>> master.101.q at ts101-1-0.sss.se. MASTER
>>
>>
>> Arghh!
>>
>> /Erik

BTW, for parallel jobs I remember this:
http://gridengine.sunsource.net/issues/show_bug.cgi?id=1311
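
Independent of that bug, the PE's allocation rule also influences
where slave slots land. A sketch of a quick check (the concrete PE
name is my guess, expanded from your "powerflow_*_pe" wildcard):

qconf -sp powerflow_101_pe | grep allocation_rule

With allocation_rule $fill_up the scheduler packs slots host by host,
which can place all slave slots on the master task's node regardless
of seq_no.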

-- Reuti


> I would claim that this happens because you still call your slave
> cluster queue "slave". Change this cluster queue name to "aslave"
> and see what happens.
>
> Paul
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



