[GE users] SGE6.0u6, queue instance seq_no

Reuti reuti at staff.uni-marburg.de
Fri Jan 27 15:35:52 GMT 2006


Hi,

On 27.01.2006 at 15:20, Mark Dixon wrote:

> Hi,
>
> I'm having a bit of difficulty in sorting the instances within a  
> cluster queue by sequence number and I was wondering if someone  
> could check that what I'm doing isn't obviously wrong. I've looked  
> at the mailing list archives for the last year or so and seem to be  
> doing everything as suggested by people.
>
> I'm using SGE 6.0u6 on AMD Opteron.
>
> - I have asked the scheduler to sort by sequence number
>
>   $ qconf -ssconf | grep seqno
>   queue_sort_method                 seqno
>
> - I have set the sequence number of each individual instance within a
>   particular cluster queue:
>
>   $ qconf -sq bigmem.q | head
>   qname                 bigmem.q
>   hostlist              @cloud
>   seq_no                0,[cloud00.everest.leeds.ac.uk=1], \

Are there more than 7 hosts in total in bigmem.q? If so, I'd suggest
using e.g. 99 as the default instead of 0, so that hosts without an
explicit entry sort after the numbered ones. - Reuti

>                         [cloud01.everest.leeds.ac.uk=2], \
>                         [cloud02.everest.leeds.ac.uk=3], \
>                         [cloud03.everest.leeds.ac.uk=4], \
>                         [cloud04.everest.leeds.ac.uk=5], \
>                         [cloud05.everest.leeds.ac.uk=6], \
>                         [cloud06.everest.leeds.ac.uk=7]
>   load_thresholds       np_load_avg=1.75
>
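
As a rough illustration of why the default value matters, here is a small
Python sketch (not SGE code) of how a seq_no line with per-host overrides
could be resolved for a single host. The function name effective_seq_no and
the host cloud07 are hypothetical, and the sketch ignores host groups; it
only shows that a host without its own entry falls back to the default.

```python
import re

def effective_seq_no(spec, host):
    """Resolve a value like '0,[hostA=1],[hostB=2]' for one host (toy model)."""
    # The part before the first comma is the cluster-queue default.
    default = spec.split(",", 1)[0].strip()
    # Each [host=value] pair is a per-host override.
    overrides = dict(re.findall(r"\[([^=\]]+)=(\d+)\]", spec))
    return int(overrides.get(host, default))

spec = "0,[cloud00.everest.leeds.ac.uk=1],[cloud01.everest.leeds.ac.uk=2]"
listed = effective_seq_no(spec, "cloud01.everest.leeds.ac.uk")
# cloud07 is a hypothetical host with no override of its own.
unlisted = effective_seq_no(spec, "cloud07.everest.leeds.ac.uk")
print(listed, unlisted)
```

With a default of 0 the unlisted host gets the lowest number of all and
would be tried first; with a default of 99 it would come last.
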
> - This seems to be acceptable to SGE:
>
>   $ qconf -sq bigmem.q@cloud02.everest.leeds.ac.uk | head
>   qname                 bigmem.q
>   hostname              cloud02.everest.leeds.ac.uk
>   seq_no                3
>   load_thresholds       np_load_avg=1.75
>   suspend_thresholds    NONE
>   nsuspend              1
>   suspend_interval      00:05:00
>   priority              0
>   min_cpu_interval      00:05:00
>   processors            UNDEFINED
>
> - I then submit several shell scripts, with just a "sleep 300" in them, at
>   once. The queue instances have >1 slots each. Some jobs end up on the
>   same host, but in general they seem to be scattered across the instances
>   within bigmem.q as though they were sorted by load. I was expecting them
>   to fill up the first node, then the second, etc.
>
> - Creating three queues with a single instance within each, with each
>   queue a different sequence number, gives the expected behaviour: the
>   slots on the low-numbered hosts are completely filled before moving onto
>   the next one.
>
> - There don't appear to be any bugs fixed post-6.0u6 relating to this
>   issue.
>
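
The fill-up behaviour expected in the list above can be written down as a
toy model in Python. This is only a sketch of the expectation, not of the
scheduler's actual algorithm; the function name fill_by_seq_no, the short
host names and the slot count of 2 per instance are assumptions made for
the example.

```python
def fill_by_seq_no(instances, n_jobs):
    """Place jobs one at a time on the lowest-seq_no instance with a free slot.

    instances: list of (host, seq_no, slots) tuples.
    Returns a dict mapping host -> number of jobs placed there.
    """
    placed = {host: 0 for host, _, _ in instances}
    ordered = sorted(instances, key=lambda inst: inst[1])
    for _ in range(n_jobs):
        for host, _, slots in ordered:
            if placed[host] < slots:
                placed[host] += 1
                break  # job placed; jobs beyond total capacity are dropped
    return placed

# Assumed: three instances with 2 slots each.
instances = [("cloud00", 1, 2), ("cloud01", 2, 2), ("cloud02", 3, 2)]
print(fill_by_seq_no(instances, 3))
```

Under this model three jobs land as two on cloud00 and one on cloud01,
rather than one per host as load-based sorting would tend to give.
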
> Does anyone know what I'm doing wrong?
>
> Thanks,
>
> Mark
> -- 
> -----------------------------------------------------------------
> Mark Dixon                       Email    : m.c.dixon at leeds.ac.uk
> Unix team                        Tel (int): 35429
> Information Systems Services     Tel (ext): 0113 343 5429
> University of Leeds, LS2 9JT, UK
> -----------------------------------------------------------------
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
