[GE users] SGE6.0u6, queue instance seq_no

Mark Dixon m.c.dixon at leeds.ac.uk
Fri Jan 27 14:20:50 GMT 2006


I'm having difficulty getting the instances within a cluster queue sorted
by sequence number, and I was wondering if someone could check that what
I'm doing isn't obviously wrong. I've looked through the mailing list
archives for the last year or so and seem to be doing everything that has
been suggested there.

I'm using SGE 6.0u6 on AMD Opteron.

- I have asked the scheduler to sort by sequence number

   $ qconf -ssconf | grep seqno
   queue_sort_method                 seqno

- I have set the sequence number of each individual instance within a
   particular cluster queue:

   $ qconf -sq bigmem.q | head
   qname                 bigmem.q
   hostlist              @cloud
   seq_no                0,[cloud00.everest.leeds.ac.uk=1], \
                         [cloud01.everest.leeds.ac.uk=2], \
                         [cloud02.everest.leeds.ac.uk=3], \
                         [cloud03.everest.leeds.ac.uk=4], \
                         [cloud04.everest.leeds.ac.uk=5], \
                         [cloud05.everest.leeds.ac.uk=6], \
   load_thresholds       np_load_avg=1.75

- This seems to be acceptable to SGE:

   $ qconf -sq bigmem.q@cloud02.everest.leeds.ac.uk | head
   qname                 bigmem.q
   hostname              cloud02.everest.leeds.ac.uk
   seq_no                3
   load_thresholds       np_load_avg=1.75
   suspend_thresholds    NONE
   nsuspend              1
   suspend_interval      00:05:00
   priority              0
   min_cpu_interval      00:05:00
   processors            UNDEFINED

- I then submit several shell scripts at once, each containing just a
   "sleep 300". The queue instances have more than one slot each. Some jobs
   end up on the same host, but in general they are scattered across the
   instances within bigmem.q as though they were sorted by load. I was
   expecting them to fill up the first node, then the second, etc.

- Creating three queues with a single instance in each, and giving each
   queue a different sequence number, gives the expected behaviour: the
   slots on the low-numbered hosts are completely filled before moving on
   to the next one.

- There don't appear to be any bugs fixed after 6.0u6 relating to this.
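
The fill order expected above (use every slot on the instance with the lowest seq_no, then move to the next) can be sketched as a small shell function. This only illustrates the expectation; it is not how the SGE scheduler is implemented. The helper name expected_fill, the shortened host names and the slot count of 2 per instance are all assumptions made for the example.

```shell
#!/bin/sh
# Illustration only: models the *expected* seqno fill order, not SGE itself.
# Usage: expected_fill NJOBS SLOTS HOST...
#   NJOBS  number of jobs submitted at once
#   SLOTS  slots per queue instance (assumed equal on every host)
#   HOST   hosts listed in ascending seq_no order (hypothetical short names)
expected_fill() {
    njobs=$1
    slots=$2
    shift 2
    job=1
    for host in "$@"; do
        used=0
        # Fill every slot on this host before moving on to the next one.
        while [ "$used" -lt "$slots" ] && [ "$job" -le "$njobs" ]; do
            echo "job $job -> bigmem.q@$host"
            job=$((job + 1))
            used=$((used + 1))
        done
    done
}

# Five jobs, two slots per instance, three instances in seq_no order.
expected_fill 5 2 cloud00 cloud01 cloud02
```

Comparing this list with the placement actually reported by `qstat -f` for the submitted jobs should make any departure from sequence-number order easy to spot.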

Does anyone know what I'm doing wrong?


Mark Dixon                       Email    : m.c.dixon at leeds.ac.uk
Unix team                        Tel (int): 35429
Information Systems Services     Tel (ext): 0113 343 5429
University of Leeds, LS2 9JT, UK
