[GE users] master node selection and $fill_up behaviour

weiser m.weiser at science-computing.de
Thu Jul 22 11:02:00 BST 2010


Hi Andy,

On Thu, Jul 22, 2010 at 09:50:48AM +0200, Andy Schwierskott wrote:

> If queue_sort_method is seq_no then the second sort criterion is the
> load (according to the load_formula). The load_adjustments still apply
> on top of that.

> Without looking at the complete picture it may indeed look erratic.

> SGE internally uses load values with a few more digits than you see in
> qstat/qhost (do a qconf -se <hostname>). That's another thing that may
> make the scheduling decisions look erratic.
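
To make sure we're talking about the same mechanism, this is my mental
model of that ranking as a small Python sketch; the linear decay and
all names in it are my assumptions, not taken from the SGE sources:

# Illustrative only: host ranking with queue_sort_method seq_no.
# Equal seq_no falls back to the load_formula value (np_load_avg
# here), plus a decaying job_load_adjustment per recent job.

def adjusted_load(np_load_avg, job_ages, adjustment=0.50, decay=450.0):
    """Load the scheduler supposedly sees for a host.

    job_ages: seconds since each recent job start on this host.
    Assumes the adjustment decays linearly to zero over
    load_adjustment_decay_time (0:7:30 = 450 s).
    """
    total = np_load_avg
    for age in job_ages:
        total += adjustment * max(0.0, 1.0 - age / decay)
    return total

# Ten minutes after job 346 started on l5-node07 its adjustment has
# fully decayed, so only the raw loads should count:
loads = {
    "l5-node05": adjusted_load(0.02, []),
    "l5-node07": adjusted_load(1.16, [600]),
}
print(min(loads, key=loads.get))  # expected winner: l5-node05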

How do you explain the behaviour below, then? Submitting onto an empty
cluster (SGE 6.2u5) with pauses of 10 minutes between submits to let
the load values level out, I get the second job assigned to the most
highly loaded node.

scmic at l5-auto-du ~ $ while [ 1 ] ; do echo "while [ 1 ] ; do true ;
done" | qsub -pe dmp 3 ; sleep 600 ; done
Your job 346 ("STDIN") has been submitted
Your job 347 ("STDIN") has been submitted

scmic at l5-auto-du ~ $ qhost -j
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
l5-node01               lx26-amd64      4  0.02    7.9G  383.9M    9.8G     0.0
l5-node02               lx26-amd64      4  0.04    7.9G  374.5M    9.8G     0.0
   job-ID  prior   name       user         state submit/start at     queue      master ja-task-ID
   ----------------------------------------------------------------------------------------------
       347 0.55500 STDIN      scmic        r     07/22/2010 10:24:41 express-dm SLAVE
                                                                     express-dm SLAVE
l5-node03               lx26-amd64      4  0.01    7.9G  383.1M    9.8G     0.0
l5-node04               lx26-amd64      4  0.53    7.9G  392.4M    9.8G     0.0
l5-node05               lx26-amd64      4  0.02    7.9G  378.3M    9.8G     0.0
l5-node06               lx26-amd64      4  0.03    7.9G  390.5M    9.8G     0.0
l5-node07               lx26-amd64      4  1.16    7.9G  386.5M    9.8G     0.0
       346 0.55500 STDIN      scmic        r     07/22/2010 10:14:41 express-dm MASTER
                                                                     express-dm SLAVE
                                                                     express-dm SLAVE
       347 0.55500 STDIN      scmic        r     07/22/2010 10:24:41 express-dm MASTER
l5-node08               lx26-amd64      4  0.03    7.9G  372.7M    9.8G     0.0

After a few more submits it looks like this:

scmic at l5-auto-du ~ $ while [ 1 ] ; do echo "while [ 1 ] ; do true ;
done" | qsub -pe dmp 3 ; sleep 600 ; done
Your job 346 ("STDIN") has been submitted
Your job 347 ("STDIN") has been submitted
Your job 348 ("STDIN") has been submitted
Your job 349 ("STDIN") has been submitted
Your job 350 ("STDIN") has been submitted

scmic at l5-auto-du ~ $ qhost -j
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
l5-node01               lx26-amd64      4  0.02    7.9G  387.2M    9.8G     0.0
l5-node02               lx26-amd64      4  1.03    7.9G  375.0M    9.8G     0.0
   job-ID  prior   name       user         state submit/start at     queue      master ja-task-ID
   ----------------------------------------------------------------------------------------------
       347 0.55500 STDIN      scmic        r     07/22/2010 10:24:41 express-dm SLAVE
                                                                     express-dm SLAVE
       348 0.55500 STDIN      scmic        r     07/22/2010 10:34:41 express-dm MASTER
                                                                     express-dm SLAVE
l5-node03               lx26-amd64      4  0.01    7.9G  379.7M    9.8G     0.0
l5-node04               lx26-amd64      4  0.04    7.9G  389.4M    9.8G     0.0
l5-node05               lx26-amd64      4  0.04    7.9G  378.3M    9.8G     0.0
l5-node06               lx26-amd64      4  0.92    7.9G  393.2M    9.8G     0.0
       348 0.55500 STDIN      scmic        r     07/22/2010 10:34:41 express-dm SLAVE
       349 0.55500 STDIN      scmic        r     07/22/2010 10:44:41 express-dm MASTER
                                                                     express-dm SLAVE
                                                                     express-dm SLAVE
l5-node07               lx26-amd64      4  2.01    7.9G  388.4M    9.8G     0.0
       346 0.55500 STDIN      scmic        r     07/22/2010 10:14:41 express-dm MASTER
                                                                     express-dm SLAVE
                                                                     express-dm SLAVE
       347 0.55500 STDIN      scmic        r     07/22/2010 10:24:41 express-dm MASTER
l5-node08               lx26-amd64      4  0.28    7.9G  372.6M    9.8G     0.0
       350 0.55500 STDIN      scmic        r     07/22/2010 10:54:42 express-dm MASTER
                                                                     express-dm SLAVE
                                                                     express-dm SLAVE

Again, there are three nodes that aren't touched at all, although
they're completely idle.

The relevant scheduler config entries are:

schedule_interval                 0:0:15
queue_sort_method                 seqno
job_load_adjustments              np_load_avg=0.50
load_adjustment_decay_time        0:7:30
load_formula                      np_load_avg
flush_submit_sec                  1
flush_finish_sec                  1
reprioritize_interval             0:0:0
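
The 10-minute pause between submits is longer than the decay time, so
any adjustment from the previous job should be gone before the next
one is scheduled (again assuming linear decay, which is my guess):

decay_time = 7 * 60 + 30  # load_adjustment_decay_time 0:7:30 -> 450 s
pause      = 600          # sleep 600 between submits
remaining  = max(0.0, 1.0 - pause / decay_time)
print(remaining)          # 0.0 -> only raw np_load_avg should matter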

The queue definition includes:

qname                 express-dmp
seq_no                0
load_thresholds       np_load_avg=1.75
pe_list               dmp smp

The PE definition includes:

pe_name            dmp
slots              999
allocation_rule    $fill_up
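
Given that, here is my expectation of how $fill_up allocates slots,
again only a sketch of my assumption about the semantics, not the
actual scheduler code:

# Hypothetical sketch of $fill_up: take as many slots as possible
# from the first host in the scheduler's ranking, then move on.
def fill_up(ranked_hosts, slots_needed, free_slots_per_host=4):
    allocation = {}
    for host in ranked_hosts:
        if slots_needed <= 0:
            break
        take = min(free_slots_per_host, slots_needed)
        allocation[host] = take
        slots_needed -= take
    return allocation

# A 3-slot dmp job on an otherwise empty cluster should land
# entirely on one host, ideally the least loaded one:
print(fill_up(["l5-node01", "l5-node03", "l5-node05"], 3))
# -> {'l5-node01': 3}

So I'd expect every new 3-slot job to end up on a completely idle
node first, which is not what I'm seeing.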

Thanks,
-- 
Michael Weiser                science + computing ag
Senior Systems Engineer       Geschaeftsstelle Duesseldorf
                              Martinstrasse 47-55, Haus A
phone: +49 211 302 708 32     D-40223 Duesseldorf
fax:   +49 211 302 708 50     www.science-computing.de
-- 
Vorstand/Board of Management:
Dr. Bernd Finkbeiner, Dr. Roland Niemeier, 
Dr. Arno Steitz, Dr. Ingrid Zech
Vorsitzender des Aufsichtsrats/
Chairman of the Supervisory Board:
Michel Lepert
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196
