[GE users] scheduling oddity/bug when a complex attribute is compared to 0 as a queue Load Threshold

txema_heredia txema.heredia@upf.edu
Thu Sep 3 15:22:29 BST 2009



Hi all

I think I've found a bug/oddity in SGE 6.1u4. I don't know whether it has already been reported or fixed in a later release (I've searched the discussion lists and the release notes but couldn't find anything).


I have configured my SGE installation and added several consumable attributes, which I've assigned to the nodes. Specifically, the two I'm working with here are "num_jobs" and "medium_jobs", and they are configured as follows:

# qconf -sc
#name               shortcut       type        relop requestable consumable default  urgency
#--------------------------------------------------------------------------------------------
...
medium_jobs         med_jobs       INT         <=    YES         YES        0        0
...
num_jobs            n_jobs         INT         <=    YES         YES        0        0
...
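
For reference, entries like these live in the complex configuration; a minimal sketch of how the two attributes above could have been added (this step isn't shown in the original report, so the exact method is an assumption):

# qconf -mc
(opens $EDITOR on the complex list; append the medium_jobs and num_jobs
lines exactly as they appear in the qconf -sc output above, then save)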


I have also set the value of those two consumable resources on the execution hosts (both attributes are set to 8 on both hosts):

# qconf -se compute-0-3.local
hostname              compute-0-3.local
load_scaling          NONE
complex_values        medium_jobs=8,slow_jobs=8,fast_medium_jobs=8, \
                      fast_slow_jobs=8,medium_slow_jobs=8,num_jobs=8,fast_jobs=8
load_values           arch=lx26-amd64,num_proc=8,mem_total=8985.105469M, \
                      swap_total=20002.796875M,virtual_total=28987.902344M, \
                      load_avg=0.000000,load_short=0.000000, \
                      load_medium=0.000000,load_long=0.000000, \
                      mem_free=8851.113281M,swap_free=19980.285156M, \
                      virtual_free=28831.398438M,mem_used=133.992188M, \
                      swap_used=22.511719M,virtual_used=156.503906M, \
                      cpu=0.100000,np_load_avg=0.000000, \
                      np_load_short=0.000000,np_load_medium=0.000000, \
                      np_load_long=0.000000
processors            8
user_lists            NONE
xuser_lists           NONE
projects              NONE
xprojects             NONE
usage_scaling         NONE
report_variables      mem_free,mem_total,mem_used,swap_free,swap_rate, \
                      swap_rsvd,swap_total,swap_used,virtual_free, \
                      virtual_total,virtual_used

# qconf -se compute-0-4.local
hostname              compute-0-4.local
load_scaling          NONE
complex_values        medium_jobs=8,slow_jobs=8,fast_medium_jobs=8, \
                      fast_slow_jobs=8,medium_slow_jobs=8,num_jobs=8, \
                      fast_jobs=8,num_jobs2=8
load_values           arch=lx26-amd64,num_proc=8,mem_total=8985.105469M, \
                      swap_total=20002.796875M,virtual_total=28987.902344M, \
                      load_avg=0.000000,load_short=0.000000, \
                      load_medium=0.000000,load_long=0.000000, \
                      mem_free=8849.539062M,swap_free=19981.144531M, \
                      virtual_free=28830.683594M,mem_used=135.566406M, \
                      swap_used=21.652344M,virtual_used=157.218750M, \
                      cpu=0.100000,np_load_avg=0.000000, \
                      np_load_short=0.000000,np_load_medium=0.000000, \
                      np_load_long=0.000000
processors            8
user_lists            NONE
xuser_lists           NONE
projects              NONE
xprojects             NONE
usage_scaling         NONE
report_variables      NONE
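
The complex_values shown above can be set either interactively with "qconf -me <hostname>" or non-interactively with qconf -mattr; a sketch of the non-interactive form (the original doesn't show how the hosts were configured):

# qconf -mattr exechost complex_values medium_jobs=8,num_jobs=8 compute-0-3.local
# qconf -mattr exechost complex_values medium_jobs=8,num_jobs=8 compute-0-4.local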





My problem comes here. When I want to run some jobs (20 in this example) that each consume one unit of both attributes (i.e. qsub .... -l medium_jobs=1 -l num_jobs=1 ...) in a queue whose load threshold is set to num_jobs=0, the scheduler does a STUPID thing:

It sees all the jobs in "qw" state. The scheduler runs, schedules 2 jobs on each execution node, and then stops dispatching jobs to the hosts for no apparent reason (magic, maybe?).
30 seconds later (the scheduling interval in this example; I've tried other values too), the scheduler runs again and now schedules the rest of the jobs (which it should have scheduled the first time, before stopping).
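
The 30-second gap matches the schedule_interval parameter of the scheduler configuration; it can be checked and changed like this (standard qconf usage, not shown in the original report; on this cluster it should show the 30-second value):

# qconf -ssconf | grep schedule_interval
schedule_interval                 0:0:30
# qconf -msconf      (opens $EDITOR to modify it)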




This is the command I use to submit the job:

qsub -m n -q test-medium-med -N medium-$i  -l medium_jobs=1 -l num_jobs=1 -e /dev/null -o /dev/null -S php prova_cues/proces.php
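
Since the job name contains $i, the 20 jobs were presumably submitted in a loop along these lines (a sketch; only the single qsub line above appears in the original):

for i in $(seq 0 19); do
    qsub -m n -q test-medium-med -N medium-$i -l medium_jobs=1 -l num_jobs=1 \
         -e /dev/null -o /dev/null -S php prova_cues/proces.php
done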


This is the queue configuration:

# qconf -sq test-medium-med
qname                 test-medium-med
hostlist              @med
seq_no                0
load_thresholds       num_jobs=0                <---------
suspend_thresholds    NONE
nsuspend              1
suspend_interval      00:05:00
priority              0
min_cpu_interval      00:05:00
processors            UNDEFINED
qtype                 BATCH INTERACTIVE
ckpt_list             NONE
pe_list               make
rerun                 FALSE
slots                 8
tmpdir                /tmp
shell                 /bin/bash
prolog                NONE
epilog                NONE
shell_start_mode      posix_compliant
starter_method        NONE
suspend_method        NONE
resume_method         NONE
terminate_method      NONE
notify                00:00:60
owner_list            NONE
user_lists            test_users txema
xuser_lists           NONE
subordinate_list      NONE
complex_values        NONE
projects              NONE
xprojects             NONE
calendar              NONE
initial_state         enabled
s_rt                  INFINITY
h_rt                  48:00:00
s_cpu                 INFINITY
h_cpu                 INFINITY
s_fsize               INFINITY
h_fsize               INFINITY
s_data                INFINITY
h_data                INFINITY
s_stack               INFINITY
h_stack               INFINITY
s_core                INFINITY
h_core                INFINITY
s_rss                 INFINITY
h_rss                 INFINITY
s_vmem                INFINITY
h_vmem                INFINITY
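
The load_thresholds line marked above is the one under test. It can be switched without an editor session via qconf -mattr (my sketch; the original doesn't say how the queue was edited):

# qconf -mattr queue load_thresholds num_jobs=0 test-medium-med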


And here, the output of qstat:


# qstat -u theredia | sort -k 1
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
 734414 0.00000 medium-0   theredia     qw    09/03/2009 15:03:54                                    1
 734415 0.00000 medium-1   theredia     qw    09/03/2009 15:03:54                                    1
 734416 0.00000 medium-2   theredia     qw    09/03/2009 15:03:54                                    1
 734417 0.00000 medium-3   theredia     qw    09/03/2009 15:03:54                                    1
 734418 0.00000 medium-4   theredia     qw    09/03/2009 15:03:54                                    1
 734419 0.00000 medium-5   theredia     qw    09/03/2009 15:03:55                                    1
 734420 0.00000 medium-6   theredia     qw    09/03/2009 15:03:55                                    1
 734421 0.00000 medium-7   theredia     qw    09/03/2009 15:03:55                                    1
 734422 0.00000 medium-8   theredia     qw    09/03/2009 15:03:55                                    1
 734423 0.00000 medium-9   theredia     qw    09/03/2009 15:03:55                                    1
 734424 0.00000 medium-10  theredia     qw    09/03/2009 15:03:55                                    1
 734425 0.00000 medium-11  theredia     qw    09/03/2009 15:03:55                                    1
 734426 0.00000 medium-12  theredia     qw    09/03/2009 15:03:55                                    1
 734427 0.00000 medium-13  theredia     qw    09/03/2009 15:03:55                                    1
 734428 0.00000 medium-14  theredia     qw    09/03/2009 15:03:56                                    1
 734429 0.00000 medium-15  theredia     qw    09/03/2009 15:03:56                                    1
 734430 0.00000 medium-16  theredia     qw    09/03/2009 15:03:56                                    1
 734431 0.00000 medium-17  theredia     qw    09/03/2009 15:03:56                                    1
 734432 0.00000 medium-18  theredia     qw    09/03/2009 15:03:56                                    1
 734433 0.00000 medium-19  theredia     qw    09/03/2009 15:03:56                                    1



# qstat -u theredia | sort -k 1
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
 734414 0.56000 medium-0   theredia     r     09/03/2009 15:04:04 test-medium-med@compute-0-3.lo     1
 734415 0.55500 medium-1   theredia     r     09/03/2009 15:04:04 test-medium-med@compute-0-4.lo     1
 734416 0.55333 medium-2   theredia     r     09/03/2009 15:04:04 test-medium-med@compute-0-4.lo     1
 734417 0.55250 medium-3   theredia     r     09/03/2009 15:04:04 test-medium-med@compute-0-3.lo     1
////////////////////////////////////////////////////////////////////////
//  first scheduling pass: the scheduler will run again 30 sec later  //
////////////////////////////////////////////////////////////////////////
 734418 0.55200 medium-4   theredia     qw    09/03/2009 15:03:54                                    1
 734419 0.55167 medium-5   theredia     qw    09/03/2009 15:03:55                                    1
 734420 0.55143 medium-6   theredia     qw    09/03/2009 15:03:55                                    1
 734421 0.55125 medium-7   theredia     qw    09/03/2009 15:03:55                                    1
 734422 0.55111 medium-8   theredia     qw    09/03/2009 15:03:55                                    1
 734423 0.55100 medium-9   theredia     qw    09/03/2009 15:03:55                                    1
 734424 0.55091 medium-10  theredia     qw    09/03/2009 15:03:55                                    1
 734425 0.55083 medium-11  theredia     qw    09/03/2009 15:03:55                                    1
 734426 0.55077 medium-12  theredia     qw    09/03/2009 15:03:55                                    1
 734427 0.55071 medium-13  theredia     qw    09/03/2009 15:03:55                                    1
 734428 0.55067 medium-14  theredia     qw    09/03/2009 15:03:56                                    1
 734429 0.55063 medium-15  theredia     qw    09/03/2009 15:03:56                                    1
 734430 0.55059 medium-16  theredia     qw    09/03/2009 15:03:56                                    1
 734431 0.55056 medium-17  theredia     qw    09/03/2009 15:03:56                                    1
 734432 0.55053 medium-18  theredia     qw    09/03/2009 15:03:56                                    1
 734433 0.55050 medium-19  theredia     qw    09/03/2009 15:03:56                                    1



# qstat -u theredia | sort -k 1
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
 734414 0.56000 medium-0   theredia     r     09/03/2009 15:04:04 test-medium-med@compute-0-3.lo     1
 734415 0.56000 medium-1   theredia     r     09/03/2009 15:04:04 test-medium-med@compute-0-4.lo     1
 734416 0.56000 medium-2   theredia     r     09/03/2009 15:04:04 test-medium-med@compute-0-4.lo     1
 734417 0.56000 medium-3   theredia     r     09/03/2009 15:04:04 test-medium-med@compute-0-3.lo     1
////////////////////////////////////////////
//  the scheduler ran again 30 sec later  //
////////////////////////////////////////////
 734418 0.55200 medium-4   theredia     r     09/03/2009 15:04:34 test-medium-med@compute-0-4.lo     1
 734419 0.55167 medium-5   theredia     r     09/03/2009 15:04:34 test-medium-med@compute-0-3.lo     1
 734420 0.55143 medium-6   theredia     r     09/03/2009 15:04:34 test-medium-med@compute-0-3.lo     1
 734421 0.55125 medium-7   theredia     r     09/03/2009 15:04:34 test-medium-med@compute-0-4.lo     1
 734422 0.55111 medium-8   theredia     r     09/03/2009 15:04:34 test-medium-med@compute-0-4.lo     1
 734423 0.55100 medium-9   theredia     r     09/03/2009 15:04:34 test-medium-med@compute-0-3.lo     1
 734424 0.55091 medium-10  theredia     r     09/03/2009 15:04:34 test-medium-med@compute-0-3.lo     1
 734425 0.55083 medium-11  theredia     r     09/03/2009 15:04:34 test-medium-med@compute-0-4.lo     1
 734426 0.55077 medium-12  theredia     r     09/03/2009 15:04:34 test-medium-med@compute-0-4.lo     1
 734427 0.55071 medium-13  theredia     r     09/03/2009 15:04:34 test-medium-med@compute-0-3.lo     1
 734428 0.55067 medium-14  theredia     r     09/03/2009 15:04:34 test-medium-med@compute-0-3.lo     1
 734429 0.55063 medium-15  theredia     r     09/03/2009 15:04:34 test-medium-med@compute-0-4.lo     1
 734430 0.55059 medium-16  theredia     qw    09/03/2009 15:03:56                                    1
 734431 0.55056 medium-17  theredia     qw    09/03/2009 15:03:56                                    1
 734432 0.55053 medium-18  theredia     qw    09/03/2009 15:03:56                                    1
 734433 0.55050 medium-19  theredia     qw    09/03/2009 15:03:56                                    1





As you can see, 4 jobs (2 per host) ran on the first pass, and another 12 (6 per host) were dispatched 30 seconds later.


I have tested a bit more, and it gets even stranger:

If, instead of using "num_jobs" in the load threshold, I use the "medium_jobs" attribute, then...




# qstat -u theredia | sort -k 1
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
 734434 0.00000 medium-0   theredia     qw    09/03/2009 15:18:17                                    1
 734435 0.00000 medium-1   theredia     qw    09/03/2009 15:18:17                                    1
 734436 0.00000 medium-2   theredia     qw    09/03/2009 15:18:17                                    1
 734437 0.00000 medium-3   theredia     qw    09/03/2009 15:18:17                                    1
 734438 0.00000 medium-4   theredia     qw    09/03/2009 15:18:17                                    1
 734439 0.00000 medium-5   theredia     qw    09/03/2009 15:18:17                                    1
 734440 0.00000 medium-6   theredia     qw    09/03/2009 15:18:17                                    1
 734441 0.00000 medium-7   theredia     qw    09/03/2009 15:18:17                                    1
 734442 0.00000 medium-8   theredia     qw    09/03/2009 15:18:17                                    1
 734443 0.00000 medium-9   theredia     qw    09/03/2009 15:18:18                                    1
 734444 0.00000 medium-10  theredia     qw    09/03/2009 15:18:18                                    1
 734445 0.00000 medium-11  theredia     qw    09/03/2009 15:18:18                                    1
 734446 0.00000 medium-12  theredia     qw    09/03/2009 15:18:18                                    1
 734447 0.00000 medium-13  theredia     qw    09/03/2009 15:18:18                                    1
 734448 0.00000 medium-14  theredia     qw    09/03/2009 15:18:18                                    1
 734449 0.00000 medium-15  theredia     qw    09/03/2009 15:18:18                                    1
 734450 0.00000 medium-16  theredia     qw    09/03/2009 15:18:18                                    1
 734451 0.00000 medium-17  theredia     qw    09/03/2009 15:18:18                                    1
 734452 0.00000 medium-18  theredia     qw    09/03/2009 15:18:19                                    1
 734453 0.00000 medium-19  theredia     qw    09/03/2009 15:18:19                                    1




# qstat -u theredia | sort -k 1
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
 734434 0.56000 medium-0   theredia     r     09/03/2009 15:18:34 test-medium-med@compute-0-3.lo     1
 734435 0.55500 medium-1   theredia     r     09/03/2009 15:18:34 test-medium-med@compute-0-4.lo     1
 734436 0.55333 medium-2   theredia     r     09/03/2009 15:18:34 test-medium-med@compute-0-4.lo     1
 734437 0.55250 medium-3   theredia     r     09/03/2009 15:18:34 test-medium-med@compute-0-3.lo     1
 734438 0.55200 medium-4   theredia     r     09/03/2009 15:18:34 test-medium-med@compute-0-3.lo     1
 734439 0.55167 medium-5   theredia     r     09/03/2009 15:18:34 test-medium-med@compute-0-4.lo     1
 734440 0.55143 medium-6   theredia     r     09/03/2009 15:18:34 test-medium-med@compute-0-4.lo     1
 734441 0.55125 medium-7   theredia     r     09/03/2009 15:18:34 test-medium-med@compute-0-3.lo     1
 734442 0.55111 medium-8   theredia     r     09/03/2009 15:18:34 test-medium-med@compute-0-3.lo     1
 734443 0.55100 medium-9   theredia     r     09/03/2009 15:18:34 test-medium-med@compute-0-4.lo     1
 734444 0.55091 medium-10  theredia     r     09/03/2009 15:18:34 test-medium-med@compute-0-4.lo     1
 734445 0.55083 medium-11  theredia     r     09/03/2009 15:18:34 test-medium-med@compute-0-3.lo     1
 734446 0.55077 medium-12  theredia     r     09/03/2009 15:18:34 test-medium-med@compute-0-3.lo     1
 734447 0.55071 medium-13  theredia     r     09/03/2009 15:18:34 test-medium-med@compute-0-4.lo     1
 734448 0.55067 medium-14  theredia     r     09/03/2009 15:18:34 test-medium-med@compute-0-4.lo     1
 734449 0.55063 medium-15  theredia     r     09/03/2009 15:18:34 test-medium-med@compute-0-3.lo     1
 734450 0.55059 medium-16  theredia     qw    09/03/2009 15:18:18                                    1
 734451 0.55056 medium-17  theredia     qw    09/03/2009 15:18:18                                    1
 734452 0.55053 medium-18  theredia     qw    09/03/2009 15:18:19                                    1
 734453 0.55050 medium-19  theredia     qw    09/03/2009 15:18:19                                    1





...IT WORKS FINE !!!

I've tried with "num_jobs2" and "fast_medium_jobs" and they worked too. I've also tried "num_jobs=1", and that works fine as well, so it seems to be an oddity tied to this particular complex attribute's name, and only when it is compared against 0 (even though the stall happens well before the consumable actually reaches 0??).
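
For the record, these are the threshold variants tried, expressed as qconf -mattr one-liners (an assumption; the original doesn't show how the threshold was switched between tests):

# qconf -mattr queue load_thresholds medium_jobs=0 test-medium-med       (works)
# qconf -mattr queue load_thresholds num_jobs2=0 test-medium-med         (works)
# qconf -mattr queue load_thresholds fast_medium_jobs=0 test-medium-med  (works)
# qconf -mattr queue load_thresholds num_jobs=1 test-medium-med          (works)
# qconf -mattr queue load_thresholds num_jobs=0 test-medium-med          (stalls at 2 jobs/host on the first pass)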



This might seem trivial, but it hit me when I was trying to submit the job not to one queue but to 3 different ones, so that it would run first in this queue (the one in this example) and, if that was full, in the other two (I configured seq_no and so on), and it was driving me mad until I isolated the problem.
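
That setup looked roughly like this (a sketch; the fallback queue names are hypothetical, and ordering queues by seq_no also requires queue_sort_method seqno in the scheduler configuration):

# qconf -ssconf | grep queue_sort_method
queue_sort_method                 seqno
# qsub ... -q test-medium-med,fallback-q1,fallback-q2 ... prova_cues/proces.php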

Now I can work around this easily, either by creating a new consumable resource or by not using a load threshold with a consumable set to 0, since the default scheduling behaviour will not allocate jobs to a host that doesn't have enough resources for the job anyway.
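
Concretely, since each host carries complex_values num_jobs=8 and every job requests -l num_jobs=1, the consumable accounting alone already caps the load at 8 jobs per host, so the threshold can simply be dropped (standard qconf/qstat usage, my sketch):

# qconf -mattr queue load_thresholds NONE test-medium-med
# qstat -F num_jobs      (shows the remaining per-host num_jobs capacity)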
