[GE users] Sort by sequence number question

Andreas.Haas at Sun.COM
Thu Jul 12 13:59:45 BST 2007



Hi Erik,

I'm sorry, but I checked the 5.3 source code and I cannot find any difference 
in how queue slots are selected with $fill_up as the allocation rule. 
In both cases the scheduler tries to get as many slots as possible from the
master host's queues and then subsequently from other hosts.
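The selection order described above can be sketched roughly as follows (an illustrative model only, not the actual scheduler code; the host names and slot counts are hypothetical):

```python
def fill_up(free_slots, master_host, needed):
    """Greedy $fill_up-style allocation: take as many slots as possible
    from the master host first, then from the remaining hosts in order."""
    allocation = {}
    order = [master_host] + [h for h in free_slots if h != master_host]
    for host in order:
        if needed == 0:
            break
        take = min(free_slots[host], needed)
        if take > 0:
            allocation[host] = take
            needed -= take
    return allocation

# A 5-slot job on a master host with 5 free slots lands entirely there,
# i.e. 1 MASTER task plus 4 SLAVE tasks on the same host.
print(fill_up({"ts101-1-0": 5, "ts101-1-1": 4}, "ts101-1-0", 5))
```

Under this greedy model, spreading work away from the master host only happens once the master host's slots are exhausted, regardless of sequence numbers.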

In 5.3, queue slot tagging and aggregation occur in the function 
sge_replicate_queues_suitable4job():

    http://gridengine.sunsource.net/source/browse/*checkout*/gridengine/source/libs/sched/sge_select_queue.c?rev=1.9.2.16

In 6.0 you find the queue slot tagging code in parallel_tag_queues_suitable4job() and the 
slot aggregation in parallel_make_granted_destination_id_list():

    http://gridengine.sunsource.net/source/browse/*checkout*/gridengine/source/libs/sched/sge_select_queue.c?rev=1.130.2.19
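For reference, the allocation rule being discussed is a field of the parallel environment configuration. A minimal sketch, using a hypothetical PE name and only the standard sge_pe fields (values here are illustrative, not Erik's actual config):

```
$ qconf -sp powerflow_ts101_pe
pe_name            powerflow_ts101_pe
slots              999
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  FALSE
```

With $fill_up the scheduler packs as many slots as possible onto each selected host before moving on; with $round_robin it assigns one slot per host per pass; $pe_slots forces all slots onto a single host.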

Regards,
Andreas

On Thu, 12 Jul 2007, Erik Lönroth wrote:

> Hmmm...
>
> The only way I can get SGE not to schedule jobs on the same node as the
> MASTER process is to use "$round_robin" for my PE, as Reuti suggested. I
> really can't see the logic in this, though.
>
> Regardless of how I set sequence number for my master nodes, SGE will
> ALWAYS assign 1 MASTER + 4 SLAVES onto the selected MASTER nodes (if I
> use $fill_up).
>
> Worse, my application does not run optimally when the parallel job is
> split round-robin, so the only solution for me is to explicitly remove
> the PE from all master nodes, thereby losing available resources.
>
> /Erik
>
> On ons, 2007-07-11 at 11:47 -0700, Daniel Templeton wrote:
>> Erik,
>>
>> You may also want to read this post from Stephan's blog:
>>
>> http://blog.sun.com/sgrell/entry/n1ge_6_scheduler_hacks_seperated
>>
>> Daniel
>>
>> Lönroth Erik wrote:
>>> I am.
>>>
>>> I'm submitting this job to the queue.
>>> bash-3.00$ cat slot-allocation.job
>>> #!/bin/bash
>>> #$ -S /bin/bash
>>> #$ -N slot-allocation
>>> #$ -cwd
>>> #$ -o output.$JOB_ID
>>> #$ -e errors.$JOB_ID
>>>
>>> #$ -pe powerflow_*_pe  5
>>>
>>> #$ -masterq master.*.q
>>> echo "Starting on: ${HOSTNAME}"
>>> echo "$PE_HOSTFILE contains:"
>>> cat $PE_HOSTFILE
>>> sleep 30
>>>
>>>
>>>
>>>     215 0.55500 slot-alloc sssler       r     07/11/2007 17:19:24 master.101.q at ts101-1-0.sss.se. MASTER
>>>     215 0.55500 slot-alloc sssler       r     07/11/2007 17:19:24 short.101.q at ts101-1-0.sss.se.s SLAVE
>>>                                                                   short.101.q at ts101-1-0.sss.se.s SLAVE
>>>                                                                   short.101.q at ts101-1-0.sss.se.s SLAVE
>>>                                                                   short.101.q at ts101-1-0.sss.se.s SLAVE
>>>
>>>
>>> I also sometimes see that it allocates a slot on a "MASTER" node even if slots are available on other machines, taking only 3 of 4 slots on one host and putting 1 slot on a completely different host.
>>>
>>>
>>> ... like here, for example, where ts103-3-13 gets 3 slots filled whereas it has 4 to offer. I would expect all 4 slots to be taken before master.103.q at ts103-3-0 would be considered at all, since it has a higher sequence number. That doesn't seem to happen... *cry*
>>>
>>>  ( For those who have followed this thread: ts101 is a smaller test cluster we use for testing out queues, and ts103+ts102 are partitions of a larger SGE_CELL )
>>>
>>>
>>>
>>>     977 0.55500 slot-alloc sssler       r     07/11/2007 18:04:54 short.103.q at ts103-3-12.sss.se. SLAVE
>>>                                                                   short.103.q at ts103-3-12.sss.se. SLAVE
>>>                                                                   short.103.q at ts103-3-12.sss.se. SLAVE
>>>                                                                   short.103.q at ts103-3-12.sss.se. SLAVE
>>>     977 0.55500 slot-alloc sssler       r     07/11/2007 18:04:54 short.103.q at ts103-3-13.sss.se. SLAVE
>>>                                                                   short.103.q at ts103-3-13.sss.se. SLAVE
>>>                                                                   short.103.q at ts103-3-13.sss.se. SLAVE
>>>     977 0.55500 slot-alloc sssler       r     07/11/2007 18:04:54 short.103.q at ts103-3-14.sss.se. SLAVE
>>>                                                                   short.103.q at ts103-3-14.sss.se. SLAVE
>>>                                                                   short.103.q at ts103-3-14.sss.se. SLAVE
>>>                                                                   short.103.q at ts103-3-14.sss.se. SLAVE
>>>     977 0.55500 slot-alloc sssler       r     07/11/2007 18:04:54 short.103.q at ts103-3-15.sss.se. SLAVE
>>>     977 0.55500 slot-alloc sssler       r     07/11/2007 18:04:54 master.103.q at ts103-3-0.sss.se. MASTER
>>>     977 0.55500 slot-alloc sssler       r     07/11/2007 18:04:54 master.103.q at ts103-3-1.sss.se. SLAVE
>>>
>>> I'm in pain. Arhhhh!
>>>
>>> /Erik
>>>
>>> -----Original Message-----
>>> From: Ravi Chandra Nallan [mailto:Ravichandra.Nallan at Sun.COM]
>>> Sent: Wed 7/11/2007 5:05 PM
>>> To: users at gridengine.sunsource.net
>>> Subject: Re: [GE users] Sort by sequence number question
>>>
>>> Can you try with some simple batch/array jobs
>>> eg. qsub -t 1-5 examples/jobs/sleeper.sh 10000
>>> and see which one gets filled first!
>>> regards,
>>> -Ravi
>>>
>>> Lönroth Erik wrote:
>>>
>>>>> Didn't seem to work.
>>>>>  qconf -sconf
>>>>>  qconf -ssconf
>>>>>  qconf -sq \*
>>>>>  qconf -se global
>>>>>
>>>>> Might be a better option.
>>>>> /mark
>>>>>
>>>>>
>>>> Here it goes:
>>>>
>>>> bash-3.00$ qconf -sconf
>>>> global:
>>>> execd_spool_dir              /opt/gridengine/narcissus/spool
>>>> mailer                       /opt/gridengine/scania/utils/mailing/mailer1.sh
>>>> xterm                        /usr/bin/X11/xterm
>>>> load_sensor                  /opt/gridengine/scania/utils/licensecheck.sh
>>>> prolog                       none
>>>> epilog                       none
>>>> shell_start_mode             posix_compliant
>>>> login_shells                 sh,ksh,csh,tcsh
>>>> min_uid                      0
>>>> min_gid                      0
>>>> user_lists                   none
>>>> xuser_lists                  none
>>>> projects                     none
>>>> xprojects                    none
>>>> enforce_project              false
>>>> enforce_user                 auto
>>>> load_report_time             00:00:40
>>>> max_unheard                  00:05:00
>>>> reschedule_unknown           00:00:00
>>>> loglevel                     log_warning
>>>> administrator_mail           erik.lonroth at scania.com
>>>> set_token_cmd                none
>>>> pag_cmd                      none
>>>> token_extend_time            none
>>>> shepherd_cmd                 none
>>>> qmaster_params               none
>>>> execd_params                 none
>>>> reporting_params             accounting=true reporting=false \
>>>>                              flush_time=00:00:15 joblog=false sharelog=00:00:00
>>>> finished_jobs                100
>>>> gid_range                    20000-20500
>>>> qlogin_command               /opt/gridengine/scania/utils/qlogin/qlogin.sh
>>>> qlogin_daemon                /usr/sbin/sshd -i
>>>> rlogin_daemon                /usr/sbin/sshd -i
>>>> max_aj_instances             0
>>>> max_aj_tasks                 0
>>>> max_u_jobs                   0
>>>> max_jobs                     0
>>>> auto_user_oticket            0
>>>> auto_user_fshare             0
>>>> auto_user_default_project    none
>>>> auto_user_delete_time        86400
>>>> delegated_file_staging       false
>>>> rsh_daemon                   /usr/sbin/sshd -i
>>>> rsh_command                  /usr/bin/ssh
>>>> rlogin_command               /usr/bin/ssh
>>>> reprioritize                 0
>>>>
>>>>
>>>>
>>>>
>>>> bash-3.00$   qconf -ssconf
>>>> algorithm                         default
>>>> schedule_interval                 0:0:15
>>>> maxujobs                          0
>>>> queue_sort_method                 seqno
>>>> job_load_adjustments              np_load_avg=0.50
>>>> load_adjustment_decay_time        0:7:30
>>>> load_formula                      np_load_avg
>>>> schedd_job_info                   true
>>>> flush_submit_sec                  0
>>>> flush_finish_sec                  0
>>>> params                            none
>>>> reprioritize_interval             0:0:0
>>>> halftime                          168
>>>> usage_weight_list                 cpu=1.000000,mem=0.000000,io=0.000000
>>>> compensation_factor               5.000000
>>>> weight_user                       0.250000
>>>> weight_project                    0.250000
>>>> weight_department                 0.250000
>>>> weight_job                        0.250000
>>>> weight_tickets_functional         0
>>>> weight_tickets_share              0
>>>> share_override_tickets            TRUE
>>>> share_functional_shares           TRUE
>>>> max_functional_jobs_to_schedule   200
>>>> report_pjob_tickets               TRUE
>>>> max_pending_tasks_per_job         50
>>>> halflife_decay_list               none
>>>> policy_hierarchy                  OFS
>>>> weight_ticket                     0.010000
>>>> weight_waiting_time               0.000000
>>>> weight_deadline                   3600000.000000
>>>> weight_urgency                    0.100000
>>>> weight_priority                   1.000000
>>>> max_reservation                   0
>>>> default_duration                  0:10:0
>>>>
>>>>
>>>>
>>>> bash-3.00$   qconf -sq \*
>>>> qname                 master.101.q
>>>> hostlist              ts101-1-0.sss.se.scania.com
>>>> seq_no                0
>>>> load_thresholds       np_load_avg=1.75
>>>> suspend_thresholds    NONE
>>>> nsuspend              1
>>>> suspend_interval      00:05:00
>>>> priority              0
>>>> min_cpu_interval      00:05:00
>>>> processors            UNDEFINED
>>>> qtype                 BATCH INTERACTIVE
>>>> ckpt_list             NONE
>>>> pe_list               dummy_ts101_pe fire_101_pe fluent_ts101_pe make \
>>>>                       mpich_ts101_pe powerflow_ts101_pe
>>>> rerun                 FALSE
>>>> slots                 1
>>>> tmpdir                /tmp
>>>> shell                 /bin/csh
>>>> prolog                NONE
>>>> epilog                NONE
>>>> shell_start_mode      posix_compliant
>>>> starter_method        NONE
>>>> suspend_method        NONE
>>>> resume_method         NONE
>>>> terminate_method      NONE
>>>> notify                00:00:60
>>>> owner_list            NONE
>>>> user_lists            NONE
>>>> xuser_lists           NONE
>>>> subordinate_list      NONE
>>>> complex_values        NONE
>>>> projects              NONE
>>>> xprojects             NONE
>>>> calendar              NONE
>>>> initial_state         default
>>>> s_rt                  INFINITY
>>>> h_rt                  INFINITY
>>>> s_cpu                 INFINITY
>>>> h_cpu                 INFINITY
>>>> s_fsize               INFINITY
>>>> h_fsize               INFINITY
>>>> s_data                INFINITY
>>>> h_data                INFINITY
>>>> s_stack               INFINITY
>>>> h_stack               INFINITY
>>>> s_core                INFINITY
>>>> h_core                INFINITY
>>>> s_rss                 INFINITY
>>>> h_rss                 INFINITY
>>>> s_vmem                INFINITY
>>>> h_vmem                INFINITY
>>>>
>>>>
>>>>
>>>> qname                 short.101.q
>>>> hostlist              @ts101_X_hg
>>>> seq_no                101,[ts101-1-0.sss.se.scania.com=9999]
>>>> load_thresholds       np_load_avg=1.75
>>>> suspend_thresholds    NONE
>>>> nsuspend              1
>>>> suspend_interval      00:05:00
>>>> priority              0
>>>> min_cpu_interval      00:05:00
>>>> processors            UNDEFINED
>>>> qtype                 BATCH INTERACTIVE
>>>> ckpt_list             NONE
>>>> pe_list               dummy_ts101_pe fire_101_pe fluent_ts101_pe make \
>>>>                       mpich_ts101_pe powerflow_ts101_pe
>>>> rerun                 FALSE
>>>> slots                 4
>>>> tmpdir                /tmp
>>>> shell                 /bin/csh
>>>> prolog                NONE
>>>> epilog                NONE
>>>> shell_start_mode      posix_compliant
>>>> starter_method        NONE
>>>> suspend_method        NONE
>>>> resume_method         NONE
>>>> terminate_method      NONE
>>>> notify                00:00:60
>>>> owner_list            NONE
>>>> user_lists            NONE
>>>> xuser_lists           NONE
>>>> subordinate_list      NONE
>>>> complex_values        NONE
>>>> projects              NONE
>>>> xprojects             NONE
>>>> calendar              NONE
>>>> initial_state         default
>>>> s_rt                  INFINITY
>>>> h_rt                  INFINITY
>>>> s_cpu                 INFINITY
>>>> h_cpu                 INFINITY
>>>> s_fsize               INFINITY
>>>> h_fsize               INFINITY
>>>> s_data                INFINITY
>>>> h_data                INFINITY
>>>> s_stack               INFINITY
>>>> h_stack               INFINITY
>>>> s_core                INFINITY
>>>> h_core                INFINITY
>>>> s_rss                 INFINITY
>>>> h_rss                 INFINITY
>>>> s_vmem                INFINITY
>>>> h_vmem                INFINITY
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> bash-3.00$ qconf -se global
>>>> hostname              global
>>>> load_scaling          NONE
>>>> complex_values        fluent_all=10,fluent_par=48,gtpowerx=7,dyna=18
>>>> load_values           dyna=2,fluent_all=8,fluent_par=41,gtpowerx=2
>>>> processors            0
>>>> user_lists            NONE
>>>> xuser_lists           NONE
>>>> projects              NONE
>>>> xprojects             NONE
>>>> usage_scaling         NONE
>>>> report_variables      NONE
>>>>
>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>>
>>>>
>>>
>>
>>
>
>
>

http://gridengine.info/

Registered office: Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
Munich District Court (Amtsgericht Muenchen): HRB 161028
Managing Directors: Marcel Schneider, Wolfgang Engels, Dr. Roland Boemer
Chairman of the Supervisory Board: Martin Haering






More information about the gridengine-users mailing list