[GE users] mpich <-> sge --> controlling hosts machinefile

Andreas.Haas at Sun.COM
Thu Jul 5 10:49:44 BST 2007


Hi Gerolf,

you should really use 2 as the allocation_rule, since you always want
to have exactly 2 tasks per node.
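
For example (a minimal sketch; "mpich" is the PE name from your
earlier posting):

    qconf -mp mpich

and then change the allocation_rule line to

    allocation_rule   2

With that in place an 8-slot job can only be dispatched as 2 tasks
on each of 4 hosts.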

Load thresholds can also be used to prevent overloaded nodes from
being selected, but this requires you to fiddle around with
time-dependent things like load adjustment. If that load of 1 comes
from other batch load that is known to Grid Engine, you should
consider using

    complex_values slots=2

in each execution host configuration or, if you run 6.1, you could
also set up a resource quota such as

    limit hosts {@mpich_hosts} to slots=2

The benefit of deploying the slots consumable instead of a load
threshold is that your configuration becomes more deterministic.
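
Roughly like this (untested sketches; "lc19" stands for any of your
execution hosts, and @mpich_hosts is assumed to be a host group you
have created yourself, e.g. with "qconf -ahgrp @mpich_hosts"):

    # slot consumable, set per execution host
    qconf -me lc19
        complex_values    slots=2

    # or, with 6.1, a complete resource quota set
    qconf -arqs
    {
       name         mpich_slots
       description  never more than 2 slots per mpich host
       enabled      TRUE
       limit        hosts {@mpich_hosts} to slots=2
    }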

Regards,
Andreas


On Wed, 4 Jul 2007, Gerolf Ziegenhain wrote:

> Hi Chris,
>
> The result of this experiment is predictable: if I remove the n_slots
> threshold, SGE will also start single slots on nodes which already have
> a load of 1, which means it can only start 1 process there; if, however,
> the slot allocation is fixed at "2", nothing happens at all: the
> behaviour described before.
>
> /BR: Gerolf
>
> 2007/7/4, Chris Dagdigian <dag at sonsorol.org>:
>> 
>> 
>> Hi Gerolf,
>> 
>> I run MPI apps on 2-way compute nodes all the time without problems
>> and I've always just let "slots 2" stay in the queue configuration.
>> The SGE scheduler always did the right thing. Can you remove your
>> n_slots load thresholds and queue consumables and see what happens?
>> I'm still curious as to why you had three tasks land on one node.
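>> 
>> (A minimal untested sketch of that step: "qconf -mq q_mpich", then
>> set the load_thresholds line to NONE and take n_slots out of the
>> complex_values line.)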
>> 
>> 
>> 
>> 
>> 
>> On Jul 4, 2007, at 3:35 PM, Gerolf Ziegenhain wrote:
>> 
>> > Thanks for the very quick reply ;)
>> >
>> > allocation_rule = $round_robin results in 1 job/node, which
>> > increases the communication effort. So maybe allocation_rule=2
>> > would be the best choice in my case?
>> >
>> > This is the configuration of the queue:
>> > qconf -sq q_mpich
>> > qname                 q_mpich
>> > hostlist              lc10 lc11 lc12 lc13 lc14 lc15 lc18 lc19
>> > seq_no                21,[@b_hosts=22],[@x_hosts=23]
>> > load_thresholds       np_load_avg=1,np_load_short=1,n_slots=2, \
>> >                       [@b_hosts=np_load_avg=1,np_load_short=1,n_slots=2], \
>> >                       [@x_hosts=np_load_avg=1,np_load_short=1,n_slots=2]
>> > suspend_thresholds    NONE
>> > nsuspend              1
>> > suspend_interval      00:05:00
>> > priority              0
>> > min_cpu_interval      00:05:00
>> > processors            UNDEFINED
>> > qtype                 BATCH
>> > ckpt_list             NONE
>> > pe_list               mpich
>> > rerun                 TRUE
>> > slots                 2
>> > tmpdir                /tmp
>> > shell                 /bin/bash
>> > prolog                NONE
>> > epilog                NONE
>> > shell_start_mode      unix_behavior
>> > starter_method        NONE
>> > suspend_method        NONE
>> > resume_method         NONE
>> > terminate_method      NONE
>> > notify                00:00:60
>> > owner_list            NONE
>> > user_lists            ziegen,[@x_hosts=big]
>> > xuser_lists           matlab matlab1 thor
>> > subordinate_list      NONE
>> > complex_values        synchron=0,virtual_free=3G,n_slots=2, \
>> >                       [@b_hosts=synchron=0,virtual_free=5G,n_slots=2], \
>> >                       [@x_hosts=synchron=0,virtual_free=17G,n_slots=2]
>> > projects              NONE
>> > xprojects             NONE
>> > calendar              NONE
>> > initial_state         default
>> > s_rt                  INFINITY
>> > h_rt                  INFINITY
>> > s_cpu                 INFINITY
>> > h_cpu                 100:00:00
>> > s_fsize               INFINITY
>> > h_fsize               INFINITY
>> > s_data                INFINITY
>> > h_data                2G,[@b_hosts=4G],[@x_hosts=16G]
>> > s_stack               INFINITY
>> > h_stack               INFINITY
>> > s_core                INFINITY
>> > h_core                INFINITY
>> > s_rss                 INFINITY
>> > h_rss                 INFINITY
>> > s_vmem                INFINITY
>> > h_vmem                3G,[@b_hosts=5G],[@x_hosts=17G]
>> >
>> >
>> > /BR:
>> >    Gerolf
>> >
>> >
>> > 2007/7/4, Chris Dagdigian <dag at sonsorol.org>:
>> > Not sure if this totally answers your question but you can play with
>> > the host selection process by adjusting your $allocation_rule in your
>> > parallel environment configuration.
>> >
>> > For instance, you have $fill_up configured, which is why your
>> > parallel slots are being packed onto as few nodes as possible.
>> > Changing to $round_robin will spread them across as many machines
>> > as possible.
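>> >
>> > For example (an untested sketch, assuming your PE is the one
>> > called "mpich"):
>> >
>> >     qconf -mp mpich
>> >
>> > and then set "allocation_rule $round_robin", or a fixed integer
>> > such as "allocation_rule 2" if you want exactly two tasks per host.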
>> >
>> > For your main symptom:
>> >
>> > If your parallel jobs are running more than 2 tasks per node then
>> > something may be off with your slot count - perhaps SGE is detecting
>> > multi-core CPUs on your 2-way boxes and setting slots=4 on each node.
>> > Posting the config of the queue "mpich-queue" may help get to the
>> > bottom of this, as I'm not sure about the n_slots "limit" you are
>> > referring to.
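>> >
>> > ("qconf -sq <your-queue-name>" prints the full queue configuration,
>> > including the slots line.)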
>> >
>> >
>> > Regards,
>> > Chris
>> >
>> >
>> >
>> > On Jul 4, 2007, at 3:14 PM, Gerolf Ziegenhain wrote:
>> >
>> > > Hi,
>> > >
>> > > Maybe it is a very stupid question, but: How do I control the
>> > > number of jobs per node? Consider the following hardware: 38 nodes
>> > > with two processors on each. When I start a job with -pe mpich 8
>> > > there should be 4 nodes used with 2 jobs on each. What do I have to
>> > > do in order to achieve this?
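>> > >
>> > > (Concretely, I submit with something like "qsub -pe mpich 8
>> > > myjob.sh" - "myjob.sh" is just a placeholder name here.)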
>> > >
>> > > My parallel environment is configured like this:
>> > > qconf -sp mpich
>> > > pe_name           mpich
>> > > slots             60
>> > > user_lists        NONE
>> > > xuser_lists       NONE
>> > > start_proc_args   /opt/N1GE/mpi/startmpi.sh -catch_rsh $pe_hostfile
>> > > stop_proc_args    /opt/N1GE/mpi/stopmpi.sh
>> > > allocation_rule   $fill_up
>> > > control_slaves    TRUE
>> > > job_is_first_task FALSE
>> > > urgency_slots     min
>> > >
>> > > My mpich-queue has these limits:
>> > > np_load_avg=1
>> > > np_load_short=1
>> > > n_slots=2
>> > >
>> > > However, if I start a job, something like this ends up in the
>> > > PI1234 file:
>> > > lc12.rhrk.uni-kl.de 0 prog
>> > > lc19 1 prog
>> > > lc19 1 prog
>> > > lc19 1 prog
>> > > lc14 1 prog
>> > > lc14 1 prog
>> > > lc13 1 prog
>> > > lc13 1 prog
>> > >
>> > > So in particular there are three jobs on lc19, which has only two
>> > > CPUs. One of these three jobs should rather be running on lc12.
>> > > How can I fix this?
>> > >
>> > >
>> > > Thanks in advance:
>> > >    Gerolf
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> >
>> >
>> 
>> 
>> 
>
>
> -- 
> Dipl. Phys. Gerolf Ziegenhain
> Office: Room 46-332 - Erwin-Schrödinger-Str.46 - TU Kaiserslautern - Germany
> Web: gerolf.ziegenhain.com
>

http://gridengine.info/



