[GE users] mpich <-> sge --> controlling hosts machinefile

Gerolf Ziegenhain gerolf.ziegenhain at googlemail.com
Thu Jul 5 10:54:12 BST 2007



Hi Andreas,

allocation_rule makes sense. Is n_slots=2 the same as slots=2?

/BR:
   Gerolf

2007/7/5, Andreas.Haas at sun.com <Andreas.Haas at sun.com>:
>
> Hi Gerolf,
>
> you should really use 2 as allocation_rule, since you always want to have
> 2 tasks on a node.
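>
> For illustration (a sketch only, assuming your PE is still called "mpich"),
> that change is made with "qconf -mp mpich" by setting
>
>     allocation_rule   2
>
> so a "-pe mpich 8" job is then forced onto 4 hosts with exactly 2 tasks
> per host.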
>
> To prevent overloaded nodes from being selected, load thresholds can also
> be used, but this requires you to fiddle around with time-dependent things
> like load correction. If this load of 1 comes from other batch load that
> is known to Grid Engine, you should consider using
>
>     complex_values slots=2
>
> in each execution host configuration or, if you run 6.1, you could also
> set up a resource quota such as
>
>     limit hosts {@mpich_hosts} to slots=2
>
> the benefit of deploying the slot consumable instead of a load threshold is
> that your configuration becomes more deterministic.
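>
> For illustration only (the host name and rule set name below are just
> placeholders), the per-host consumable would be set with "qconf -me lc12"
> and a line such as
>
>     complex_values    slots=2
>
> while an equivalent 6.1 resource quota set, added with "qconf -arqs",
> could look roughly like
>
>     {
>        name         mpich_two_per_host
>        description  "at most 2 slots per mpich host"
>        enabled      TRUE
>        limit        hosts {@mpich_hosts} to slots=2
>     }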
>
> Regards,
> Andreas
>
>
> On Wed, 4 Jul 2007, Gerolf Ziegenhain wrote:
>
> > Hi Chris,
> >
> > The result of this experiment is predictable: if I remove the n_slots
> > threshold, SGE will also start single slots on nodes which already have
> > a load of 1. This means it can only start one process there; however, if
> > the slot allocation is fixed at 2, nothing will happen: the behaviour is
> > as described before.
> >
> > /BR: Gerolf
> >
> > 2007/7/4, Chris Dagdigian <dag at sonsorol.org>:
> >>
> >>
> >> Hi Gerolf,
> >>
> >> I run MPI apps on 2-way compute nodes all the time without problems
> >> and I've always just let "slots 2" stay in the queue configuration.
> >> The SGE scheduler always did the right thing. Can you remove your
> >> n_slots load thresholds and queue consumables and see what happens?
> >> I'm still curious as to why you had three tasks land on one node.
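> >>
> >> As a sketch of what I mean (not the full config, adjust as needed), that
> >> would amount to editing the queue with "qconf -mq q_mpich" so the
> >> relevant lines read roughly
> >>
> >>     load_thresholds   np_load_avg=1,np_load_short=1
> >>     complex_values    synchron=0,virtual_free=3G
> >>     slots             2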
> >>
> >>
> >>
> >>
> >>
> >> On Jul 4, 2007, at 3:35 PM, Gerolf Ziegenhain wrote:
> >>
> >> > Thanks for the very quick reply ;)
> >> >
> >> > allocation_rule = $round_robin results in 1 job/node. This increases
> >> > the communication effort. So maybe allocation_rule=2 would be the
> >> > best choice in my case?
> >> >
> >> > This is the configuration of the queue:
> >> > qconf -sq q_mpich
> >> > qname                 q_mpich
> >> > hostlist              lc10 lc11 lc12 lc13 lc14 lc15 lc18 lc19
> >> > seq_no                21,[@b_hosts=22],[@x_hosts=23]
> >> > load_thresholds       np_load_avg=1,np_load_short=1,n_slots=2, \
> >> >                       [@b_hosts=np_load_avg=1,np_load_short=1,n_slots=2], \
> >> >                       [@x_hosts=np_load_avg=1,np_load_short=1,n_slots=2]
> >> > suspend_thresholds    NONE
> >> > nsuspend              1
> >> > suspend_interval      00:05:00
> >> > priority              0
> >> > min_cpu_interval      00:05:00
> >> > processors            UNDEFINED
> >> > qtype                 BATCH
> >> > ckpt_list             NONE
> >> > pe_list               mpich
> >> > rerun                 TRUE
> >> > slots                 2
> >> > tmpdir                /tmp
> >> > shell                 /bin/bash
> >> > prolog                NONE
> >> > epilog                NONE
> >> > shell_start_mode      unix_behavior
> >> > starter_method        NONE
> >> > suspend_method        NONE
> >> > resume_method         NONE
> >> > terminate_method      NONE
> >> > notify                00:00:60
> >> > owner_list            NONE
> >> > user_lists            ziegen,[@x_hosts=big]
> >> > xuser_lists           matlab matlab1 thor
> >> > subordinate_list      NONE
> >> > complex_values        synchron=0,virtual_free=3G,n_slots=2, \
> >> >                       [@b_hosts=synchron=0,virtual_free=5G,n_slots=2], \
> >> >                       [@x_hosts=synchron=0,virtual_free=17G,n_slots=2]
> >> > projects              NONE
> >> > xprojects             NONE
> >> > calendar              NONE
> >> > initial_state         default
> >> > s_rt                  INFINITY
> >> > h_rt                  INFINITY
> >> > s_cpu                 INFINITY
> >> > h_cpu                 100:00:00
> >> > s_fsize               INFINITY
> >> > h_fsize               INFINITY
> >> > s_data                INFINITY
> >> > h_data                2G,[@b_hosts=4G],[@x_hosts=16G]
> >> > s_stack               INFINITY
> >> > h_stack               INFINITY
> >> > s_core                INFINITY
> >> > h_core                INFINITY
> >> > s_rss                 INFINITY
> >> > h_rss                 INFINITY
> >> > s_vmem                INFINITY
> >> > h_vmem                3G,[@b_hosts=5G],[@x_hosts=17G]
> >> >
> >> >
> >> > /BR:
> >> >    Gerolf
> >> >
> >> >
> >> > 2007/7/4, Chris Dagdigian <dag at sonsorol.org>:
> >> > Not sure if this totally answers your question but you can play with
> >> > the host selection process by adjusting your $allocation_rule in your
> >> > parallel environment configuration.
> >> >
> >> > For instance, you have $fill_up configured which is why your parallel
> >> > slots are being packed on as few nodes as possible. Changing to
> >> > $round_robin will spread it out among as many machines as possible.
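> >> >
> >> > As a sketch (using the PE from your mail), the line to look at when
> >> > running "qconf -mp mpich" is allocation_rule, which can be set to
> >> >
> >> >     allocation_rule   $fill_up        (pack onto as few hosts as possible)
> >> >     allocation_rule   $round_robin    (spread across as many hosts as possible)
> >> >     allocation_rule   2               (take exactly 2 slots per host)
> >> >
> >> > keeping only one of those variants, of course.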
> >> >
> >> > For your main symptom:
> >> >
> >> > If your parallel jobs are running more than 2 tasks per node then
> >> > something may be off with your slot count - perhaps SGE is detecting
> >> > multi-core CPUs on your 2-way boxes and setting slots=4 on each node.
> >> > Posting the config of the queue "mpich-queue" may help get to the
> >> > bottom of this as I'm not sure about the n_slots "limit" you are
> >> > referring to.
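> >> >
> >> > A quick way to check (command sketch; substitute your real queue and
> >> > host names) is something like
> >> >
> >> >     qhost -h lc19
> >> >     qconf -sq mpich-queue | grep slots
> >> >
> >> > and then compare the NCPU column against the slots value of the queue.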
> >> >
> >> >
> >> > Regards,
> >> > Chris
> >> >
> >> >
> >> >
> >> > On Jul 4, 2007, at 3:14 PM, Gerolf Ziegenhain wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > Maybe it is a very stupid question, but: how do I control the
> >> > > number of jobs per node? Consider the following hardware: 38 nodes
> >> > > with two processors each. When I start a job with -pe mpich 8,
> >> > > there should be 4 nodes used with 2 jobs on each. What do I have to
> >> > > do in order to achieve this?
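> >> > >
> >> > > For reference (the script name is just a placeholder), such a job is
> >> > > submitted along the lines of
> >> > >
> >> > >     qsub -pe mpich 8 mpich_job.sh
> >> > >
> >> > > and the expectation is 4 hosts running 2 MPI tasks each.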
> >> > >
> >> > > My parallel environment is configured like this:
> >> > > qconf -sp mpich
> >> > > pe_name           mpich
> >> > > slots             60
> >> > > user_lists        NONE
> >> > > xuser_lists       NONE
> >> > > start_proc_args   /opt/N1GE/mpi/startmpi.sh -catch_rsh $pe_hostfile
> >> > > stop_proc_args    /opt/N1GE/mpi/stopmpi.sh
> >> > > allocation_rule   $fill_up
> >> > > control_slaves    TRUE
> >> > > job_is_first_task FALSE
> >> > > urgency_slots     min
> >> > >
> >> > > My mpich-queue has these load thresholds:
> >> > > np_load_avg=1
> >> > > np_load_short=1
> >> > > n_slots=2
> >> > >
> >> > > However if I start a job, something like this will happen in the
> >> > > PI1234-file:
> >> > > lc12.rhrk.uni-kl.de 0 prog
> >> > > lc19 1 prog
> >> > > lc19 1 prog
> >> > > lc19 1 prog
> >> > > lc14 1 prog
> >> > > lc14 1 prog
> >> > > lc13 1 prog
> >> > > lc13 1 prog
> >> > >
> >> > > So in particular there are three jobs on lc19, which has only two
> >> > > CPUs. One of these three jobs would be better off running on lc12.
> >> > > How can I fix this?
> >> > >
> >> > >
> >> > > Thanks in advance:
> >> > >    Gerolf
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > Dipl. Phys. Gerolf Ziegenhain
> >> > > Office: Room 46-332 - Erwin-Schrödinger-Str.46 - TU Kaiserslautern
> >> > > - Germany
> >> > > Web: gerolf.ziegenhain.com
> >> > >
> >> > >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > --
> >> > Dipl. Phys. Gerolf Ziegenhain
> >> > Office: Room 46-332 - Erwin-Schrödinger-Str.46 - TU Kaiserslautern
> >> > - Germany
> >> > Web: gerolf.ziegenhain.com
> >> >
> >>
> >>
> >>
> >
> >
> > --
> > Dipl. Phys. Gerolf Ziegenhain
> > Office: Room 46-332 - Erwin-Schrödinger-Str.46 - TU Kaiserslautern -
> Germany
> > Web: gerolf.ziegenhain.com
> >
>
> http://gridengine.info/
>
>
>
>



-- 
Dipl. Phys. Gerolf Ziegenhain
Office: Room 46-332 - Erwin-Schrödinger-Str.46 - TU Kaiserslautern - Germany
Web: gerolf.ziegenhain.com


