[GE users] mpich <-> sge --> controlling hosts machinefile

Ravi Chandra Nallan Ravichandra.Nallan at Sun.COM
Thu Jul 5 08:08:47 BST 2007



Hi Gerolf,
You must have created the complex n_slots as a consumable, but it seems 
redundant: SGE maintains a built-in complex 'slots' which does the same 
thing in your case (both slots and n_slots are 2). You can see the 
number of slots assigned to the queue instance(s) with qstat -g c.
I am not sure how three tasks ended up scheduled on one node; can you 
please check with qstat -g t, just to make sure we are seeing the right 
thing?
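
For example (both are standard SGE commands; output layout varies by version):

    qstat -g c    # cluster queue summary: used/available slots per queue
    qstat -g t    # extended listing: one line per task and the queue instance it runs in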

regards,
-Ravi

Gerolf Ziegenhain wrote:
> Hi Chris,
>
> The result of this experiment is predictable: if I remove the n_slots 
> threshold, SGE will also start single slots on nodes which already 
> have a load of 1, which means it can only start one process there. 
> However, if the allocation of slots is fixed at "2", nothing happens: 
> the behaviour is as described before.
>
> /BR: Gerolf
>
> 2007/7/4, Chris Dagdigian <dag at sonsorol.org>:
>
>
>     Hi Gerolf,
>
>     I run MPI apps on 2-way compute nodes all the time without problems
>     and I've always just let "slots 2" stay in the queue configuration.
>     The SGE scheduler always did the right thing. Can you remove your
>     n_slots load thresholds and queue consumables and see what happens?
>     I'm still curious as to why you had three tasks land on one node.
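>
>     (A minimal sketch of such an edit, assuming the q_mpich queue shown
>     further down in this thread; qconf -mq opens the queue definition in
>     an editor, and the two fields could be trimmed to, for instance:)
>
>         qconf -mq q_mpich
>         # in the editor, drop the n_slots entries, e.g.:
>         load_thresholds   np_load_avg=1,np_load_short=1
>         complex_values    synchron=0,virtual_free=3G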
>
>
>
>
>
>     On Jul 4, 2007, at 3:35 PM, Gerolf Ziegenhain wrote:
>
>     > Thanks for the very quick reply ;)
>     >
>     > allocation_rule = $round_robin results in 1 job/node. This increases
>     > the communication effort, so maybe allocation_rule=2 would be the
>     > best choice in my case?
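>     >
>     > (Sketch only: a fixed allocation rule lives in the parallel
>     > environment, not the queue; e.g., assuming the mpich PE shown below:)
>     >
>     >     qconf -mp mpich
>     >     # in the editor, change the allocation_rule line to a fixed
>     >     # per-host count, so each job gets exactly 2 slots per node:
>     >     allocation_rule   2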
>     >
>     > This is the configuration of the queue:
>     > qconf -sq q_mpich
>     > qname                 q_mpich
>     > hostlist              lc10 lc11 lc12 lc13 lc14 lc15 lc18 lc19
>     > seq_no                21,[@b_hosts=22],[@x_hosts=23]
>     > load_thresholds       np_load_avg=1,np_load_short=1,n_slots=2, \
>     >                       [@b_hosts=np_load_avg=1,np_load_short=1,n_slots=2], \
>     >                       [@x_hosts=np_load_avg=1,np_load_short=1,n_slots=2]
>     > suspend_thresholds    NONE
>     > nsuspend              1
>     > suspend_interval      00:05:00
>     > priority              0
>     > min_cpu_interval      00:05:00
>     > processors            UNDEFINED
>     > qtype                 BATCH
>     > ckpt_list             NONE
>     > pe_list               mpich
>     > rerun                 TRUE
>     > slots                 2
>     > tmpdir                /tmp
>     > shell                 /bin/bash
>     > prolog                NONE
>     > epilog                NONE
>     > shell_start_mode      unix_behavior
>     > starter_method        NONE
>     > suspend_method        NONE
>     > resume_method         NONE
>     > terminate_method      NONE
>     > notify                00:00:60
>     > owner_list            NONE
>     > user_lists            ziegen,[@x_hosts=big]
>     > xuser_lists           matlab matlab1 thor
>     > subordinate_list      NONE
>     > complex_values        synchron=0,virtual_free=3G,n_slots=2, \
>     >                       [@b_hosts=synchron=0,virtual_free=5G,n_slots=2], \
>     >                       [@x_hosts=synchron=0,virtual_free=17G,n_slots=2]
>     > projects              NONE
>     > xprojects             NONE
>     > calendar              NONE
>     > initial_state         default
>     > s_rt                  INFINITY
>     > h_rt                  INFINITY
>     > s_cpu                 INFINITY
>     > h_cpu                 100:00:00
>     > s_fsize               INFINITY
>     > h_fsize               INFINITY
>     > s_data                INFINITY
>     > h_data                2G,[@b_hosts=4G],[@x_hosts=16G]
>     > s_stack               INFINITY
>     > h_stack               INFINITY
>     > s_core                INFINITY
>     > h_core                INFINITY
>     > s_rss                 INFINITY
>     > h_rss                 INFINITY
>     > s_vmem                INFINITY
>     > h_vmem                3G,[@b_hosts=5G],[@x_hosts=17G]
>     >
>     >
>     > /BR:
>     >    Gerolf
>     >
>     >
>     > 2007/7/4, Chris Dagdigian <dag at sonsorol.org>:
>     > Not sure if this totally answers your question but you can play with
>     > the host selection process by adjusting your $allocation_rule in your
>     > parallel environment configuration.
>     >
>     > For instance, you have $fill_up configured, which is why your parallel
>     > slots are being packed onto as few nodes as possible. Changing to
>     > $round_robin will spread them out among as many machines as possible.
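>     >
>     > (For reference, the allocation_rule values being discussed; the
>     > comments are illustrative:)
>     >
>     >     allocation_rule   $fill_up       # pack slots onto as few hosts as possible
>     >     allocation_rule   $round_robin   # one slot per host at a time, spreading out
>     >     allocation_rule   2              # always exactly 2 slots per host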
>     >
>     > For your main symptom:
>     >
>     > If your parallel jobs are running more than 2 tasks per node, then
>     > something may be off with your slot count - perhaps SGE is detecting
>     > multi-core CPUs on your 2-way boxes and setting slots=4 on each node.
>     > Posting the config of the queue "mpich-queue" may help get to the
>     > bottom of this, as I'm not sure about the n_slots "limit" you are
>     > referring to.
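>     >
>     > (A hypothetical quick check of what SGE detected and what the queue
>     > offers, assuming the q_mpich queue name used elsewhere in the thread:)
>     >
>     >     qhost                  # NCPU column: processors SGE detected per host
>     >     qstat -f -q q_mpich    # slots in use vs. total for each queue instance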
>     >
>     >
>     > Regards,
>     > Chris
>     >
>     >
>     >
>     > On Jul 4, 2007, at 3:14 PM, Gerolf Ziegenhain wrote:
>     >
>     > > Hi,
>     > >
>     > > Maybe it is a very stupid question, but: How do I control the
>     > > number of jobs per node? Consider the following hardware: 38 nodes
>     > > with two processors on each. When I start a job with -pe mpich 8
>     > > there should be 4 nodes used with 2 jobs on each. What do I have to
>     > > do in order to achieve this?
>     > >
>     > > My parallel environment is configured like this:
>     > > qconf -sp mpich
>     > > pe_name           mpich
>     > > slots             60
>     > > user_lists        NONE
>     > > xuser_lists       NONE
>     > > start_proc_args   /opt/N1GE/mpi/startmpi.sh -catch_rsh $pe_hostfile
>     > > stop_proc_args    /opt/N1GE/mpi/stopmpi.sh
>     > > allocation_rule   $fill_up
>     > > control_slaves    TRUE
>     > > job_is_first_task FALSE
>     > > urgency_slots     min
>     > >
>     > > My mpich-queue has limits:
>     > > np_load_avg=1
>     > > np_load_short=1
>     > > n_slots=2
>     > >
>     > > However if I start a job, something like this will happen in the
>     > > PI1234-file:
>     > > lc12.rhrk.uni-kl.de 0 prog
>     > > lc19 1 prog
>     > > lc19 1 prog
>     > > lc19 1 prog
>     > > lc14 1 prog
>     > > lc14 1 prog
>     > > lc13 1 prog
>     > > lc13 1 prog
>     > >
>     > > So there are actually three jobs on lc19, which has only two
>     > > CPUs. One of these three jobs would better be running on lc12. How
>     > > can I fix this?
>     > >
>     > >
>     > > Thanks in advance:
>     > >    Gerolf
>     > >
>     > >
>     > >
>     > >
>     > > --
>     > > Dipl. Phys. Gerolf Ziegenhain
>     > > Office: Room 46-332 - Erwin-Schrödinger-Str.46 - TU Kaiserslautern
>     > > - Germany
>     > > Web: gerolf.ziegenhain.com
>     > >
>     > >
>     >
>     >
>     >
>     >
>     >
>     >
>     > --
>     > Dipl. Phys. Gerolf Ziegenhain
>     > Office: Room 46-332 - Erwin-Schrödinger-Str.46 - TU Kaiserslautern
>     > - Germany
>     > Web: gerolf.ziegenhain.com
>     >
>
>
>
>
>
> -- 
> Dipl. Phys. Gerolf Ziegenhain
> Office: Room 46-332 - Erwin-Schrödinger-Str.46 - TU Kaiserslautern - 
> Germany
> Web: gerolf.ziegenhain.com
