[GE users] Loose vs. tight MPICH integration

Reuti reuti at staff.uni-marburg.de
Fri Apr 21 06:31:57 BST 2006


On 21.04.2006 at 04:04, Duong Ta wrote:

> Dear Reuti,
>
> This is my loose PE definition:
>
> $ qconf -sp mpi
>
> pe_name           mpi
> slots             4
> user_lists        NONE
> xuser_lists       NONE
> start_proc_args   /usr/local/sge6/mpi/startmpi.sh $pe_hostfile
> stop_proc_args    /usr/local/sge6/mpi/stopmpi.sh
> allocation_rule   $round_robin
> control_slaves    FALSE
> job_is_first_task TRUE
> urgency_slots     min
>
> The cluster has 3 nodes: head node genesis, 2 execution nodes  
> viz001 and viz002.
> This is my ps output:
>
> - On viz002:
>
> 29081     1 29081 /usr/local/sge6/bin/lx24-amd64/sge_execd
> 19554 29081 19554  \_ sge_shepherd-82 -bg
> 19578 19554 19578      \_ -sh /usr/local/sge6/default/spool/viz002/job_scripts/82 cpi
> 19631 19578 19578          \_ /bin/sh /usr/local/mpich-1.2.7p1/bin/mpirun -np 4 -machinefile /tmp/82.1.all.q/machines cp
> 19757 19631 19757              \_ /home2/snc/griduser/duong/cpi -p4pg /home2/snc/griduser/duong/PI19631 -p4wd /home2/snc
> 19758 19757 19757                  \_ /home2/snc/griduser/duong/cpi -p4pg /home2/snc/griduser/duong/PI19631 -p4wd /home2
> 19759 19757 19757                  \_ ssh viz002 -l griduser -n /home2/snc/griduser/duong/cpi viz002 43827 \-p4amslave \
> 19780 19757 19757                  \_ ssh viz001 -l griduser -n /home2/snc/griduser/duong/cpi viz002 43827 \-p4amslave \
> 19782 19757 19757                  \_ ssh viz001 -l griduser -n /home2/snc/griduser/duong/cpi viz002 43827 \-p4amslave \
>
> - On viz001:
>
> 24281  2493 24281  \_ sshd: griduser [priv]
> 24283 24281 24281  |   \_ sshd: griduser@notty
> 24284 24283 24284  |       \_ /home2/snc/griduser/duong/cpi viz002 43827   4amslave -p4yourname viz001 -p4rmrank 2
> 24301 24284 24284  |           \_ /home2/snc/griduser/duong/cpi viz002 43827   4amslave -p4yourname viz001 -p4rmrank 2
> 24302  2493 24302  \_ sshd: griduser [priv]
> 24304 24302 24302      \_ sshd: griduser@notty
> 24305 24304 24305          \_ /home2/snc/griduser/duong/cpi viz002 43827   4amslave -p4yourname viz001 -p4rmrank 3
> 24322 24305 24305              \_ /home2/snc/griduser/duong/cpi viz002 43827   4amslave -p4yourname viz001 -p4rmrank 3

This is only a loose integration, as no qrsh is used to start the
slave processes as children of the sge_shepherd on the slave node(s).
On viz001 the cpi processes hang below sshd instead. - Reuti
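
The check described above (is a slave process a descendant of an
sge_shepherd, or of sshd?) can be sketched in a few lines of Python. This
is only an illustration, not part of SGE: parse_ps and under_shepherd are
hypothetical helper names, the input is assumed to have the columns PID,
PPID, PGID and command as in the listings above, and the sample commands
are abbreviated.

```python
# Sketch: decide whether a process hangs below an sge_shepherd in `ps` output.
# Assumption: columns are PID, PPID, PGID, command (as in the listings above).

def parse_ps(text):
    """Map pid -> (ppid, command) for every line starting with two integers."""
    procs = {}
    for line in text.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        # Drop the tree-drawing characters printed in front of the command.
        command = fields[3].lstrip(" |\\_")
        procs[int(fields[0])] = (int(fields[1]), command)
    return procs

def under_shepherd(procs, pid):
    """True if pid itself or one of its listed ancestors is an sge_shepherd."""
    seen = set()
    while pid in procs and pid not in seen:
        seen.add(pid)
        ppid, command = procs[pid]
        if command.startswith("sge_shepherd"):
            return True
        pid = ppid
    return False

# Abbreviated copies of the two listings (commands shortened for the example).
VIZ002 = r"""
29081     1 29081 /usr/local/sge6/bin/lx24-amd64/sge_execd
19554 29081 19554  \_ sge_shepherd-82 -bg
19578 19554 19578      \_ -sh job_script
19631 19578 19578          \_ /bin/sh mpirun
19757 19631 19757              \_ cpi
"""
VIZ001 = r"""
24281  2493 24281  \_ sshd: griduser [priv]
24283 24281 24281  |   \_ sshd: griduser@notty
24284 24283 24284  |       \_ cpi
"""

if __name__ == "__main__":
    print("viz002 master:", under_shepherd(parse_ps(VIZ002), 19757))
    print("viz001 slave: ", under_shepherd(parse_ps(VIZ001), 24284))
```

On the master node the cpi process is found below the shepherd, while on
viz001 the parent chain ends at sshd. With a tight integration the slave
processes would be expected to appear below an sge_shepherd on viz001 as
well.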


> The SGE installation is shared across all nodes using NFS. As you  
> can see above, RSHCOMMAND=ssh.
>
> Looking forward to hearing from you. Thank you very much.
>
> Best regards,
> Duong
>
> On 4/18/06, Reuti <reuti at staff.uni-marburg.de> wrote:
>
> Hi,
>
>
> so you would prefer using loose integration on your system. Would you
> please post your loose PE definition and a ps output like the one you
> mentioned below? Is the SGE installation shared across all nodes, and
> not a local one? Which rsh command is compiled into your MPICH
> installation?
>
> -- Reuti
>
>
