[GE users] MPICH2 tight integration with SGE

Goncalo Borges goncalo at lip.pt
Fri Apr 7 11:51:37 BST 2006

> > 
> > Accounting, at least at this point, is not important for us. But it will
> > be in a near future. Nevertheless, I'm tending to configure qrsh and qlogin
> > to use ssh. In this case what is the conclusion regarding MPICH2
> > integration? Is it possible to get the MPICH2 tight integration with SGE
> > following the previous how-to instructions?
> Why not just trying it?
> -- Reuti

Dear All,
It seems I have succeeded in implementing MPICH2 tight integration 
with SGE using ssh instead of rsh (which is the default 
protocol used in the MPICH2 integration how-to).

So far I have only implemented the smpd startup method in its daemonless 
configuration. There are some prerequisites one has to guarantee:

1) Define a .smpd file in the user's home directory on all exec nodes (if 
home directories are not shared):
[user@lflip02 ~]$ cat .smpd

2) ssh authentication based on host keys (no password required). 
A user should be able to ssh to other machines without entering a password, 
including ssh from the present host to itself.
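
A quick way to verify prerequisite 2 is sketched below. This is only an illustration, not part of the how-to: the function name check_ssh_hosts and the host names in the example are hypothetical. It relies on the OpenSSH option BatchMode=yes, which makes ssh fail instead of prompting for a password.

```sh
#!/bin/sh
# Sketch: check that each host accepts a non-interactive ssh login.
# check_ssh_hosts is a hypothetical helper name, not part of SGE or MPICH2.
check_ssh_hosts() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes: never ask for a password, fail instead.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true \
               </dev/null >/dev/null 2>&1; then
            echo "$host ok"
        else
            echo "$host FAILED"
            failed=1
        fi
    done
    return $failed
}

# Example (hypothetical host names); include the local host as well:
# check_ssh_hosts "$(hostname)" node01 node02
```

Running it from every exec node, with the node itself in the list, covers the "ssh from the present host to itself" case.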

3) Configure qrsh and qlogin to use ssh, as explained in

I created a new directory named $SGE_ROOT/mpich2_smpd_ssh and copied all 
the scripts in $SGE_ROOT/mpich2_smpd_rsh to it. One has to slightly 
change the scripts which are used:
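
The copy step can be sketched as follows. The function name clone_templates is hypothetical; the only assumption is that the rsh template directory already exists and the ssh one does not.

```sh
#!/bin/sh
# Sketch: clone the rsh template directory as the starting point for
# the ssh variant. clone_templates is a hypothetical helper name.
clone_templates() {
    src=$1
    dst=$2
    if [ ! -d "$src" ]; then
        echo "missing template directory $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "$dst already exists, not overwriting" >&2
        return 1
    fi
    # -r: recursive copy, -p: keep permissions (the scripts must stay executable)
    cp -rp "$src" "$dst"
}

# clone_templates "$SGE_ROOT/mpich2_smpd_rsh" "$SGE_ROOT/mpich2_smpd_ssh"
```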

a) The startmpich2.sh script creates the machine file in the $TMP 
directory and makes the ssh wrapper available there. Basically, I 
just replaced the "rsh variable names" with "ssh variable names", such as 
"rsh_wrapper=$SGE_ROOT/mpich2_smpd_rsh/rsh" with "ssh_wrapper=$SGE_ROOT/mpich2_smpd_ssh/ssh",
but, if you want, this can remain unchanged.
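
As an illustration of what the wrapper setup in step a) amounts to, here is a minimal sketch. It is not the actual startmpich2.sh code: the function name install_ssh_wrapper and its arguments are assumptions made for the example. The idea is that a link named ssh in the job's temporary directory is found before the real ssh, provided that directory comes first in the job's PATH.

```sh
#!/bin/sh
# Sketch: make the ssh wrapper available in the job's temporary directory.
# install_ssh_wrapper is a hypothetical name, not taken from startmpich2.sh.
install_ssh_wrapper() {
    ssh_wrapper=$1    # e.g. $SGE_ROOT/mpich2_smpd_ssh/ssh
    tmpdir=$2         # the job's temporary directory
    if [ ! -x "$ssh_wrapper" ]; then
        echo "can't execute $ssh_wrapper" >&2
        return 1
    fi
    # A call to "ssh" resolves to this link only if $tmpdir is ahead of
    # the real ssh in PATH.
    ln -s "$ssh_wrapper" "$tmpdir/ssh"
}

# install_ssh_wrapper "$SGE_ROOT/mpich2_smpd_ssh/ssh" "$TMPDIR"
```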

b) I moved the rsh wrapper and renamed it ssh. The default rsh wrapper tests 
some of the options used in the rsh command. Since ssh does not use the same 
options, one has to make sure that the parsing is done correctly for ssh. 
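
The kind of parsing meant here can be sketched as below. This is not the wrapper from the post: the function name parse_ssh_args and the set of options it skips (-l, -p, -o, -i with a value, any other dash option as a plain flag) are assumptions chosen for illustration. It splits "ssh [options] host command..." into the $rhost and $cmd variables that the wrapper uses.

```sh
#!/bin/sh
# Sketch: split "ssh [options] host command..." into rhost and cmd.
# parse_ssh_args is a hypothetical name; the option list is an assumption.
parse_ssh_args() {
    rhost=
    cmd=
    while [ $# -gt 0 ]; do
        case "$1" in
            -l|-p|-o|-i)
                # options that take a value: drop option and value
                shift
                if [ $# -gt 0 ]; then shift; fi
                ;;
            -*)
                # plain flags such as -x or -n: drop
                shift
                ;;
            *)
                # first non-option argument is the remote host
                rhost=$1
                shift
                break
                ;;
        esac
    done
    # everything after the host is the remote command
    cmd="$*"
}

# parse_ssh_args "$@"
```

A real wrapper may want to translate some flags instead of dropping them; which ones matter depends on how mpiexec invokes the command.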

The important part of this ssh wrapper is:

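if [ x$just_wrap = x ]; then
   # default: start the remote process under SGE control
   echo $SGE_ROOT/bin/$ARC/qrsh -inherit $rhost $cmd
   exec $SGE_ROOT/bin/$ARC/qrsh -inherit $rhost $cmd
else
   # just_wrap is set: bypass qrsh and call the command directly
   echo $me $rhost $*
   exec $me $rhost $cmd
   echo $me not found in PATH=$PATH
fi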

By default, qrsh will be used, since $just_wrap is always an empty 
variable. One can use ssh directly by setting just_wrap=1 in this ssh 
wrapper. The $cmd will pass important MPI environment variables 
such as PMI_RANK, PMI_SIZE=10, PMI_ROOT_HOST=lflip02.lip.pt (etc...) 
and the executable one would like to run.

In the user script, one has to do:
export MPIEXEC_RSH=ssh
mpiexec -ssh -nopm -n $NSLOTS -machinefile $TMPDIR/machines $HOME/<executable>

If we do not export MPIEXEC_RSH=ssh, MPICH will use "ssh -x" by default 
and qrsh will not recognize the option.
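
Put together, a job script along these lines might look like the sketch below. The parallel environment name and slot count in the "#$ -pe" line, and the launch function itself, are hypothetical; only the MPIEXEC_RSH export and the mpiexec line come from the steps above. The extra check on the machine file is an addition for the example.

```sh
#!/bin/sh
#$ -S /bin/sh
#$ -cwd
# Hypothetical PE name and slot count, adjust to your site:
#$ -pe mpich2_smpd_ssh 4

# Sketch: launch an MPICH2 program through the ssh wrapper.
launch() {
    exe=$1
    machinefile=$TMPDIR/machines
    # startmpich2.sh should have created the machine file already
    if [ ! -s "$machinefile" ]; then
        echo "machine file $machinefile is missing or empty" >&2
        return 1
    fi
    # Without this, mpiexec would call "ssh -x" and qrsh would reject -x
    MPIEXEC_RSH=ssh
    export MPIEXEC_RSH
    mpiexec -ssh -nopm -n "$NSLOTS" -machinefile "$machinefile" "$exe"
}

# launch "$HOME/<executable>"
```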

c) Finally, as in startmpich2.sh, I only replaced "rsh variable names" 
with "ssh variable names" in the stopmpich2.sh script.

I hope this information may be useful to others.
