[GE users] Anybody able to get SGE MPICH2 Tight Integration via SSH working?

Jan Behrend jbehrend at mpifr-bonn.mpg.de
Thu Jan 5 08:38:02 GMT 2006


Hi Jonathan,

I faced the same decision: whether to use ssh or the insecure rsh
protocols.  What I came up with is a VPN connection to all execution
hosts which are not in a cluster behind a head node.  On top of this I
did a tight integration of LAM/MPI, and it works like a charm.  You get
both benefits: easy rsh (hosts.equiv) management AND the security of a
VPN.  Unfortunately there is a downside:  you have to watch the start-up
order of the VPN and the NFS-mounted SGE_ROOT.  If the VPN dies for some
reason, you are left with no SGE_ROOT on the execution host.  Other than
that it's all OK. ;-)

Cheers Jan

Jonathan Schreiter wrote:
> Hello all,
> I'm new to SGE, and trying to enable tight integration
> with mpich2 and ssh (SGE 6.07, mpich2 1.0.3, FC4 linux
> 2.6 kernel with latest ssh).  I found the two howtos
> on the project site re ssh integration and
> integration with mpich2 via rsh.  The rsh security
> concerns are the primary reason I'd like to use ssh -
> (specifically ssh based passphraseless keys on a per
> user basis which is a bit better) - also because of
> the requirement to disable the firewall for mpich2 to
> work with dynamic port assignments (even if one
> specifies the primary listen to port).
> 
> If I use the original scripts included in
> $SGE_ROOT/mpi and have smpd started on the execution
> hosts, I am able to successfully submit and execute mpi
> jobs via a PE mpich2 environment on SGE.  I can also
> start the smpd process on the exe hosts via a submit
> job using ssh.  However, I do not see how one could
> ever run SGE this way without tight integration, given
> failed scripts / memory leaks / limbo processes, etc.
> 
> 
> So I've been following the section "Tight Integration
> of the daemon-based smpd startup method" closely. 
> Looking at the start_mpich2.c file, there don't
> appear to be any rsh-specific methods that need
> changing (just a fork()).  In startmpi2.sh the part
> I think needs modification is:
> 
> rshcmd=rsh
> to something like rshcmd="ssh -i <~/.ssh/user's
> passphraseless key>"
> 
> I have the $SGE_ROOT/mpich2_smpd and home directories
> shared on each execution host (and master/submit
> hosts).
> 
> If I try to execute the line:
> $SGE_ROOT/mpich2_smpd/bin/lx24-x86/start_mpich2 -n
> <host> $MPICH2_ROOT/bin/smpd <port> from a bash shell
> I receive connection refused errors (naturally).  I'm
> not 100% sure how the RSH wrapper script and the howto
> on ssh integration work together to make this happen. 
> 
> I guess what I'm asking is if anyone was able to get
> this working, and how, rather than reinventing the
> wheel...or perhaps I'm just way off.  I've been
> reading just about all the posts on this mailing list
> and I haven't found anyone who's been successful (or at
> least posted the solution).  It may not even be
> possible given the differences between rsh and ssh.
> 
> Any help would be greatly appreciated!
> 
> Many thanks,
> Jonathan
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
> 


-- 
Jan Behrend
Max-Planck-Institut für Radioastronomie
Abteilung für Infrarot-Interferometrie  Tel:   (+49) 228 525 319
Auf dem Hügel 69                        Fax:   (+49) 228 525 411
D-53121 Bonn (Germany)                  jbehrend at mpifr-bonn.mpg.de
                                        http://www.mpifr-bonn.mpg.de
