[GE users] Problem with tight integration of mpich2

Mark Whidby mark.whidby at manchester.ac.uk
Thu Jan 24 11:39:16 GMT 2008

I'm trying to set up tight integration of mpich2 following the guidelines
at http://gridengine.sunsource.net/howto/mpich2-integration/mpich2-integration.html
for the daemonless smpd startup method.

My job file looks like this:-


# request Bourne shell as shell for job
#$ -S /bin/sh

export MPIEXEC_RSH=rsh
export PATH=/home/zlsiimw/opt/mpich2/1.0.6p1-smpd/bin:$PATH

echo RSH is: `which rsh`
echo MPIEXEC is: `which mpiexec`
echo NSLOTS is: $NSLOTS

mpiexec -rsh -nopm -n $NSLOTS -machinefile $TMPDIR/machines /home/zlsiimw/Beowulf/mpihello/mpihello

exit 0

The output file looks like this:-

RSH is: /tmp/1002.1.terra/rsh
MPIEXEC is: /home/zlsiimw/opt/mpich2/1.0.6p1-smpd/bin/mpiexec
NSLOTS is: 4
/opt/sge/bin/lx24-amd64/qrsh -inherit node3 env
/opt/sge/bin/lx24-amd64/qrsh -inherit node2 env
/opt/sge/bin/lx24-amd64/qrsh -inherit node4 env
/opt/sge/bin/lx24-amd64/qrsh -inherit node1 env
SSH_CLIENT= 59599 51061
*****output from four 'env' commands omitted*****

The only thing being executed on the compute nodes is the 'env' command.
Each of the four qrsh lines should have some environment variables and a
command line appended to it, but none of them does.
Does anybody have any idea what is going wrong? I've rebuilt mpich2 a
couple of times in case I missed something, but I've reached a dead end.
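
One way to narrow this down might be to look at exactly which arguments
mpiexec hands to the rsh command, since a command line passed as one quoted
word looks different from one passed as separate words. The following is
only a minimal sketch: log_args is a hypothetical helper (not part of SGE or
mpich2) that could be pasted temporarily into a private copy of the rsh
wrapper and called with "$@" to print each argument on its own line.

```shell
#!/bin/sh
# Hypothetical debugging helper, not part of SGE or mpich2.
# Prints every argument it receives on its own line, numbered and
# bracketed, so that embedded spaces and empty arguments are visible.
log_args() {
    i=0
    for arg in "$@"; do
        i=$((i + 1))
        printf 'arg %d: [%s]\n' "$i" "$arg"
    done
}

# Example call with made-up arguments; in a wrapper this would be
#   log_args "$@" >&2
log_args node3 env "FOO=bar /path/to/prog"
```

If the environment variables and the program name show up here but not on
the qrsh line, the wrapper is dropping them; if they never show up, mpiexec
is not passing them in the first place.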
Thanks in anticipation...

Mark Whidby
Information Systems
Faculty of Engineering and Physical Sciences

