[GE users] setting up mpich2 pe + qrsh

Jeroen Kleijer jeroen.kleijer at xs4all.nl
Wed Feb 9 21:26:26 GMT 2005


Hi all,

I'm setting up an MPICH2 parallel environment with tight integration
according to the hints given by Reuti in this post:
http://gridengine.sunsource.net/servlets/ReadMsg?msgId=2291&listName=users

I compiled MPICH2 (with the PGI compiler suite) and created a parallel
environment named mpich2, which in turn runs the startmpich2.sh script
from Reuti's post. (The script had some minor errors, but these were
easily fixed.)

The problem I'm running into at the moment is that I want to use the
smpd solution provided in the post, so the startmpich2.sh script
needs to do a qrsh to every machine in the $machines file and start an
smpd daemon there.

With every qrsh I run from startmpich2.sh I get the following error:

error: getting configuration: unable to send message to qmaster using
port 0 on host "<qmastername>": no valid port number
error:
Cannot get configuration from qmaster

The qrsh command in the script looks like this:
$SGE_ROOT/bin/$ARC/qrsh -V -inherit $node \
    "/cadappl/mpich2/1.0/bin/smpd -s -port $SGE_PORTID"
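
For context, the loop around that command can be sketched roughly as
follows. This is only an illustration, not the actual startmpich2.sh: it
assumes the machines file holds one host name per line, and the function
name start_smpd_cmds and the QRSH and SMPD variables are hypothetical. It
prints the qrsh commands (a dry run) instead of executing them, which
makes it easy to check what would be run on each node.

```shell
#!/bin/sh
# Dry-run sketch: print one qrsh command per unique host in a machines file.
# Assumptions (not from the original script): one host name per line;
# QRSH and SMPD are hypothetical overrides for the two binaries.
start_smpd_cmds() {
    machines=$1   # path to the machines file
    port=$2       # port the smpd daemons should listen on

    # sort -u drops duplicate host entries (e.g. several slots on one node)
    sort -u "$machines" | while read -r node; do
        # skip empty lines
        [ -n "$node" ] || continue
        echo "${QRSH:-qrsh} -V -inherit $node \"${SMPD:-smpd} -s -port $port\""
    done
}

# Example (hypothetical call): start_smpd_cmds "$TMPDIR/machines" "$SGE_PORTID"
```

Replacing the echo with the real command would start the daemons; the
dry run is only meant to show the shape of the loop.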

It doesn't really matter what command I run instead of smpd. I've tried
a simple mkdir /tmp/$SGE_PORTID and it gave me the same error
message.
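
Since the error mentions "port 0", one thing that could be checked from
inside startmpich2.sh is whether a qmaster port can be found at all in
that environment. The sketch below assumes a Grid Engine 6.x style setup,
where the client commands commonly take the port from the
SGE_QMASTER_PORT variable or from an sge_qmaster entry in the services
database; whether that applies to this installation is an assumption, and
the function name qmaster_port_source is hypothetical.

```shell
#!/bin/sh
# Diagnostic sketch: report where a qmaster port could be found, if anywhere.
# Assumption: the port comes from SGE_QMASTER_PORT or an "sge_qmaster"
# services entry (Grid Engine 6.x convention); adjust for other setups.
qmaster_port_source() {
    if [ -n "${SGE_QMASTER_PORT:-}" ]; then
        # the environment variable is set and non-empty
        echo "env:$SGE_QMASTER_PORT"
        return 0
    fi

    # fall back to the services database (e.g. /etc/services or NIS)
    port=$(getent services sge_qmaster 2>/dev/null | awk '{print $2}')
    if [ -n "$port" ]; then
        echo "services:$port"
    else
        echo "unset"
    fi
}

# Example (hypothetical use in the start script):
#   echo "qmaster port: $(qmaster_port_source)" >> /tmp/startmpich2.debug
```

Comparing the output from an interactive shell with the output from
within the parallel environment start script would show whether the two
environments differ.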

Has anyone seen this message before?

Jeroen Kleijer
