[GE users] setting up mpich2 pe + qrsh

Jeroen Kleijer jeroen.kleijer at xs4all.nl
Thu Feb 10 06:44:14 GMT 2005


Hi

I'm running SGE6.0u3.
This is my first attempt at setting up a parallel environment (PE), so I
don't have any other parallel applications running under SGE.

qrsh itself works properly: when run by hand it gives me a remote shell.
Running it by hand the same way as in the script (with -V -inherit),
however, gives me an error about the JOBID not being set.
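
Since qrsh -inherit is meant to be called from inside a running job, a small
check at the top of the start script can show whether the job environment
actually reaches it. The following is only a sketch: the function name
missing_vars is hypothetical, and which variables qrsh really needs on a
given installation is an assumption, not something confirmed in this thread.

```shell
#!/bin/sh
# Sketch: print the names of environment variables that are unset or empty.
# missing_vars is a hypothetical helper, not part of SGE or MPICH2.
missing_vars() {
    missing=""
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        [ -n "$value" ] || missing="$missing $name"
    done
    # Strip the leading space and print the names on one line.
    printf '%s\n' "${missing# }"
}
```

Called as `missing_vars JOB_ID SGE_ROOT SGE_CELL ARC` before the first qrsh,
it prints whichever of those names are missing, or an empty line if all are
set. On installations that configure the qmaster port through the
environment rather than through a services entry, SGE_QMASTER_PORT may be
worth adding to the list.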

Kind regards,

Jeroen Kleijer

On Thu, Feb 10, 2005 at 12:04:07AM +0100, Reuti wrote:
> Hi,
> 
> Which SGE version are you using? When you run other parallel applications,
> does qrsh work as it should?
> 
> CU - Reuti
> 
> Quoting Jeroen Kleijer <jeroen.kleijer at xs4all.nl>:
> 
> > 
> > Hi all,
> > 
> > I'm setting up an MPICH2 parallel environment with tight integration
> > according to the hints given by Reuti in post:
> > http://gridengine.sunsource.net/servlets/ReadMsg?msgId=2291&listName=users
> > 
> > I compiled MPICH2 (with the PGI compiler suite) and created a parallel
> > environment named mpich2, which in turn runs the startmpich2.sh script
> > from Reuti's post. (The script had some minor errors, but these were
> > easily fixed.)
> > 
> > The problem I'm running into at the moment is that I want to use the
> > smpd solution provided in the post, so the startmpich2.sh script needs
> > to do a qrsh to every machine in the $machines file and start an smpd
> > daemon there.
> > 
> > With every qrsh I run from startmpich2.sh I get the following error:
> > 
> > error: getting configuration: unable to send message to qmaster using
> > port 0 on host "<qmastername>": no valid port number
> > error:
> > Cannot get configuration from qmaster
> > 
> > The qrsh command in the script looks like this:
> > $SGE_ROOT/bin/$ARC/qrsh -V -inherit $node "/cadappl/mpich2/1.0/bin/smpd -s -port $SGE_PORTID"
> > 
> > It doesn't really matter what command I use instead of smpd; I've tried
> > a simple mkdir /tmp/$SGE_PORTID and it gave me the same error message.
> > 
> > Has anyone seen this message before?
> > 
> > Jeroen Kleijer
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > For additional commands, e-mail: users-help at gridengine.sunsource.net
> > 
> 
> 
> 
> 
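
The startup step described in the quoted message (one qrsh per host in the
machines file, each starting an smpd daemon) can be sketched roughly as
follows. This is not the original startmpich2.sh: the function name
start_smpd_daemons and the QRSH and SMPD override variables are hypothetical,
and the machines file is assumed to hold one hostname per line, possibly
repeated once per slot.

```shell
#!/bin/sh
# Sketch only: start one smpd per unique host listed in a machines file.
# QRSH and SMPD are hypothetical overrides; the defaults mirror the paths
# used in the quoted qrsh command.
QRSH=${QRSH:-"$SGE_ROOT/bin/$ARC/qrsh"}
SMPD=${SMPD:-/cadappl/mpich2/1.0/bin/smpd}

start_smpd_daemons() {
    machines=$1   # file with one hostname per line (assumed format)
    port=$2       # port that each smpd should listen on

    # Hosts may appear once per slot, so reduce the list to unique names.
    sort -u "$machines" | while read -r node; do
        [ -n "$node" ] || continue
        # -inherit is only valid inside a running SGE job.
        # stdin is redirected so qrsh cannot consume the rest of the host list.
        "$QRSH" -V -inherit "$node" "$SMPD -s -port $port" </dev/null
    done
}
```

Inside the start script this would be called as
`start_smpd_daemons "$machines" "$SGE_PORTID"`, using the same variables as
the quoted command. Setting QRSH to a command that only prints its arguments
makes it possible to review the generated calls without contacting the
qmaster.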




