[GE users] Tight Integration of MPICH with SGE

Waseem Ahmad Waseem.Ahmad.1 at Sun.COM
Tue Jul 27 19:14:15 BST 2004


I think the MPI jobs spawned through the Perl script are unable to get 
$NSLOTS. cat $TMPDIR/machines gives me the list of nodes in the cluster, 
so that part looks fine.
What should I do to pass these variables to the MPI jobs?
Thanks.
John Hearns wrote:
> On Tue, 2004-07-27 at 16:46, Waseem Ahmad wrote:
> 
>>Reuti!
>>
>>All of the required environment variables are set on the slave nodes 
>>too. Yes, I am using the ch_p4 device. The broken pipe error, which was 
>>reported in the SGE .e* files, is fixed now. Instead I get the following error:
>>Cannot read /tmp/machines.
>>Looked for files with extension solaris in
>>directory /gridware/sge/mpich-1.2.5.2/util/machines
>>This is reported in the program output.
>>Note that I am able to run the sample script for testing tight 
>>integration provided in the mpi directory. But when I try to run my Perl 
>>script, which spawns MPI jobs through mpirun, I get the above-mentioned problem.
>>
> 
> In your script, try using:
> mpirun -np $NSLOTS -machinefile $TMPDIR/machines
> 
> $NSLOTS is the number of slots Gridengine has allocated to the job
> $TMPDIR is the temporary directory where the machines file is located.
> As a quick debug, try to cat $TMPDIR/machines also so you can see which
> nodes have been allocated.
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
> 
> 
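The mpirun line John suggests relies on the shell expanding $NSLOTS and $TMPDIR. In Perl itself these values would be read as $ENV{NSLOTS} and $ENV{TMPDIR}, because a bare $NSLOTS inside a Perl double-quoted string is a Perl variable, not the environment variable; whether that is the cause here is an assumption, since the script is not shown. One way around it is a minimal sketch like the following, assuming the Perl script can call a small shell helper instead of assembling the command itself. The function name build_mpirun_cmd and the program path ./my_prog are hypothetical and not part of SGE or MPICH.

```shell
#!/bin/sh
# Sketch only: build the mpirun command line from the variables SGE
# sets for a parallel job, and fail early if they are missing.
# build_mpirun_cmd is a hypothetical helper name.
build_mpirun_cmd() {
    prog=$1

    # NSLOTS is the number of slots Grid Engine allocated to the job.
    if [ -z "${NSLOTS:-}" ]; then
        echo "NSLOTS is not set; is this running inside an SGE parallel job?" >&2
        return 1
    fi

    # The machines file is expected in the job's temporary directory.
    if [ -z "${TMPDIR:-}" ] || [ ! -r "$TMPDIR/machines" ]; then
        echo "Cannot read \$TMPDIR/machines" >&2
        return 1
    fi

    # Print the command so the caller can log it before running it.
    echo "mpirun -np $NSLOTS -machinefile $TMPDIR/machines $prog"
}

# Example use inside the job script (./my_prog is a placeholder):
#   cmd=$(build_mpirun_cmd ./my_prog) && $cmd
```

Printing the command before running it makes it easy to see in the job's output file whether $NSLOTS and $TMPDIR actually reached the process that starts mpirun.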






