[GE users] Help: PE Question

Chris Dagdigian dag at sonsorol.org
Sat Sep 22 14:44:22 BST 2007


Forgot to address this question

On Sep 22, 2007, at 8:34 AM, Amy Lee wrote:
> I use MPICH 1.4, and there are some parameters /opt/sge/mpi/ 
> startmpi.sh and stopmpi.sh scripts, I wanna know the meanings of  
> them. How to combine the scripts with MPICH?

In a general sense, the startmpi.sh and stopmpi.sh scripts are the  
hooks where your SGE install touches your MPICH install.

In a loose integration setting, the purpose of the startmpi.sh script  
is really just this:

- Take the list of hosts the SGE scheduler allocated to the job
- Format that host list into a file that is compatible with your  
specific MPI installation (mpich machines file in your case)
- Do anything else necessary to prepare your MPI environment (lamboot  
for instance in LAM-MPI installs)

The main purpose is to create the job-specific machines file that  
your MPICH environment will read. That custom machines file, written  
by startmpi.sh for each parallel job, is the place where SGE touches  
MPICH.
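
As a rough sketch of that step (this is not the stock script shipped  
with SGE), assume the host file has one line per host in the form  
"host slots queue processor-range" and that its path is available in  
$PE_HOSTFILE. The function name pe_hostfile_to_machines is made up  
for illustration. A PE definition usually passes the path as  
$pe_hostfile in start_proc_args instead; the environment variable is  
used here only to keep the example short.

```shell
#!/bin/sh
# Sketch of the core job of a loose-integration startmpi.sh.
# Assumed input format, one line per host:
#   <host> <slots> <queue> <processor range>

# Print each host name once per slot granted on that host.
pe_hostfile_to_machines() {
    while read -r host slots _rest; do
        # Skip empty lines.
        [ -n "$host" ] || continue
        i=0
        while [ "$i" -lt "$slots" ]; do
            echo "$host"
            i=$((i + 1))
        done
    done < "$1"
}

# Only write the machines file when both variables are set
# (assumption: SGE provides them to the start procedure).
if [ -n "${PE_HOSTFILE:-}" ] && [ -n "${TMPDIR:-}" ]; then
    pe_hostfile_to_machines "$PE_HOSTFILE" > "$TMPDIR/machines"
fi
```

A host granted two slots ends up listed twice, which is the layout a  
plain MPICH machines file expects.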

For your MPICH-1.4 installation, the sole point of the startmpi.sh  
script would be to create the $TMPDIR/machines file that you would  
use when your job script calls something like "mpirun -np $NSLOTS  
-machinefile $TMPDIR/machines ./my-parallel-app".
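
A job script built around that call might look roughly like the  
following. The PE name "mpich", the slot count, the job name and the  
launch function are placeholders for illustration, and -np is  
MPICH's mpirun option for the number of processes.

```shell
#!/bin/sh
#$ -N my-parallel-job
#$ -cwd
#$ -pe mpich 4

# Start the application on the hosts listed in the machines file
# that startmpi.sh is assumed to have written for this job.
launch() {
    machines="${TMPDIR:-/tmp}/machines"
    if [ ! -r "$machines" ]; then
        echo "machines file $machines not found; did startmpi.sh run?" >&2
        return 1
    fi
    mpirun -np "$NSLOTS" -machinefile "$machines" ./my-parallel-app
}

# Only launch inside an SGE parallel job, where NSLOTS is set.
if [ -n "${NSLOTS:-}" ]; then
    launch
fi
```

Submitted with qsub, the script picks up $NSLOTS and $TMPDIR from the  
job environment, so nothing in it is tied to a particular set of hosts.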

The stopmpi.sh script is a hook for cleaning up and shutting down  
your parallel job (lamhalt in LAM-MPI installations, for example).  
For MPICH-1.4 I've never really had to use a stopmpi.sh script for  
anything at all.
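
If you do want a stopmpi.sh, a minimal one could simply remove the  
machines file again. This is only a sketch; cleanup_machines is a  
made-up helper name, and it assumes $TMPDIR is set for the stop  
procedure the same way it is for the job.

```shell
#!/bin/sh
# Hypothetical minimal stopmpi.sh for a loose MPICH integration.

# Remove the job-specific machines file from the given directory.
# rm -f stays quiet if the file is already gone.
cleanup_machines() {
    rm -f "$1/machines"
}

# Only clean up when TMPDIR is set (assumption: SGE provides it).
if [ -n "${TMPDIR:-}" ]; then
    cleanup_machines "$TMPDIR"
fi
```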

-Chris

