[GE users] Help: PE Question

Amy Lee openlinuxsource at gmail.com
Sat Sep 22 15:18:52 BST 2007



Chris Dagdigian wrote:
>
> Forgot to address this question
>
> On Sep 22, 2007, at 8:34 AM, Amy Lee wrote:
>> I use MPICH 1.4, and there are some parameters in the 
>> /opt/sge/mpi/startmpi.sh and stopmpi.sh scripts. I want to know what 
>> they mean. How do I combine the scripts with MPICH?
>
> In a general sense the point of the startmpi.sh and stopmpi.sh scripts 
> is to be the hooks at which point your SGE install touches your MPICH 
> install.
>
> In a loose integration setting, the purpose of the startmpi.sh script 
> is really just this:
>
> - Take the list of hosts kicked out by the SGE scheduler
> - Format that host list into a file that is compatible with your 
> specific MPI installation (mpich machines file in your case)
> - Do anything else necessary to prepare your MPI environment (lamboot 
> for instance in LAM-MPI installs)
>
> The main purpose is the creation of the job specific machines file 
> that your personal MPICH environment will read in.
>
> The act of SGE creating a custom machinesfile (via startmpi.sh)  for 
> your parallel job is the place where SGE touches MPICH.
>
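The host list conversion described above can be sketched roughly as follows. This is only an illustration, not the startmpi.sh shipped with SGE: the function name pe_hostfile_to_machines is made up, and it assumes each line of the host file handed over by SGE begins with "<hostname> <slots>".

```shell
#!/bin/sh
# Sketch: turn an SGE PE host file into an MPICH-style machines file.
# Assumption: each input line starts with "<hostname> <slots>";
# any further fields on the line are ignored.
pe_hostfile_to_machines() {
    while read -r host slots rest; do
        # Skip blank lines
        [ -n "$host" ] || continue
        # Print the hostname once per granted slot
        i=0
        while [ "$i" -lt "$slots" ]; do
            echo "$host"
            i=$((i + 1))
        done
    done < "$1"
}

# Possible use inside a PE start script (both paths come from SGE at run time):
# pe_hostfile_to_machines "$PE_HOSTFILE" > "$TMPDIR/machines"
```

Writing one hostname per slot keeps the result readable by any mpirun that accepts a plain machines file.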
> For your MPICH-1.4 installation the sole point of the startmpi.sh 
> script would be to create the $TMPDIR/machines file that you would use 
> when your job script calls "mpirun -np $NSLOTS -machinefile 
> $TMPDIR/machines ./my-parallel-app" or similar.
>
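Before the mpirun call, a job script could check that the machines file really lists as many slots as SGE granted. A minimal sketch, assuming the file contains either bare hostnames (one slot each) or "host:n" entries; the helper name count_machine_slots is hypothetical:

```shell
#!/bin/sh
# Sketch: count the slots listed in an MPICH-style machines file.
# Assumption: lines are either "host" (one slot) or "host:n" (n slots).
count_machine_slots() {
    total=0
    while read -r entry rest; do
        case "$entry" in
            ''|'#'*) continue ;;       # skip blank lines and comments
            *:*) n=${entry##*:} ;;     # "host:n" form: take the number after the colon
            *)   n=1 ;;                # bare hostname counts as one slot
        esac
        total=$((total + n))
    done < "$1"
    echo "$total"
}

# Possible use in a job script:
# slots=$(count_machine_slots "$TMPDIR/machines")
# [ "$slots" -eq "$NSLOTS" ] || echo "machines file lists $slots slots, expected $NSLOTS" >&2
```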
> The stopmpi.sh script is a place/hook that allows for cleaning up and 
> shutting down your parallel job (lamhalt in LAM-MPI installations for 
> example). For MPICH-1.4 I've never really had to use a stopmpi.sh 
> script for anything at all.
>
> -Chris
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>
Thank you very much.
I still have two questions.
1). I have 4 nodes with 2 Opteron 270 processors per node. The MPICH 
machine file, located at /usr/local/mpi/share/machine.LINUX, looks 
like this:
gnode2:4
gnode3:4
gnode4:4
gnode5:4
gnode1 is the master. Is this format okay to use with startmpi.sh?

2). What is the $TMPDIR variable? After I finished installing MPICH I 
didn't set any variables, but it still runs well. Do I need to set any 
variables in the Linux system?

Regards,

Amy Lee
