[GE users] Integration of the MPICH2 and SGE

gqc606 gqc606 at hotmail.com
Tue May 18 15:10:43 BST 2010


> Hi,
> 
> Am 15.05.2010 um 16:06 schrieb gqc606:
> 
> > > Hello, I installed Rocks 5.3 on my computers and would like to use SGE to manage my MPICH2 jobs. In this setup, the daemonless smpd method is used to start MPICH2.
> > [test@cluster ~]$ qconf -sp mpich
> > pe_name            mpich
> > slots              9999
> > user_lists         NONE
> > xuser_lists        NONE
> > start_proc_args    /opt/gridengine/mpi/startmpi.sh -catch_rsh $pe_hostfile
> > stop_proc_args     /opt/gridengine/mpi/stopmpi.sh
> > allocation_rule    $fill_up
> > control_slaves     TRUE
> > job_is_first_task  FALSE
> > urgency_slots      min
> > accounting_summary TRUE
> > 
> > This is my script:
> > #!/bin/bash
> > #
> > #$ -cwd
> > #$ -j y
> > #$ -S /bin/bash
> > #$ -N flat_airebo
> > #
> > #$ -pe mpich 6
> > #$ -q all.q
> > #
> > #$ -e error.out
> > #$ -o screen.out
> > 
> > export MPICH2_ROOT=/opt/mpich2/gnu
> > export PATH=$MPICH2_ROOT/bin:$PATH
> > export MPIEXEC_RSH=rsh
> > 
> > mpiexec -rsh -nopm -n $NSLOTS -machinefile $TMPDIR/machines /home/test/mpi-ring
> > 
> > But when I submit my script, the following error occurs:
> > -catch_rsh /opt/gridengine/default/spool/compute-0-1/active_jobs/179.1/pe_hostfile
> > compute-0-1
> > compute-0-1
> > compute-0-1
> > compute-0-1
> > compute-0-0
> > compute-0-0
> > mpiexec_compute-0-1.local: cannot connect to local mpd (/tmp/mpd2.console_test); possible causes:
> >  1. no mpd is running on this host
> >  2. an mpd is running but was started without a "console" (-n option)
> > In case 1, you can start an mpd on this host with:
> >    mpd &
> > and you will be able to run jobs just on this host.
> > For more details on starting mpds on a set of hosts, see
> > the MPICH2 Installation Guide.
> > 
> > I don't know where the error comes from. I have read the page <http://gridengine.sunsource.net/howto/mpich2-integration/mpich2-integration.html> several times and could not find the mistake. Can anyone give me some advice? Thanks!
> 
> This is strange, as it shouldn't look for the mpd. Did you recompile MPICH2, use the smpd-mpiexec in the script (correct path?), and also recompile your application? mpiexec binaries from differently compiled MPICH2 builds are not interchangeable.

The correct MPICH2 path is /opt/mpich2/gnu.
[root@cluster ~]# ls /opt/mpich2/gnu/bin
mpd              mpdexit         mpdlistjobs.pyc  mpdsigjob.pyo
mpdallexit       mpdexit.py      mpdlistjobs.pyo  mpdtrace
mpdallexit.py    mpdexit.pyc     mpdman.py        mpdtrace.py
mpdallexit.pyc   mpdexit.pyo     mpdman.pyc       mpdtrace.pyc
mpdallexit.pyo   mpdgdbdrv.py    mpdman.pyo       mpdtrace.pyo
mpdboot          mpdgdbdrv.pyc   mpd.py           mpicc
mpdboot.py       mpdgdbdrv.pyo   mpd.pyc          mpich2version
mpdboot.pyc      mpdhelp         mpd.pyo          mpicxx
mpdboot.pyo      mpdhelp.py      mpdringtest      mpiexec
mpdcheck         mpdhelp.pyc     mpdringtest.py   mpiexec.py
mpdcheck.py      mpdhelp.pyo     mpdringtest.pyc  mpiexec.pyc
mpdcheck.pyc     mpdkilljob      mpdringtest.pyo  mpiexec.pyo
mpdcheck.pyo     mpdkilljob.py   mpdroot          mpif77
mpdchkpyver.py   mpdkilljob.pyc  mpdrun           mpif90
mpdchkpyver.pyc  mpdkilljob.pyo  mpdrun.py        mpirun
mpdchkpyver.pyo  mpdlib.py       mpdrun.pyc       mpirun.py
mpdcleanup       mpdlib.pyc      mpdrun.pyo       mpirun.pyc
mpdcleanup.py    mpdlib.pyo      mpdsigjob        mpirun.pyo
mpdcleanup.pyc   mpdlistjobs     mpdsigjob.py     parkill
mpdcleanup.pyo   mpdlistjobs.py  mpdsigjob.pyc

In this directory I could not find smpd or smpd-mpiexec. Is my MPICH2 installed incorrectly, or is rsh failing to connect to the other nodes, so that MPICH2 cannot start?
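
One quick way to tell which process manager an MPICH2 installation was built with is to look for the daemon binaries that sit next to mpiexec. The sketch below is only a heuristic: the function name mpich2_pm_flavor is made up for illustration, and it assumes that an smpd build installs an smpd binary in its bin directory while an mpd build installs mpd and mpdboot there.

```shell
#!/bin/bash
# Guess which process manager an MPICH2 bin directory belongs to.
# Assumption (heuristic, not an official MPICH2 interface):
#   smpd builds ship "smpd"; mpd builds ship "mpd" and/or "mpdboot".
mpich2_pm_flavor() {
    local bindir=$1
    if [ -e "$bindir/smpd" ]; then
        echo smpd
    elif [ -e "$bindir/mpd" ] || [ -e "$bindir/mpdboot" ]; then
        echo mpd
    else
        echo unknown
    fi
}
```

Called as "mpich2_pm_flavor /opt/mpich2/gnu/bin", this should print mpd for the listing above, since that directory contains mpd and mpdboot but no smpd. That would be consistent with the error message: the mpiexec found in PATH appears to be the mpd-based one, which tries to contact a local mpd regardless of the -rsh and -nopm options in the job script.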
>
> -- Reuti
> 
> 
> > 
> > ------------------------------------------------------
> > http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=257394
> > 
> > To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=257755