[GE users] Integration of the MPICH2 and SGE

reuti reuti at staff.uni-marburg.de
Wed May 26 17:00:50 BST 2010



Hi,

On 26.05.2010 at 16:50, gqc606 wrote:

> I configured and compiled MPICH2, attached the PE to the cluster queue of my choice, and adjusted the path to my MPICH installation successfully. I did exactly as these pages show:
> 
> http://gridengine.sunsource.net/howto/mpich2-integration/mpich2-integration.html
> http://marc.info/?l=npaci-rocks-discussion&m=127481216829722&w=2
> 
> 
> According to http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=257043,
> Reuti says that I also have to edit one line of the provided "startmpich2.sh"
> script to make it work correctly with Rocks:
> 
>   # vi $SGE_ROOT/mpich2_mpd/startmpich2.sh
> 
>   Jump down to line 176 where it says:
> 
>   NODE=`hostname`
> 
>   and change it to:
> 
>   NODE=`hostname --short`

This was done on the assumption that you installed SGE with "ignore_fqdn yes".
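
To illustrate what the short form changes, here is a minimal sketch that strips the domain part from a host name with plain shell parameter expansion. The function name `short_name` and the sample name `compute-0-1.local` are only assumptions for the example; they are not taken from startmpich2.sh.

```shell
#!/bin/sh
# short_name (hypothetical helper): drop everything from the first dot onward,
# similar in effect to `hostname --short` for a simple FQDN.
short_name() {
    printf '%s\n' "${1%%.*}"
}

# Sample value, assumed for illustration only.
fqdn="compute-0-1.local"
NODE=$(short_name "$fqdn")
echo "NODE=$NODE"
```

With "ignore_fqdn yes", SGE compares host names without the domain, so the name the script uses should be the short one as well.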


> I have done that. Strangely, when I submitted my script, the following errors occurred:
> 
> -catch_rsh /opt/gridengine/default/spool/compute-0-1/active_jobs/240.1/pe_hostfile /opt/mpich2/gnu
> compute-0-1:3
> compute-0-0:3
> usage: start_mpich2 [-n <hostname>] mpich2-mpd-path [mpd-parameters ..]
> where: 'hostname' gives the name of the target host
> usage: start_mpich2 [-n <hostname>] mpich2-mpd-path [mpd-parameters ..]

Okay, to get to the root of it:

Can you please add an `echo` command before line 181 (i.e. just echo the command that will be executed next):

echo $SGE_ROOT/mpich2_mpd/bin/$ARC/start_mpich2 -n $host $MPICH2_ROOT/bin/mpd

and add a new line 201:

echo $SGE_ROOT/mpich2_mpd/bin/$ARC/start_mpich2 -n $host $MPICH2_ROOT/bin/mpd $NODE $PORT

and post the generated lines. Then we can check which parameters the helper program was called with (adjust to your paths).
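
The echo-before-execute idea can also be tried on its own, outside the script. The following is only a sketch: `trace_run` is a hypothetical helper name, and the values of `host` and `mpd_path` are placeholders, not output from a real job.

```shell
#!/bin/sh
# trace_run (hypothetical helper): print the command line as it will be
# executed, then execute it. An empty variable shows up as a missing argument.
trace_run() {
    echo "about to run: $*"
    "$@"
}

# Placeholder values, assumed for illustration only.
host="compute-0-1"
mpd_path="/opt/mpich2/gnu/bin/mpd"

# A harmless stand-in command instead of the real start_mpich2 call.
trace_run printf '%s %s %s\n' "-n" "$host" "$mpd_path"
```

If `$host` or the path variable were empty, the printed line would show the gap directly, which is the point of the two echo lines.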

-- Reuti


> where: 'hostname' gives the name of the target host
> startmpich2.sh: check for mpd daemons (1 of 10)
> startmpich2.sh: check for mpd daemons (2 of 10)
> startmpich2.sh: check for mpd daemons (3 of 10)
> startmpich2.sh: check for mpd daemons (4 of 10)
> startmpich2.sh: check for mpd daemons (5 of 10)
> startmpich2.sh: check for mpd daemons (6 of 10)
> startmpich2.sh: check for mpd daemons (7 of 10)
> startmpich2.sh: check for mpd daemons (8 of 10)
> startmpich2.sh: check for mpd daemons (9 of 10)
> startmpich2.sh: check for mpd daemons (10 of 10)
> startmpich2.sh: got only 8 of 2 nodes, aborting
> -catch_rsh /opt/mpich2/gnu
> mpdallexit: cannot connect to local mpd (/tmp/mpd2.console_test_sge_240.undefined); possible causes:
>  1. no mpd is running on this host
>  2. an mpd is running but was started without a "console" (-n option)
> In case 1, you can start an mpd on this host with:
>    mpd &
> and you will be able to run jobs just on this host.
> For more details on starting mpds on a set of hosts, see
> the MPICH2 Installation Guide.
> 
> 
> Here is my script:
> #!/bin/sh
> #
> #$ -cwd
> #$ -j y
> #$ -S /bin/bash
> #$ -N flat_airebo
> #$ -pe mpich2_mpd 6
> #$ -q all.q
> #$ -e error.out
> #$ -o screen.out
> #
> export MPICH2_ROOT=/opt/mpich2/gnu
> export PATH=$MPICH2_ROOT/bin:$PATH
> export MPD_CON_EXT="sge_$JOB_ID.$SGE_TASK_ID"
> /opt/mpich2/gnu/bin/mpiexec -machinefile $TMPDIR/machines -n $NSLOTS /home/test/mpi-ring
> exit 0
> 
> I am confused; such an error should not occur. Can anyone give me some advice? Thanks!
> 
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=258691
> 
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=258706



