[GE users] mpich2 tight integration not working

kennethsdsc kenneth at sdsc.edu
Mon Mar 9 18:46:07 GMT 2009


A couple of other issues:

- I had to specify the task count in my qsub line:
qsub -t 1-4:4 -l h_rt=18:00:00 -q all.q -pe mpich2_mpd 4 testjob.sh

- I had to use SGE_TASK_ID, instead of TASK_ID in mpich2_mpd.sh:
#export MPICH2_ROOT=/usr/local/apps/sge/mpich2/install
#export PATH=$MPICH2_ROOT/bin:$PATH
#export MPD_CON_EXT="sge_$JOB_ID.$TASK_ID"
setenv MPICH2_ROOT /usr/local/apps/sge/mpich2/install
setenv PATH $MPICH2_ROOT/bin:$PATH
setenv MPD_CON_EXT "sge_$JOB_ID.$SGE_TASK_ID"
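
For scripts that do run under a Bourne-type shell, here is a minimal sketch of building the same value while tolerating either variable name. The helper name mpd_con_ext and the "unknown" default for JOB_ID are my own assumptions for illustration. The "undefined" default mirrors what SGE puts in SGE_TASK_ID for non-array jobs.

```shell
#!/bin/sh
# Hypothetical helper (not part of the SGE or MPICH2 scripts): build the
# mpd console extension from whichever task-id variable is available.
mpd_con_ext() {
    # Prefer SGE_TASK_ID, fall back to TASK_ID, then to "undefined",
    # the value SGE itself uses for non-array jobs.
    task_id="${SGE_TASK_ID:-${TASK_ID:-undefined}}"
    # "unknown" is an assumed placeholder for runs outside SGE.
    printf 'sge_%s.%s\n' "${JOB_ID:-unknown}" "$task_id"
}

MPD_CON_EXT=$(mpd_con_ext)
export MPD_CON_EXT
```

With this in place the start script, the stop script and the job script should all arrive at the same MPD_CON_EXT, as long as they see the same JOB_ID and task id.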

It looks like SGE is running the file under csh rather than
honoring the #!/bin/ksh line.  I'm not sure whether that's a
configuration issue on my part.
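
As far as I understand it, this is governed by the queue's shell_start_mode setting: in posix_compliant mode the #! line is ignored and the job runs under the queue's configured shell (often /bin/csh) unless a shell is requested with -S, while unix_behavior honors the #! line. A sketch of the job script with the shell requested explicitly follows. The #$ directive lines are only read by SGE, and the fallback values after ":-" are my own placeholders for running it outside SGE.

```shell
#!/bin/ksh
# Ask SGE to run this script under ksh regardless of the queue's default shell.
#$ -S /bin/ksh

# Bourne/ksh form of the settings from the csh version above.
MPICH2_ROOT=/usr/local/apps/sge/mpich2/install
PATH="$MPICH2_ROOT/bin:$PATH"
# Defaults after ":-" are placeholders; under SGE both variables are set.
MPD_CON_EXT="sge_${JOB_ID:-unknown}.${SGE_TASK_ID:-undefined}"
export MPICH2_ROOT PATH MPD_CON_EXT
```

The same request can be made on the command line with qsub -S /bin/ksh, which avoids touching the queue configuration.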

Kenneth

On Mon, 9 Mar 2009, kennethsdsc wrote:

> Date: Mon, 9 Mar 2009 11:39:16 -0700 (PDT)
> From: kennethsdsc <kenneth at sdsc.edu>
> Reply-To: users <users at gridengine.sunsource.net>
> To: users at gridengine.sunsource.net
> Subject: RE: [GE users] mpich2 tight integration not working
> 
> I am also experimenting with tight mpich2_mpd integration on SGE 6.2u2.
> I'm not sure if my problem is related to yours.  I found some
> mismatches between the start scripts and what SGE sets.  I was able to
> get mpihello.c to work by modifying the start and stop scripts.
>
> It looks like SGE is not setting TASK_ID in the environment,
> but is setting SGE_TASK_ID, so I modified startmpich2.sh:
>
> #export MPD_CON_EXT="sge_$JOB_ID.$TASK_ID"
> export MPD_CON_EXT="sge_$JOB_ID.$SGE_TASK_ID"
>
> I also had to give stopmpich2.sh the full path to mpdallexit:
> #mpdallexit
> /usr/local/apps/sge/mpich2/install/bin/mpdallexit
>
> Kenneth
>
> On Thu, 4 Dec 2008, Patterson, Ron (NIH/NLM/NCBI) [C] wrote:
>
>> Date: Thu, 4 Dec 2008 14:25:47 -0500
>> From: "Patterson, Ron (NIH/NLM/NCBI) [C]" <patterso at ncbi.nlm.nih.gov>
>> Reply-To: users <users at gridengine.sunsource.net>
>> To: users at gridengine.sunsource.net
>> Subject: RE: [GE users] mpich2 tight integration not working
>>
>> Reuti,
>>
>>> you set "job_is_first_task  FALSE" in the PE?
>>
>> No - I had it set to TRUE. I made the change and my first test was
>> successful. Thank you very much for your amazingly speedy reply.
>>
>> Ron
>>
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=91204
>>
>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>>
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=125724

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=125727
