[GE users] mpich2 tight integration not working

reuti reuti at staff.uni-marburg.de
Mon Mar 9 21:01:07 GMT 2009


On 09.03.2009 at 20:56, kennethsdsc wrote:

> On Mon, 9 Mar 2009, reuti wrote:
>
>> Date: Mon, 9 Mar 2009 20:34:56 +0100
>> From: reuti <reuti at staff.uni-marburg.de>
>> Reply-To: users <users at gridengine.sunsource.net>
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] mpich2 tight integration not working
>>
>> On 09.03.2009 at 19:46, kennethsdsc wrote:
>>
>>> A couple other issues:
>>>
>>> - I had to specify task count in my qsub line:
>>> qsub -t 1-4:4 -l h_rt=18:00:00 -q all.q -pe mpich2_mpd 4 testjob.sh
>>>
>>> - I had to use SGE_TASK_ID, instead of TASK_ID in mpich2_mpd.sh:
>>> #export MPICH2_ROOT=/usr/local/apps/sge/mpich2/install
>>> #export PATH=$MPICH2_ROOT/bin:$PATH
>>> #export MPD_CON_EXT="sge_$JOB_ID.$TASK_ID"
>>> setenv MPICH2_ROOT /usr/local/apps/sge/mpich2/install
>>> setenv PATH $MPICH2_ROOT/bin:$PATH
>>> setenv MPD_CON_EXT "sge_$JOB_ID.$SGE_TASK_ID"
>>>
>>> It looks like SGE is using csh to execute the file, rather
>>> than using #!/bin/ksh.  Not sure if that's a configuration issue on
>>> my part?
>>
>>
>> For the prolog/epilog it should just exec the specified binaries. Which
>> platform are you on? Is /bin/bash available?
>
> Sorry, I was unclear.  The job script, mpich2_mpd.sh, gets executed
> by csh, rather than by the shell named in its #!/bin/ksh line.  That
> makes the export lines not work.

The original was /bin/sh, but /bin/ksh should work as well.
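
For reference, a minimal sketch of the environment lines in Bourne-shell
syntax, which is what the commented-out export lines need. It assumes the
script really is interpreted by /bin/sh or /bin/ksh. The fallback values
for JOB_ID and SGE_TASK_ID are an assumption added only so the fragment
can be tried outside of SGE; inside a job SGE sets both variables itself.

```shell
#!/bin/sh
# Environment setup from mpich2_mpd.sh, written for sh/ksh instead of csh.

# Assumed fallbacks for a dry run outside SGE (not part of the original script).
# As far as I know, SGE reports "undefined" as the task id of non-array jobs.
JOB_ID=${JOB_ID:-0}
SGE_TASK_ID=${SGE_TASK_ID:-undefined}

MPICH2_ROOT=/usr/local/apps/sge/mpich2/install
PATH=$MPICH2_ROOT/bin:$PATH
# One mpd console per job and task, so concurrent jobs do not collide.
MPD_CON_EXT="sge_$JOB_ID.$SGE_TASK_ID"
export MPICH2_ROOT PATH MPD_CON_EXT

echo "MPD_CON_EXT=$MPD_CON_EXT"
```

If csh picks this file up instead, the plain assignments fail, which
matches the behavior described above.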

>
>>
>> The queue settings for the interpreter should only affect the
>> execution of the jobscript, not the prolog/epilog. Can you please
>> post your queue definition?
>
> Here's the qconf -sq output:
>
> [root at ken1 testprog]# qconf -sq all.q
> qname                 all.q
> hostlist              @allhosts
> seq_no                0
> load_thresholds       np_load_avg=1.75
> suspend_thresholds    NONE
> nsuspend              1
> suspend_interval      00:05:00
> priority              0
> min_cpu_interval      00:05:00
> processors            UNDEFINED
> qtype                 BATCH INTERACTIVE
> ckpt_list             NONE
> pe_list               make mpich2_mpd
> rerun                 FALSE
> slots                 4,[ken1=2],[ken2=2]
> tmpdir                /tmp
> shell                 /bin/csh

shell /bin/sh

> prolog                NONE
> epilog                NONE
> shell_start_mode      posix_compliant

shell_start_mode unix_behavior

will honor the first line of the script when choosing the shell.
Another option would be to submit all jobs with "-S /bin/ksh", or to set
"shell /bin/ksh" above and leave shell_start_mode at posix_compliant.
Please have a look at "man queue_conf" for the exact behavior of
these settings.
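
The three options can be sketched as commands. This is a dry run under
stated assumptions: the run helper and the DRY_RUN variable are made up
for illustration and only print the commands unless DRY_RUN=0 is set on a
host with SGE installed. The "qconf -mattr" form is assumed here as the
non-interactive equivalent of editing the queue with "qconf -mq all.q";
the qsub line is the one from the earlier mail with -S added. Pick one
option, not all three.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands by default.
# Set DRY_RUN=0 on a real SGE host to execute them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Option 1: let the queue honor the #! line of the job script.
run qconf -mattr queue shell_start_mode unix_behavior all.q

# Option 2: name the interpreter at submission time.
run qsub -S /bin/ksh -t 1-4:4 -l h_rt=18:00:00 -q all.q -pe mpich2_mpd 4 testjob.sh

# Option 3: keep posix_compliant, but change the queue's default shell.
run qconf -mattr queue shell /bin/ksh all.q
```

Afterwards "qconf -sq all.q" should show the changed shell or
shell_start_mode line.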

-- Reuti
>
