[GE users] mpich2 tight integration not working

kennethsdsc kenneth at sdsc.edu
Mon Mar 9 21:10:23 GMT 2009


> shell_start_mode unix_behavior

That works!  That's the behavior I expect from job scripts.

I have to say, your writeup on mpich2/sge tight integration
was one of the better pieces of SGE documentation I've seen.

Thanks,
Kenneth


On Mon, 9 Mar 2009, reuti wrote:

> Date: Mon, 9 Mar 2009 22:01:07 +0100
> From: reuti <reuti at staff.uni-marburg.de>
> Reply-To: users <users at gridengine.sunsource.net>
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] mpich2 tight integration not working
> 
> On 09.03.2009 at 20:56, kennethsdsc wrote:
>
>> On Mon, 9 Mar 2009, reuti wrote:
>>
>>> Date: Mon, 9 Mar 2009 20:34:56 +0100
>>> From: reuti <reuti at staff.uni-marburg.de>
>>> Reply-To: users <users at gridengine.sunsource.net>
>>> To: users at gridengine.sunsource.net
>>> Subject: Re: [GE users] mpich2 tight integration not working
>>>
>>> On 09.03.2009 at 19:46, kennethsdsc wrote:
>>>
>>>> A couple other issues:
>>>>
>>>> - I had to specify task count in my qsub line:
>>>> qsub -t 1-4:4 -l h_rt=18:00:00 -q all.q -pe mpich2_mpd 4 testjob.sh
>>>>
>>>> - I had to use SGE_TASK_ID, instead of TASK_ID in mpich2_mpd.sh:
>>>> #export MPICH2_ROOT=/usr/local/apps/sge/mpich2/install
>>>> #export PATH=$MPICH2_ROOT/bin:$PATH
>>>> #export MPD_CON_EXT="sge_$JOB_ID.$TASK_ID"
>>>> setenv MPICH2_ROOT /usr/local/apps/sge/mpich2/install
>>>> setenv PATH $MPICH2_ROOT/bin:$PATH
>>>> setenv MPD_CON_EXT "sge_$JOB_ID.$SGE_TASK_ID"
>>>>
>>>> It looks like SGE is using csh to execute the file, rather
>>>> than using #!/bin/ksh.  Not sure if that's a configuration issue on
>>>> my part?
>>>
>>>
>>> For the prolog/epilog it should just exec the specified binaries.
>>> Which platform are you on? Is /bin/bash available?
>>
>> Sorry, I was unclear.  The job script, mpich2_mpd.sh, gets executed
>> by csh, rather than its #!/bin/ksh line.  That makes the export lines
>> not work.
>
> The original was /bin/sh, but /bin/ksh should work as well.
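For reference, once the script really is run by a Bourne-type shell, the three setenv lines quoted above would look roughly like this. This is only a sketch: the fallback values for JOB_ID and SGE_TASK_ID are placeholders I added so the snippet also runs outside of SGE, and the install path is the one from my own setup.

```shell
#!/bin/sh
# Bourne-syntax equivalent of the setenv lines quoted above.
# JOB_ID and SGE_TASK_ID are normally set by SGE inside a running job;
# the fallbacks below are placeholders (assumptions) for use outside SGE.
JOB_ID=${JOB_ID:-0}
SGE_TASK_ID=${SGE_TASK_ID:-undefined}

# Site-specific install path, taken from the quoted script.
MPICH2_ROOT=/usr/local/apps/sge/mpich2/install
PATH=$MPICH2_ROOT/bin:$PATH
MPD_CON_EXT="sge_$JOB_ID.$SGE_TASK_ID"
export MPICH2_ROOT PATH MPD_CON_EXT

echo "MPD_CON_EXT=$MPD_CON_EXT"
```

Assigning first and exporting afterwards keeps it portable to plain /bin/sh as well as ksh.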
>
>>
>>>
>>> The queue settings for the interpreter should only affect the
>>> execution of the jobscript, not the prolog/epilog. Can you please
>>> post your queue definition?
>>
>> Here's the qconf -sq output:
>>
>> [root at ken1 testprog]# qconf -sq all.q
>> qname                 all.q
>> hostlist              @allhosts
>> seq_no                0
>> load_thresholds       np_load_avg=1.75
>> suspend_thresholds    NONE
>> nsuspend              1
>> suspend_interval      00:05:00
>> priority              0
>> min_cpu_interval      00:05:00
>> processors            UNDEFINED
>> qtype                 BATCH INTERACTIVE
>> ckpt_list             NONE
>> pe_list               make mpich2_mpd
>> rerun                 FALSE
>> slots                 4,[ken1=2],[ken2=2]
>> tmpdir                /tmp
>> shell                 /bin/csh
>
> shell /bin/sh
>
>> prolog                NONE
>> epilog                NONE
>> shell_start_mode      posix_compliant
>
> shell_start_mode unix_behavior
>
> will honor the first line of the script for the requested shell.
> Another option would be to submit all jobs with "-S /bin/ksh". Or set
> "shell /bin/ksh" above and leave the setting at posix_compliant.
> Please have a look at "man queue_conf" for the exact behavior of
> these settings.
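To make the difference between the two modes concrete, here is a toy model of the choice as I understand it from the explanation above. It is not SGE code: pick_interpreter is a hypothetical helper, it ignores "qsub -S" and any other shell_start_mode values, and "man queue_conf" remains the authority on the exact rules.

```shell
#!/bin/sh
# Toy model of how the interpreter for a job script gets chosen.
# pick_interpreter is a hypothetical helper, not part of SGE.
#   $1 = shell_start_mode, $2 = queue "shell" setting, $3 = job script
pick_interpreter() {
    mode=$1
    queue_shell=$2
    script=$3
    if [ "$mode" = "unix_behavior" ]; then
        # Read the first line of the script; honor it if it is a "#!" line.
        first_line=
        IFS= read -r first_line < "$script" || true
        case $first_line in
            '#!'*)
                interp=${first_line#'#!'}
                # Word-split to drop leading blanks and interpreter arguments.
                set -- $interp
                echo "$1"
                return 0
                ;;
        esac
    fi
    # posix_compliant (or no "#!" line): the queue's shell setting is used.
    echo "$queue_shell"
}

# Small demonstration with a throwaway script that asks for ksh.
demo=$(mktemp)
printf '#!/bin/ksh\necho hello\n' > "$demo"
pick_interpreter posix_compliant /bin/csh "$demo"   # prints /bin/csh
pick_interpreter unix_behavior /bin/csh "$demo"     # prints /bin/ksh
rm -f "$demo"
```

With "shell /bin/csh" and posix_compliant, the model picks csh regardless of the first line, which matches what I saw with mpich2_mpd.sh. The real setting can be edited with "qconf -mq all.q".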
>
> -- Reuti
>>
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=125853
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=125857



