[GE users] mpich2 tight integration not working

reuti reuti at staff.uni-marburg.de
Mon Mar 9 22:01:30 GMT 2009


On 09.03.2009 at 22:10, kennethsdsc wrote:

>> shell_start_mode unix_behavior
>
> That works!  That's the behavior I expect from job scripts.
>
> I have to say, your writeup on mpich2/sge tight integration
> was one of the better pieces of SGE documentation I've seen.

Thanks. I will put the corrected archive online tomorrow (CET). I have
also just entered an issue, because for the pseudo variables accepted by
the -o and -e options of qsub it is indeed $TASK_ID, without the SGE_
prefix.
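
To make the naming difference concrete, here is a minimal sketch (not
taken from the archive): inside a running job the task index is read
from the environment variable SGE_TASK_ID, while $TASK_ID is only the
pseudo variable that qsub expands in -o/-e paths. The helper name
mpd_con_ext and the fallback values are made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: build the MPD console extension from the job's
# environment. JOB_ID and SGE_TASK_ID are exported to the job by SGE.
# The fallbacks (0 / undefined) are assumptions of this sketch and only
# apply when the variables are unset, e.g. outside of a job.
mpd_con_ext() {
    echo "sge_${JOB_ID:-0}.${SGE_TASK_ID:-undefined}"
}

MPD_CON_EXT=$(mpd_con_ext)
export MPD_CON_EXT
echo "$MPD_CON_EXT"
```

On the submission side the pseudo variable would be used as in
qsub -t 1-4 -o 'out.$JOB_ID.$TASK_ID' testjob.sh, with single quotes so
that the submitting shell leaves the expansion to qsub.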

-- Reuti


> Thanks,
> Kenneth
>
>
> On Mon, 9 Mar 2009, reuti wrote:
>
>> Date: Mon, 9 Mar 2009 22:01:07 +0100
>> From: reuti <reuti at staff.uni-marburg.de>
>> Reply-To: users <users at gridengine.sunsource.net>
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] mpich2 tight integration not working
>>
>> On 09.03.2009 at 20:56, kennethsdsc wrote:
>>
>>> On Mon, 9 Mar 2009, reuti wrote:
>>>
>>>> Date: Mon, 9 Mar 2009 20:34:56 +0100
>>>> From: reuti <reuti at staff.uni-marburg.de>
>>>> Reply-To: users <users at gridengine.sunsource.net>
>>>> To: users at gridengine.sunsource.net
>>>> Subject: Re: [GE users] mpich2 tight integration not working
>>>>
>>>> On 09.03.2009 at 19:46, kennethsdsc wrote:
>>>>
>>>>> A couple other issues:
>>>>>
>>>>> - I had to specify task count in my qsub line:
>>>>> qsub -t 1-4:4 -l h_rt=18:00:00 -q all.q -pe mpich2_mpd 4 testjob.sh
>>>>>
>>>>> - I had to use SGE_TASK_ID, instead of TASK_ID in mpich2_mpd.sh:
>>>>> #export MPICH2_ROOT=/usr/local/apps/sge/mpich2/install
>>>>> #export PATH=$MPICH2_ROOT/bin:$PATH
>>>>> #export MPD_CON_EXT="sge_$JOB_ID.$TASK_ID"
>>>>> setenv MPICH2_ROOT /usr/local/apps/sge/mpich2/install
>>>>> setenv PATH $MPICH2_ROOT/bin:$PATH
>>>>> setenv MPD_CON_EXT "sge_$JOB_ID.$SGE_TASK_ID"
>>>>>
>>>>> It looks like SGE is using csh to execute the file, rather
>>>>> than using #!/bin/ksh.  Not sure if that's a configuration
>>>>> issue on my part?
>>>>
>>>>
>>>> For the prolog/epilog it should just exec the specified binaries.
>>>> Which platform are you on? Is /bin/bash available?
>>>
>>> Sorry, I was unclear.  The job script, mpich2_mpd.sh, gets executed
>>> by csh, rather than its #!/bin/ksh line.  That makes the export
>>> lines not work.
>>
>> The original was /bin/sh, but /bin/ksh should work as well.
>>
>>>
>>>>
>>>> The queue settings for the interpreter should only affect the
>>>> execution of the jobscript, not the prolog/epilog. Can you please
>>>> post your queue definition?
>>>
>>> Here's the qconf -sq output:
>>>
>>> [root at ken1 testprog]# qconf -sq all.q
>>> qname                 all.q
>>> hostlist              @allhosts
>>> seq_no                0
>>> load_thresholds       np_load_avg=1.75
>>> suspend_thresholds    NONE
>>> nsuspend              1
>>> suspend_interval      00:05:00
>>> priority              0
>>> min_cpu_interval      00:05:00
>>> processors            UNDEFINED
>>> qtype                 BATCH INTERACTIVE
>>> ckpt_list             NONE
>>> pe_list               make mpich2_mpd
>>> rerun                 FALSE
>>> slots                 4,[ken1=2],[ken2=2]
>>> tmpdir                /tmp
>>> shell                 /bin/csh
>>
>> shell /bin/sh
>>
>>> prolog                NONE
>>> epilog                NONE
>>> shell_start_mode      posix_compliant
>>
>> shell_start_mode unix_behavior
>>
>> will honor the first line of the script for the requested shell.
>> Another option would be to submit all jobs with "-S /bin/ksh". Or set
>> "shell /bin/ksh" above and leave the posix_compliant setting in place.
>> Please have a look at "man queue_conf" for the exact behavior of
>> these settings.
>>
>> -- Reuti
>>>
>>
>
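
The three options discussed in the quoted thread (unix_behavior,
"-S /bin/ksh", or changing the queue's shell) can be compared with a
small model. This is only a sketch of the rules as described in
queue_conf, not SGE code; the function name pick_interpreter and its
argument order are invented here, and the fallback for a script without
a "#!" line is an assumption.

```sh
#!/bin/sh
# Simplified model of which interpreter starts a job script.
# usage: pick_interpreter MODE SCRIPT QUEUE_SHELL [DASH_S_SHELL]
pick_interpreter() {
    mode=$1
    script=$2
    queue_shell=$3
    dash_s=${4:-}
    case $mode in
        unix_behavior)
            # Use the interpreter named on the script's "#!" line.
            interp=$(sed -n '1s/^#![[:space:]]*\([^[:space:]]*\).*/\1/p' "$script")
            # Assumption of this sketch: fall back to the queue shell
            # when the script has no "#!" line.
            echo "${interp:-$queue_shell}"
            ;;
        posix_compliant)
            # The "#!" line is ignored: -S wins, else the queue's shell.
            echo "${dash_s:-$queue_shell}"
            ;;
        *)
            echo "unsupported mode: $mode" >&2
            return 1
            ;;
    esac
}

# Example: a ksh job script in a queue whose shell is /bin/csh.
tmp=$(mktemp)
printf '#!/bin/ksh\necho hello\n' > "$tmp"
pick_interpreter unix_behavior   "$tmp" /bin/csh
pick_interpreter posix_compliant "$tmp" /bin/csh
pick_interpreter posix_compliant "$tmp" /bin/csh /bin/ksh
rm -f "$tmp"
```

In the real setup the queue values would be changed with
"qconf -mq all.q", and the per-job override is "qsub -S /bin/ksh".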

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=125918

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
