[GE users] MPI Job Execution

rajesh britto britto.gridlab at gmail.com
Sat Sep 27 08:01:51 BST 2008



Hi,

After exporting the variables I still get the same error in my job output
file. When I check the error file, it is empty.

[sgeadmin@slaserver ~]$ cat cpi.o30

Got 2 slots.

Cannot read /tmp/30.1.all.q/machines.
Looked for files with extension LINUX in
directory /usr/local/mpich-1.2.6/util/machines .

---RB
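
The "Cannot read /tmp/30.1.all.q/machines" message says that the file passed to -machinefile does not exist in the job's $TMPDIR. With the MPICH integration scripts under $SGE_ROOT/mpi/, that file is normally written by the parallel environment's start procedure, so a missing file more likely points at the PE setup than at MPICH itself. A minimal sketch of a check that could sit in the job script ahead of mpirun follows; the helper name check_machinefile is hypothetical and not part of SGE or MPICH.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with SGE or MPICH):
# verify that the machine file exists before mpirun is started.
check_machinefile() {
    mf="$1"
    if [ ! -r "$mf" ]; then
        echo "machine file $mf is missing or unreadable" >&2
        return 1
    fi
    # Count the non-empty lines, one per host entry.
    n=$(grep -c . "$mf" || true)
    echo "machine file $mf lists $n host entries"
}
```

In the job script this could be called as check_machinefile "$TMPDIR/machines" || exit 1 just before the mpirun line, so that the job stops with a clear message when the PE start procedure did not create the file.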

On Fri, Sep 26, 2008 at 3:53 PM, Reuti <reuti at staff.uni-marburg.de> wrote:

> On 26.09.2008 at 11:55, rajesh britto wrote:
>
>> Thanks, Chris and Reuti, for your information. Now I am able to run an
>> MPI program. My MPI job was submitted successfully, but when I check my
>> job output it shows the following error.
>>
>> [sgeadmin@slaserver ~]$ cat mpi.o126
>> Warning: no access to tty (Bad file descriptor).
>> Thus no job control in this shell.
>>
> This is the output from the c shell. If you prefer the shell mentioned in
> the jobscript, the queue definition has to be changed to read
> "shell_start_mode      unix_behavior".
>
> http://gridengine.sunsource.net/howto/commonproblems.html
>
>> Got 2 slots.
>> Cannot read /usr/local/mpich-1.2.6/util/machines/machine.LINUX.
>> Looked for files with extension LINUX in
>> directory /usr/local/mpich-1.2.6/util/machines .
>>
> #!/bin/sh
> echo "Got $NSLOTS slots."
> export MPICH_PROCESS_GROUP=no
> export P4_RSHCOMMAND=rsh
> mpirun -np $NSLOTS -machinefile $TMPDIR/machines ...
>
> -- Reuti
>
>> regards,
>>
>> RB
>>
>> On Fri, Sep 26, 2008 at 3:15 PM, Reuti <reuti at staff.uni-marburg.de>
>> wrote:
>> Hi,
>>
>> On 26.09.2008 at 07:03, rajesh britto wrote:
>>
>>
>> Hi,
>>
>>  Thanks for the information. I have set up an MPICH installation. Can
>> anyone tell me how to submit an MPI job using MPICH and SGE?
>>
>> What do you mean in detail: have you set up a parallel environment already,
>> or just a plain MPICH(1) installation? As Chris pointed out, there are
>> documents in $SGE_ROOT/mpi/ to get started.
>>
>> You will find additional hints for a Tight Integration here:
>>
>> http://gridengine.sunsource.net/howto/mpich-integration.html
>>
>> -- Reuti
>>
>>
>>
>> On Thu, Sep 25, 2008 at 7:59 PM, Chris Dagdigian <dag at sonsorol.org>
>> wrote:
>> Hello,
>>
>> This is what I'd recommend:
>>
>> (1) Determine what sort of MPI environment you have or need to install
>> (there are many implementations of the MPI standard)
>> (1.5) If you don't know what MPI to start with, start with OpenMPI from
>> open-mpi.org as that works beautifully with SGE in tight integration mode
>> (2) Set up and install MPI
>> (3) Compile the example cpi.c program using mpicc
>> (4) Run your MPI job outside of Grid Engine using passwordless SSH to the
>> nodes
>>
>> The basic idea here is that integrating MPI with Grid Engine is far, far
>> easier if you are first able to validate for yourself that MPI works on its
>> own. I've seen many "SGE can't handle MPI" trouble tickets where the actual
>> problem was with the MPI install and not Grid Engine.
>>
>> Once you have MPI working outside of SGE and ideally with a real world
>> application then you can move on to SGE work ...
>>
>> Resources:
>> (1) example scripts can be found in $SGE_ROOT/mpi/
>> (2) The documentation on wikis.sun.com for Grid Engine covers parallel
>> environment (PE) setup well
>> (3) A few google searches on "tight mpi integration with SGE" or similar
>> will show you other HOWTO methods
>>
>> If you use OpenMPI and compile it with the "--with-sge" option then
>> OpenMPI will automatically detect that it is running under Grid Engine and
>> will do the right thing. In my opinion this is currently the fastest and
>> easiest way to get a working tight MPI integration into SGE.
>>
>> For the difference between "tight" and "loose" PE integration and why
>> tight is better if you can achieve it, this link may help:
>>
>> http://gridengine.info/2005/09/19/parallel-environments-pes-loose-vs-tight-integration
>>
>> -Chris
>>
>>
>>
>>
>> On Sep 25, 2008, at 9:18 AM, rajesh britto wrote:
>>
>> Hi,
>>  I have installed SGE 6.2 on a cluster and it works fine for sequential
>> jobs. When I submit an MPI job, it is accepted but stays in the
>> waiting state (qw).
>>  I need to know how to set up a parallel environment to run an MPI job. If
>> anyone has any documentation, it would be useful.
>>  Thanks in advance.
>>
>> with regards,
>> RB
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>> For additional commands, e-mail: users-help at gridengine.sunsource.net

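For the parallel environment setup asked about at the start of the thread, the sketch below writes a PE definition to a file. It is an illustration, not a tested configuration: the field list is modelled on the MPICH template shipped under $SGE_ROOT/mpi/, the PE name "mpich" and the helper name write_mpich_pe are assumptions, and a given SGE version may expect additional fields (comparing with the output of qconf -sp for an existing PE is the safest check).

```shell
#!/bin/sh
# Hypothetical helper: write an MPICH parallel environment definition.
# Arguments: <sge_root> <slots> <output_file>
write_mpich_pe() {
    sge_root="$1"
    slots="$2"
    out="$3"
    # \$pe_hostfile and \$fill_up are SGE keywords and must reach the file
    # unexpanded, hence the escaped dollar signs.
    cat > "$out" <<EOF
pe_name            mpich
slots              $slots
user_lists         NONE
xuser_lists        NONE
start_proc_args    $sge_root/mpi/startmpi.sh -catch_rsh \$pe_hostfile
stop_proc_args     $sge_root/mpi/stopmpi.sh
allocation_rule    \$fill_up
control_slaves     TRUE
job_is_first_task  FALSE
EOF
}
```

A file written this way could then be registered with qconf -Ap <file> and attached to a queue with qconf -aattr queue pe_list mpich all.q, after which a job would request it with qsub -pe mpich 2. The start_proc_args line is the part that creates $TMPDIR/machines for the job.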

