[GE users] mpi problems

Reuti reuti at staff.uni-marburg.de
Wed Apr 30 17:35:57 BST 2008


Hi,

On 30.04.2008 at 17:53, Roberta Gigon wrote:

> Very strange happenings here indeed.
>
> I made the changes you suggested, and now the job will run if I set
> -pe mpi 2 and -np 2, but it fails on any more than 2 nodes. If I run
> the same job independent of SGE, it still runs fine regardless of
> the -np setting.
>
> We have the mpd master running all the time on the cluster head  
> node and the mpd slaves running all the time on the nodes in mpi.q.

My guess is that, as the "initial" mpd daemon is running on the head
node of the cluster, the slave daemons simply don't know how to
contact one another. The master task of your parallel job might
always need to run on the head node, so that it can instruct the mpd
daemon on that node where to start the slave processes.

You could try this, but requesting one dedicated master node plus n
slave nodes is not possible in SGE for now.
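
As a first check (a sketch only, assuming MPICH2's mpd tools are in
the PATH on the head node), mpdtrace shows which hosts are actually
members of the ring that the console mpd knows about:

    mpdtrace -l   # list all ring members with full hostnames and ports

If the execution hosts don't show up in that list, mpirun has no way
to place processes on them.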

-- Reuti


> Puzzled,
> Roberta
>
> ---------------------------------------------------------------------
> Roberta M. Gigon
> Schlumberger-Doll Research
> One Hampshire Street, MD-B253
> Cambridge, MA 02139
> 617.768.2099 - phone
> 617.768.2381 - fax
>
> This message is considered Schlumberger CONFIDENTIAL.  Please treat  
> the information contained herein accordingly.
>
> -----Original Message-----
> From: Reuti [mailto:reuti at staff.uni-marburg.de]
> Sent: Monday, April 28, 2008 6:17 PM
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] mpi problems
>
> Hi Roberta,
>
> On 28.04.2008 at 19:16, Roberta Gigon wrote:
>
>> I have also tried using -machinefile $TMPDIR/machines in the script
>> file and get the same result.
>
> For the mpd method you would need to modify the
> PeHostfile2MachineFile subroutine in startmpi.sh:
>
>     # convert each $pe_hostfile line ("host slots queue range")
>     # into the host:slots format expected in an mpd hosts file
>     while read line; do
>        host=`echo $line | cut -f1 -d" " | cut -f1 -d"."`   # short hostname
>        nslots=`echo $line | cut -f2 -d" "`                 # slots on that host
>        echo $host:$nslots
>     done < $1
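>
> For illustration (hypothetical values in the style of your cluster),
> a $pe_hostfile line such as:
>
>     bear72.cl.slb.com 2 mpi.q@bear72.cl.slb.com UNDEFINED
>
> would then come out as:
>
>     bear72:2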
>
>> We have been using the mpd method.
>
> There is no tight integration for the mpd method (which is why it's
> not in the Howto), as the daemons would always start without being
> controlled by SGE. But even if you want only a loose integration:
> how does this work with PBS? Are you starting a ring of mpds per
> job, as the MPICH2 manual suggests on page 18? (And how would that
> work with two different jobs on one node? mpdallexit would also shut
> down the daemons of the other job, as it takes no arguments AFAICS.)
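>
> Even so, a per-job ring in a loose integration could look roughly
> like this in the job script (sh-syntax sketch; MPD_CON_EXT is the
> mpd console name extension, which should keep mpdallexit from
> touching another job's ring - your_app is a placeholder):
>
>     export MPD_CON_EXT=$JOB_ID               # one mpd console per job
>     mpdboot -n $NHOSTS -f $TMPDIR/machines   # one mpd per granted host
>     mpirun -np $NSLOTS your_app
>     mpdallexit                               # stops only this job's ring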
>
>> The master is on the head node of the cluster and the nodes are all
>> mpd "slaves".
>
> Or do you have the ring simply always active across the complete
> cluster?
>
>>   I didn't see instructions for the mpd method in the how-to.  The
>> program we are using with MPICH-2 is MCNP from Los Alamos National
>> Labs; I'm not sure if it works with any of the other methods.
>
> MCNP is not publicly available. Do you have the source, so that you
> could recompile it with a different mpicc or the like?
>
> -- Reuti
>
>
>>
>> Thanks!
>> Roberta
>>
>> P.S.  Regarding the bear72/bear75 confusion... I cut and pasted the
>> wrong error file entry... in reality, it is consistent.
>>
>> ---------------------------------------------------------------------
>> Roberta M. Gigon
>> Schlumberger-Doll Research
>> One Hampshire Street, MD-B253
>> Cambridge, MA 02139
>> 617.768.2099 - phone
>> 617.768.2381 - fax
>>
>> This message is considered Schlumberger CONFIDENTIAL.  Please treat
>> the information contained herein accordingly.
>>
>>
>> -----Original Message-----
>> From: Reuti [mailto:reuti at staff.uni-marburg.de]
>> Sent: Monday, April 28, 2008 12:07 PM
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] mpi problems
>>
>> Hi,
>>
>> On 28.04.2008 at 17:23, Roberta Gigon wrote:
>>
>>> I'm having a few issues with getting MPICH-2  to work under SGE. I
>>> have an mpi job that works just fine with PBS and outside of SGE,
>>> so I'm pretty confident in saying that MPI itself is working.
>>
>> The included $SGE_ROOT/mpi setup is only for MPICH(1). There is a
>> Howto for MPICH2:
>>
>> http://gridengine.sunsource.net/howto/mpich2-integration/mpich2-integration.html
>>
>> Just note that MPICH2 can be compiled in at least four different
>> ways, and your application must be compiled against the matching
>> variant and then use the appropriate mpirun and SGE PE. Which type
>> of startup do you want to use?
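>>
>> For example, the startup type is fixed when MPICH2 itself is
>> configured (just two of the possibilities, assuming a build from
>> source):
>>
>>     ./configure --with-pm=mpd       # the mpd ring startup discussed here
>>     ./configure --with-pm=gforker   # fork-based, single host only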
>>
>> Anyway: you have no -machinefile or similar in your mpirun call,
>> hence all processes will run on the local host. And how does it get
>> from bear72 to bear75 - do you have a predefined mpd.hosts which
>> could trigger this?
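>>
>> With the machine file that startmpi.sh writes, the call could look
>> like this (a sketch; $NSLOTS is set by SGE to the number of granted
>> slots, so the process count stays in sync with the -pe request):
>>
>>     /usr/local/mpich2-1.0.4p1-pgi-k8-64/bin/mpirun \
>>         -machinefile $TMPDIR/machines -np $NSLOTS \
>>         /people8/tzhou/mcnprun/SUN/bin/mcnp 5j.mpi i=sbt034 wwinp=sbwwmx05 eol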
>>
>> -- Reuti
>>
>> PS: Please try the latest MPICH2, 1.0.7 (although your 1.0.4p1
>> should be fine); at least 1.0.6p1 is broken.
>>
>>
>>>
>>> Some background:
>>> I have a pe called mpi with these characteristics:
>>>
>>> [root at bear ~]$ qconf -sp mpi
>>> pe_name           mpi
>>> slots             999
>>> user_lists        NONE
>>> xuser_lists       NONE
>>> start_proc_args   /opt/sge/mpi/startmpi.sh -catch_rsh $pe_hostfile
>>> stop_proc_args    /opt/sge/mpi/stopmpi.sh
>>> allocation_rule   $round_robin
>>> control_slaves    FALSE
>>> job_is_first_task TRUE
>>> urgency_slots     min
>>>
>>> I have a queue called mpi.q with 6 dual-processor nodes (12 slots).
>>>
>>> I submit the job like this:  qsub -q mpi.q -pe mpi 6 -cwd ./sbt034.csh
>>>
>>> sbt034.csh:
>>> #! /bin/tcsh
>>>
>>> #$ -q mpi.q
>>> #$ -j y
>>> #$ -o testSGE2.out
>>> #$ -N testSGE2
>>> #$ -cwd
>>> #$ -pe mpi 6
>>>
>>> echo running...
>>> echo $TMPDIR
>>> /usr/local/mpich2-1.0.4p1-pgi-k8-64/bin/mpirun -np 6 /people8/tzhou/mcnprun/SUN/bin/mcnp 5j.mpi i=sbt034 wwinp=sbwwmx05 eol
>>> echo done!
>>>
>>> qstat says:
>>>
>>> tzhou@bear[162] qstat
>>> job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
>>> -----------------------------------------------------------------------------------------------------------------
>>>    6862 0.56000 testSGE2   tzhou        r     04/28/2008 10:52:48 mpi.q@bear72.cl.slb.com            6
>>>
>>> error file says:
>>> master starting       5 tasks with       1 threads each  **/**/08 **:**:10
>>>  master sending static commons...
>>>  master sending dynamic commons...
>>>  master sending cross section data...
>>> PGFIO/stdio: No such file or directory
>>> PGFIO-F-/OPEN/unit=32/error code returned by host stdio - 2.
>>>  In source file msgtsk.f90, at line number 116
>>> PGFIO/stdio: No such file or directory
>>> PGFIO-F-/OPEN/unit=32/error code returned by host stdio - 2.
>>>  In source file msgtsk.f90, at line number 116
>>> rank 4 in job 4  bear75.cl.slb.com_47485   caused collective abort of all ranks
>>>   exit status of rank 4: killed by signal 9
>>> done!
>>>
>>> The $TMPDIR gets set properly...
>>>
>>> Any thoughts on what might be happening here?
>>>
>>> Many thanks,
>>> Roberta
>>>
>>> ---------------------------------------------------------------------
>>> Roberta M. Gigon
>>> Schlumberger-Doll Research
>>> One Hampshire Street, MD-B253
>>> Cambridge, MA 02139
>>> 617.768.2099 - phone
>>> 617.768.2381 - fax
>>>
>>> This message is considered Schlumberger CONFIDENTIAL.  Please treat
>>> the information contained herein accordingly.
>>>
>>
>>
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net




More information about the gridengine-users mailing list