[GE users] SGE / MPI: how to target master process to a specific exec host

reuti reuti at staff.uni-marburg.de
Mon Feb 16 13:48:05 GMT 2009


On 15.02.2009 at 03:12, nicoudem wrote:

> I have an asymmetric cluster, which has:
> - 1 exec_host (let's call it node0), itself a shared-memory
> multiprocessor with 128 GB RAM
> - 16 typical blades making up the rest of the execution hosts.
> I would like to submit parallel jobs using Open MPI in which the
> MASTER process runs on node0 and all other processes are
> distributed among the other 16 nodes in a round-robin manner.
> How can I do that?
> I have tried a "Multiple Program Multiple Data" (MPMD) scheme in the script:
>> cat asym.sh
> mpi -np 1 --host node0 master.bin : -np 16 slave.bin
> which I then submitted to SGE:
>> qsub -pe make 17 -cwd asym.sh
> SGE understands the MPMD structure of the script and executes
> everything all right, except that it does NOT abide by the "--host
> node0" request and sends the master process wherever it wants.

Yes, SGE knows nothing about what's written in the jobscript, so it
cannot act on the "--host node0" option. The jobscript simply runs on
one of the granted nodes, not necessarily one of a specific type.

But you can request different queues for both types:

$ qsub -masterq "*@node0" -q "*@@myslaves" -pe ompi 17 test.sh

(Which version of SGE are you using? The -masterq option is not
available in all versions.)
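
To check from inside the job where the jobscript actually landed, one
option is to look at $PE_HOSTFILE. What follows is only a minimal
sketch: it assumes the usual host file layout (host, slots, queue,
processor range on each line) and that the first line names the master
host, which is how SGE normally writes the file as far as I know. The
function names pe_master_host and pe_slots_excluding are made up for
illustration and are not part of SGE.

```shell
#!/bin/sh
# Hypothetical helper: print the host listed first in a PE host file.
# Assumed line format: <host> <slots> <queue> <processor range>
pe_master_host() {
    awk 'NR == 1 { print $1 }' "$1"
}

# Hypothetical helper: sum the slots granted on every host except "$2".
pe_slots_excluding() {
    awk -v skip="$2" '$1 != skip { n += $2 } END { print n + 0 }' "$1"
}

# SGE exports PE_HOSTFILE inside a parallel job; outside a job this is skipped.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    master=$(pe_master_host "$PE_HOSTFILE")
    echo "jobscript runs on: $(hostname), first PE host: $master"
    echo "slots on the other hosts: $(pe_slots_excluding "$PE_HOSTFILE" "$master")"
fi
```

Placed at the top of the jobscript, this prints the host the script
runs on next to the first host in the granted list, so a job submitted
with -masterq "*@node0" should show node0 for both if the request was
honoured.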

-- Reuti

> any suggestions about how I can enforce that?
> more generally, how can I
> thanks
> nic
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=106085
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

