[GE users] contiguous blocks of nodes

Reuti reuti at staff.uni-marburg.de
Wed Nov 9 16:04:37 GMT 2005


Sacerdoti, Federico wrote:
> Is it possible to request an allocation of contiguous blocks of nodes
> (sorted hostname, IP addr, or seqno) for a parallel job? I have an MPI
> cluster where network performance is greatly affected by proximity.
> Hostnames are 
> foo1
> foo2
> fooN
> and adjacent hostnames are most likely attached to the same infiniband
> switch. I would like to make requests such as
> qsub -pe mpi 64 -l contiguous-by-hostname  
> I have 'queue_sort_method seqno' but that does not seem to achieve what
> I am looking for.

To get this behavior you will need:

- one queue for each switch with all the nodes attached to the switch
- one PE for each switch, named e.g. mpi01, mpi02, ...
- attach the first PE to the first queue, and so on ...
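
As a rough sketch of the setup above (not a tested recipe): the script below writes one PE definition file per switch and prints the qconf commands that would load each PE and attach it to its queue. The queue names (switch01.q, ...), the switch count, the slot count and the `$fill_up` allocation rule are assumptions made for illustration, and the queues are assumed to exist already. The PE field list differs between SGE versions, so compare it with the output of `qconf -sp` for an existing PE before loading anything.

```shell
#!/bin/sh
# Sketch: generate one PE definition per switch (mpi01, mpi02, ...).
# All values below are hypothetical; adjust them to the real cluster.
NUM_SWITCHES=3          # assumed number of switches
SLOTS_PER_SWITCH=64     # assumed total slots behind one switch
OUT_DIR=$(mktemp -d)    # scratch directory for the generated files

i=1
while [ "$i" -le "$NUM_SWITCHES" ]; do
    id=$(printf '%02d' "$i")
    pe="mpi$id"
    queue="switch$id.q"   # hypothetical queue holding that switch's nodes

    # PE definition; \$fill_up is escaped so it is written literally.
    cat > "$OUT_DIR/$pe.conf" <<EOF
pe_name           $pe
slots             $SLOTS_PER_SWITCH
user_lists        NONE
xuser_lists       NONE
start_proc_args   /bin/true
stop_proc_args    /bin/true
allocation_rule   \$fill_up
control_slaves    FALSE
job_is_first_task TRUE
urgency_slots     min
EOF

    # Only print the commands; review them, then run them by hand.
    echo "qconf -Ap $OUT_DIR/$pe.conf"
    echo "qconf -aattr queue pe_list $pe $queue"
    i=$((i + 1))
done
```

Printing the commands instead of running them keeps the sketch harmless on a live cluster; each PE ends up in the pe_list of exactly one queue.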

Then you can submit your jobs with a wildcard (quoted, so that the shell 
does not expand it) like:

qsub -pe "mpi*" 16 script.sh

SGE will then select one of the matching PEs. With the selected PE, the job 
can only use slots of the queue that PE is attached to, so you always get 
only nodes which are attached to one switch.
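
To check from inside a job which PE was picked and where the slots landed, something like the following could go at the top of script.sh. It is only a sketch: it relies on the PE and PE_HOSTFILE variables that SGE sets for parallel jobs, and on the hostfile lines having the form "host slots queue processor_range". The function name summarise_allocation is made up for this example.

```shell
#!/bin/sh
# Sketch: summarise the allocation described by an SGE PE hostfile.
# Each line is expected to look like: host slots queue@host processor_range
summarise_allocation() {
    awk '{
             hosts[$1] = 1                 # distinct hosts
             total += $2                   # granted slots
             split($3, parts, "@")         # strip "@host" from the queue instance
             queues[parts[1]] = 1          # distinct cluster queues
         }
         END {
             nh = 0; for (h in hosts) nh++
             nq = 0; for (c in queues) nq++
             printf "hosts=%d slots=%d queues=%d\n", nh, total, nq
         }' "$1"
}

# Only meaningful inside a running parallel job.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    echo "selected PE: ${PE:-unknown}"
    summarise_allocation "$PE_HOSTFILE"
fi
```

With one PE per queue as described, the summary should report queues=1; a larger number would suggest a PE is attached to more than one queue.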

There are two RFEs to get this done with only one PE and one queue, 
by using hostgroups, in a future version of SGE.

Cheers - Reuti
