[GE users] Resource allocation strategies
dag at sonsorol.org
Fri Dec 12 12:14:49 GMT 2008
Short answers inline ...
On Dec 12, 2008, at 6:43 AM, Frank Olaf Sem-Jacobsen wrote:
> Dear community,
> I have spent a few hours going through various documents and the
> list archive to figure out an answer to the following question.
> How are the individual processors/nodes selected for allocation to a
> specific job? I know you can specify that you want certain types of
> architectures, a specific amount of memory, and so forth, but given
> there are a number of nodes that satisfy these requirements, how is
> or more specifically chosen
SGE works to find the best possible remote host to run your job on.
When more than one host meets the criteria, the default behavior is to
sort the list of eligible hosts by load average, so the end result in
default mode is that "SGE will run your job on the least busy of the
most suitable machines".
You can override that final sort-on-load step manually by assigning nodes
an integer-based "sequence number". The final subsort is then done on
your custom integers. This is one way to influence node selection.
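A rough sketch of the sequence-number approach, assuming a cluster queue named all.q and two hosts called node01 and node02 (all hypothetical names). seq_no is the queue attribute and queue_sort_method the scheduler setting involved; the sge() wrapper is just a helper for this example that prints the commands instead of running them when qconf is not on the PATH.

```shell
#!/bin/sh
# Sketch: give hosts explicit sequence numbers so the scheduler can sort
# on them instead of on load. QUEUE and the host names are hypothetical.
QUEUE="all.q"

# Run the command if qconf is installed, otherwise only print it (dry run).
sge() {
    if command -v qconf >/dev/null 2>&1; then
        "$@"
    else
        echo "dry-run: $*"
    fi
}

# Lower seq_no values are tried first once queue_sort_method is "seqno".
seqno_plan=$(
    n=10
    for host in node01 node02; do
        sge qconf -mattr queue seq_no "$n" "$QUEUE@$host"
        n=$((n + 10))
    done
)
echo "$seqno_plan"

# Show the current sort method. To change it, edit the scheduler
# configuration with "qconf -msconf" and set queue_sort_method to seqno
# (the default is load).
sge qconf -ssconf | grep queue_sort_method || true
```

The sequence numbers only take effect after queue_sort_method has been switched to seqno; with the default setting the scheduler keeps sorting on load.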
> The reason for asking is that I would like to exploit locality in the
> processor allocation. This means that if I need 10 processors/nodes I
> would like them to be physically close to each other (with a short
> between them through the network). For instance, in a fat tree or a
> topology it is clearly defined which are the nearest nodes to each other.
> Is this in any way supported, or are available nodes chosen more or less
> at random from a list?
Topology aware scheduling is possible using wildcard selectors on
parallel environments or hostgroups. It goes something like this:
Assume you have multiple racks of servers; each server is connected to
an in-cabinet aggregation switch.
You want to keep your parallel jobs within a single cabinet because
that means the application traffic never needs to leave the in-cabinet
switch.
This is done by:
(a) Creating one SGE parallel environment per topology unit
(MPICH1, MPICH2, MPICH3, etc.)
(b) Submitting the job using a wildcard selector:
$ qsub -pe "MPICH*" 32 ./my-parallel-application
It's a bit cleaner to do this with PEs because you can further control
the dispersal of tasks across hosts.
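As a rough illustration of the per-rack PE setup, the script below writes one PE definition file per rack and loads it with "qconf -Ap" when qconf is available. The rack count, slot count and start/stop arguments are assumptions for the example, and the exact field list differs between SGE versions, so compare against the output of "qconf -sp" for an existing PE before using it.

```shell
#!/bin/sh
# Sketch: write one PE definition per rack, then load each with "qconf -Ap".
# Rack count, slot count and file locations are hypothetical.
PE_DIR=$(mktemp -d)
RACKS="1 2 3"
SLOTS_PER_RACK=64

for r in $RACKS; do
    cat > "$PE_DIR/MPICH$r.pe" <<EOF
pe_name            MPICH$r
slots              $SLOTS_PER_RACK
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    \$fill_up
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
EOF
    # Only register the PE when qconf is actually installed.
    if command -v qconf >/dev/null 2>&1; then
        qconf -Ap "$PE_DIR/MPICH$r.pe"
    fi
done

# Each PE must be offered only by the hosts of its own rack, for example
# through a queue pe_list along the lines of (hostgroup names assumed):
#   NONE,[@RACK1=MPICH1],[@RACK2=MPICH2],[@RACK3=MPICH3]
ls "$PE_DIR"
```

The allocation_rule is where the dispersal of tasks gets controlled: $fill_up packs slots onto as few hosts as possible, $round_robin spreads them across hosts, and a fixed integer places that many tasks on each host.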
You can also do it with physical hostgroups:
(a) Make hostgroups named by topology: RACK1, RACK2, RACK3, etc.
(b) Submit to a particular hostgroup set:
$ qsub -q "all.q@@RACK*"
I believe the above example would preferentially pack your job within
a cabinet (it's early AM here and I'm still thinking fuzzy ...)
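For the hostgroup variant, something like the following could create the per-rack groups. The host names and rack layout are made up for the example, and write_hostgroup is a helper defined here, not an SGE command; "qconf -Ahgrp" loads a hostgroup from a file and is only called when qconf is present.

```shell
#!/bin/sh
# Sketch: one hostgroup file per rack, loaded with "qconf -Ahgrp".
# Host names and the rack layout are hypothetical.
HG_DIR=$(mktemp -d)

write_hostgroup() {
    # $1 = group name (without the leading @), remaining args = member hosts
    name=$1
    shift
    printf 'group_name @%s\nhostlist %s\n' "$name" "$*" > "$HG_DIR/$name.hgrp"
}

write_hostgroup RACK1 node01 node02
write_hostgroup RACK2 node03 node04

for f in "$HG_DIR"/*.hgrp; do
    cat "$f"
    # Only register the hostgroup when qconf is actually installed.
    if command -v qconf >/dev/null 2>&1; then
        qconf -Ahgrp "$f"
    fi
done
```

The groups also need to appear in the hostlist of the queue you submit to (all.q in the example above); "qconf -shgrpl" lists the hostgroups that are currently defined.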
> Any feedback is greatly appreciated, and if there is a document that
> describes this please let me know.
> Frank Olaf Sem-Jacobsen
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net