[GE users] request number of slots at a node

mlelstv mlelstv at serpens.de
Sun Aug 15 08:50:23 BST 2010

On Sat, Aug 14, 2010 at 09:27:18PM -0700, mrostaee wrote:
> my problem is:
> sometimes before I submit a parallel job, some jobs are already distributed across the cluster, e.g. 1 slot is free on each host. When I submit a parallel job I don't want to get scattered slots from 5 hosts. I want my job to wait until 2 hosts are completely free. In other words, I want to request e.g. 12 slots from 3 hosts (each host has 4 slots).

Gridengine will start a job as soon as possible, even if that means
that you get one slot each on a large number of hosts.

You can use a specific parallel environment that forces the job to
use 4 slots per host ("allocation_rule 4"). But there is no
adaptive behaviour: your 12-slot job will not start until 3 hosts
are completely free. So on average your jobs will start later
(assuming the cluster is full).
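
As a rough sketch of such a parallel environment: the PE name "mpi4",
the queue "all.q" and the script "job.sh" below are made-up names, and
the field list follows the 6.2 sge_pe format, so compare it with the
output of "qconf -sp" for an existing PE on your installation. The
commands are only printed, not executed.

```shell
# Write a PE definition with a fixed allocation rule of 4 slots per host.
# "mpi4" is a hypothetical name; set "slots" to suit your cluster.
pe_file="${TMPDIR:-/tmp}/mpi4.pe"
cat > "$pe_file" <<'EOF'
pe_name            mpi4
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    4
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE
EOF

# Commands an administrator and a user would run next (dry run only).
echo "qconf -Ap $pe_file"                     # register the PE from the file
echo "qconf -aattr queue pe_list mpi4 all.q"  # attach it to a queue (all.q assumed)
echo "qsub -pe mpi4 12 job.sh"                # 12 slots = 3 hosts x 4 slots
```

With a fixed allocation rule the requested slot count has to be a
multiple of 4, otherwise the job cannot be scheduled in this PE.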

You have to find a compromise between execution speed of a single
job and throughput of the whole cluster.

> how to route a job to a specific host?

You can send the job to a specific queue instance ("-q QUEUE@HOST")
or you can ask for the resource hostname ("-l hostname=HOST").
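
For illustration, the two forms would look roughly like this. The
queue "all.q", the host "node07" and the script "job.sh" are made-up
names; the commands are only printed here.

```shell
# Hypothetical names: queue all.q, host node07, script job.sh.
queue=all.q
host=node07

# Variant 1: request a specific queue instance (queue@host).
by_queue_instance="qsub -q ${queue}@${host} job.sh"

# Variant 2: request the host through the hostname resource.
by_hostname="qsub -l hostname=${host} job.sh"

echo "$by_queue_instance"
echo "$by_hostname"
```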

The main reason for using a batch system however is to automatically
assign hosts to jobs so that your hardware resources are utilized
better. By forcing jobs to a specific host you give up this benefit.

If there is a reason to use a specific host (say it has more memory
or a larger disk), then it is better to define a resource (or
use an existing one) and let Gridengine select the host. This way
you could add and use other hosts without changing your job.
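
A possible sketch of that approach, assuming a boolean resource with
the made-up name "bigdisk" and made-up hosts "node07" and "node12".
The complex line uses the column order shown by "qconf -sc"; check it
against your installation. Nothing is executed, the commands are only
printed.

```shell
# Hypothetical names: resource "bigdisk", hosts node07/node12, script job.sh.
resource=bigdisk

# Line an administrator would add to the complex list via "qconf -mc"
# (columns: name shortcut type relop requestable consumable default urgency).
complex_line="$resource bd BOOL == YES NO FALSE 0"
echo "$complex_line"

# Tag every suitable host with the resource (dry run).
for host in node07 node12; do
    echo "qconf -aattr exechost complex_values ${resource}=TRUE ${host}"
done

# The job asks for the resource, not for a host name.
submit_cmd="qsub -l ${resource}=TRUE job.sh"
echo "$submit_cmd"
```

A host added later only needs the same complex_values entry; the
qsub line stays unchanged.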

                                Michael van Elst
Internet: mlelstv at serpens.de
                                "A potential Snark may lurk in every tree."

