[GE users] PE and sort by load question

dev dev_hyd2001 at yahoo.com
Thu Jun 18 10:22:31 BST 2009


I'm having trouble getting SGE to schedule parallel jobs on the least loaded hosts, even though the scheduler has been configured for sort_by_load. My parallel environment, however, is set to $fill_up. Here is an example.

I have a queue with a parallel environment (PE) attached to it, and this PE has its allocation rule set to $fill_up. All the nodes in the queue have 8 slots each.

Explanation of scenario

step 0

The cluster is completely free: there are no running jobs, and the load on every machine is 0.00.

step 1

I start a parallel job requesting 12 slots. It lands on two nodes, occupying 8 slots on one node and 4 on the other.

step 2

I start another parallel job using the same PE, and I expect it to take other free nodes whose load is 0.00, since I've set the scheduler to sort by load. However, there is no guarantee that SGE chooses a free node. Seemingly at random, it sometimes chooses the node where my first job occupies 4 slots, even though that node's load is higher than that of the free nodes.

Is this expected behavior?

Does setting the allocation rule to $fill_up have anything to do with it?

Is the scheduler just ignoring my sort-by-load setting because I am running a parallel job?

Ideally, I need parallel jobs to always try to go first onto the machines with the least load.

Any ideas?
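
To make the expectation concrete, here is a minimal Python sketch of the behavior I am hoping for: hosts are visited in ascending order of load, and each host is filled up before moving to the next. This is only an illustration of the scenario above, not SGE's actual scheduling algorithm. The host names, the load values after step 1, and the fill_up() helper are all hypothetical.

```python
def fill_up(hosts, requested):
    """Grant `requested` slots, filling the least loaded hosts first.

    hosts maps a host name to {"load": float, "free": int}.
    Returns a dict of host name -> slots granted and updates the
    free-slot counts only if the whole request can be satisfied.
    """
    allocation = {}
    remaining = requested
    # Visit hosts by ascending load; break ties by name for determinism.
    for name in sorted(hosts, key=lambda h: (hosts[h]["load"], h)):
        if remaining == 0:
            break
        take = min(hosts[name]["free"], remaining)
        if take > 0:
            allocation[name] = take
            remaining -= take
    if remaining > 0:
        raise RuntimeError("not enough free slots for this job")
    # Commit the allocation only after the request is known to fit.
    for name, take in allocation.items():
        hosts[name]["free"] -= take
    return allocation


# step 0: four idle nodes (hypothetical names), 8 slots each, load 0.00
hosts = {f"node{i}": {"load": 0.0, "free": 8} for i in range(1, 5)}

# step 1: the first 12-slot job takes 8 slots on one node and 4 on another
job1 = fill_up(hosts, 12)

# Assumed load values once the first job is running (illustrative only).
hosts["node1"]["load"] = 8.0
hosts["node2"]["load"] = 4.0

# step 2: the second job should prefer the nodes that are still idle
job2 = fill_up(hosts, 12)
print(job1, job2)
```

In this sketch the second job lands entirely on the two idle nodes, and the partly used node would only be considered after every node with load 0.00 is full.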

