[GE users] PE and sort by load question
dev_hyd2001 at yahoo.com
Thu Jun 18 10:22:31 BST 2009
I'm having trouble getting SGE to schedule parallel jobs on the least loaded hosts, even though the scheduler has been configured to do sort_by_load. My parallel environment (PE), however, uses the $fill_up allocation rule. As an example:
I have a queue with a parallel environment attached to it, and this PE has its allocation rule set to $fill_up. All the nodes in the queue have 8 slots each.
Explanation of scenario:
The cluster is completely free: no jobs are running, and the load on all machines is 0.00.
I start a parallel job requesting 12 slots. It lands on two nodes, occupying 8 slots on one node and 4 on the other.
I then start another parallel job using the same PE, and I expect it to take other free nodes whose load is 0.00, since I've set the scheduler to sort by load. However, there is no guarantee that SGE chooses a free node. Seemingly at random, it sometimes chooses the node where my first job occupies 4 slots, even though that node's load is higher than that of the free nodes.
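To make the expectation concrete, here is a toy Python model of the placement described in this scenario. It is only a sketch of the expected outcome, not SGE's actual scheduling algorithm. The `Host` class, the `fill_up_by_load` function, the four-node cluster, the second job's size of 12 slots, and the rule that load grows with occupied slots are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    slots: int = 8     # slots per node, as in the scenario
    used: int = 0      # slots currently occupied
    load: float = 0.0  # hypothetical stand-in for the load value SGE sorts on

    @property
    def free(self) -> int:
        return self.slots - self.used


def fill_up_by_load(hosts, wanted):
    """Toy model: visit hosts from least to most loaded and take every free
    slot on each host before moving on. Returns {host name: slots granted}."""
    if wanted > sum(h.free for h in hosts):
        raise ValueError("not enough free slots")
    granted = {}
    # sorted() is stable, so hosts with equal load keep their list order
    for host in sorted(hosts, key=lambda h: h.load):
        if wanted == 0:
            break
        take = min(host.free, wanted)
        if take:
            host.used += take
            granted[host.name] = take
            wanted -= take
    return granted


# Assumption: a small cluster of four idle 8-slot nodes
hosts = [Host(f"node{i}") for i in range(1, 5)]

first = fill_up_by_load(hosts, 12)

# Assumption: once the first job runs, load rises with the occupied slots
for h in hosts:
    h.load = h.used / h.slots

second = fill_up_by_load(hosts, 12)

print(first)
print(second)
```

In this model the second job is placed on node3 and node4 only, because the two idle nodes sort ahead of the partially used node2. That is the outcome expected above; what is actually observed is that the node with 4 occupied slots is sometimes picked instead.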
Is this expected behavior?
Does setting the allocation rule to $fill_up have anything to do with it?
Is the scheduler simply ignoring my sort-by-load setting because I am running a parallel job?
Ideally, I need parallel jobs to always try the least loaded machines first.
Any ideas?