[GE users] Queue based on loadavg

reuti reuti at staff.uni-marburg.de
Tue Mar 30 11:12:11 BST 2010


Hi,

On 30.03.2010 at 02:48, dasf wrote:

> Is it possible to create a queue in such a way that serial jobs are assigned based on the load average of each node? I mean, each new job should start on the node with the smallest load average.
> 
> Let me explain why. I am working on a cluster with 80 nodes, each one having 4 processors. 
> Users suggested to me that if, say, 100 serial jobs are submitted, they should first put one job on every node (80 jobs) and only then start a second job on each node, instead of loading 4 jobs on 25 nodes and leaving 55 nodes free.
> 
> I thought about creating queues with just one slot and sorting them in a way that they would load all nodes the way I want. But this sounds like a brute force approach. I wonder if there is a more elegant way of asking SGE to load jobs according to the load average of the compute nodes.

This behavior will only show up when all jobs are scheduled at the same time. When jobs arrive spread out over time, the load on a busy machine rises, and since SGE automatically picks the machine with the least load, you get the intended behavior. If you submit a bunch of jobs within a short time, though, the reported load lags behind (SGE uses the 5 minute load average by default), so you must fake some load on a machine that has just received a job in order for the next machine to be taken. This can be achieved with three entries in the scheduler configuration (`man sched_conf` for details):

$ qconf -msconf
...
job_load_adjustments              np_load_avg=1
load_adjustment_decay_time        0:7:30
load_formula                      np_load_avg
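
To illustrate why the artificial load matters for a burst of submissions, here is a rough toy model in Python. It is not SGE code and makes simplifying assumptions: all jobs are dispatched at the same instant (so the decay over load_adjustment_decay_time plays no role), the reported load of every node is still 0, and ties between equally loaded nodes go to the first node in the list. The names `place_jobs` and `summarize` are made up for this sketch.

```python
def place_jobs(num_nodes, slots_per_node, num_jobs, adjustment):
    """Greedy placement: each job goes to the free node with the lowest load.

    `adjustment` mimics job_load_adjustments: it is added to a node's
    load value every time a job is dispatched to that node.
    """
    jobs = [0] * num_nodes      # jobs placed on each node
    load = [0.0] * num_nodes    # reported load (0 here) plus adjustments
    for _ in range(num_jobs):
        # Only nodes with a free slot are candidates.
        candidates = [n for n in range(num_nodes) if jobs[n] < slots_per_node]
        if not candidates:
            break               # cluster is full
        # min() returns the first node among equals (assumed tie-break).
        target = min(candidates, key=lambda n: load[n])
        jobs[target] += 1
        load[target] += adjustment
    return jobs


def summarize(jobs):
    """Map 'jobs per node' -> 'number of nodes with that many jobs'."""
    return {k: jobs.count(k) for k in sorted(set(jobs))}


# The cluster from the question: 80 nodes, 4 slots each, 100 serial jobs.
without = place_jobs(80, 4, 100, adjustment=0.0)
with_adj = place_jobs(80, 4, 100, adjustment=1.0)

print(summarize(without))   # {0: 55, 4: 25}
print(summarize(with_adj))  # {1: 60, 2: 20}
```

Without any adjustment the model fills 25 nodes completely and leaves 55 idle, which is the situation described in the question. With an adjustment per dispatched job, every node gets one job before any node gets a second one.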


There is another setup that achieves the same result, called "use least used host first":

http://blogs.sun.com/sgrell/entry/grid_engine_scheduler_hacks_least

Which method is preferred is often a matter of personal taste, but sometimes the deciding question is which setup can be combined with the other settings you need.

-- Reuti


> Thank you in advance!
> 
> Demetrio
> 
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=251777
> 
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=251820
