[GE users] load_thresholds questions

chambon chambon at cc.in2p3.fr
Wed Nov 10 11:43:23 GMT 2010


Hello,

I am trying to understand load_thresholds in detail.

I want to limit the dispatching of jobs to worker nodes according to CPU load (rather than by using np_load_avg).

In my understanding, cpu is a predefined complex that represents the percentage of CPU used (expressed between 0 and 100), so I defined a load_formula as follows:
 load_formula                      cpu*0.01
together with:
 job_load_adjustments              cpu=20   (20 to express 20%)
 load_adjustment_decay_time        00:00:40

I also defined a queue (8 slots per worker node) with load_thresholds set to cpu=90 (90 to express 90%):
>  qconf -sq demo.q | grep load_thresholds
load_thresholds       cpu=90

I first submitted 5 jobs:
 qsub -l h=ccwalk38 -t 1-5:1 -q demo.q UseCPU.script

 cpu should then be around 5/8 = 62.5%,
 which agrees with what qconf reports:
  qconf -se ccwalk38 | grep cpu
                      cpu=62.600000
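
As a quick sanity check of that expectation, here is a minimal Python sketch. It assumes the host has 8 cores (matching the 8 slots of demo.q) and that each UseCPU.script task saturates exactly one core; neither assumption is confirmed by the output above, and the function name is only illustrative.

```python
# Assumptions: 8 cores on the host (one per queue slot) and
# each running job keeping exactly one core fully busy.
TOTAL_SLOTS = 8
RUNNING_JOBS = 5


def expected_cpu_percent(running_jobs: int, total_slots: int) -> float:
    """Return the CPU usage in percent if every job saturates one core."""
    return 100.0 * running_jobs / total_slots


expected = expected_cpu_percent(RUNNING_JOBS, TOTAL_SLOTS)
reported = 62.6  # value shown by: qconf -se ccwalk38 | grep cpu

print(f"expected {expected:.1f}%, reported {reported:.1f}%")
# prints: expected 62.5%, reported 62.6%
```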

Then I submitted 5 new jobs (the previous ones were still running):
 qsub -l h=ccwalk38 -t 1-5:1 -q demo.q UseCPU.script

Only 2 of the new jobs started running, which is what I expected.
Looking at the pending jobs, I see:
qstat -s p -j ...
scheduling info:            queue instance "demo.q@ccwalk38.in2p3.fr" dropped because it is overloaded: cpu=127.600000 (= 62.600000 + 20 * 3.250000) >= 90
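
The arithmetic in that message can be replayed with a short Python sketch. It only uses the numbers printed above; the variable names are illustrative and nothing here reflects how the scheduler is actually implemented. It also shows that a multiplier of 1.0 would have kept the value below the threshold.

```python
import math

reported_cpu = 62.6  # cpu value reported for the host
adjustment = 20.0    # job_load_adjustments cpu=20
factor = 3.25        # multiplier printed in the scheduling info
threshold = 90.0     # load_thresholds cpu=90

# Value the scheduler compares against the threshold.
adjusted_cpu = reported_cpu + adjustment * factor
overloaded = adjusted_cpu >= threshold

# Same comparison if the multiplier were exactly 1.0.
single_adjustment_cpu = reported_cpu + adjustment * 1.0
overloaded_with_single = single_adjustment_cpu >= threshold

# Recover the multiplier from the printed total of 127.6.
recovered_factor = (127.6 - reported_cpu) / adjustment

print(f"adjusted cpu = {adjusted_cpu:.1f}, overloaded = {overloaded}")
print(f"with factor 1.0: {single_adjustment_cpu:.1f}, "
      f"overloaded = {overloaded_with_single}")
print(f"recovered factor = {recovered_factor:.2f}")
```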


The question is: where does the 3.25 come from?

Why is it not simply 1.0 at the start, 0.5 at the halfway point, and so on?
(The load adjustment is supposed to decay linearly over time.)
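
The linear decay I have in mind can be written as a small Python sketch, using the 40-second load_adjustment_decay_time from the configuration above. The function name and the job ages are hypothetical. The last part adds up one factor per recently started job; whether Grid Engine accumulates the adjustment that way is only an assumption here, not something the scheduler output confirms, but it would be one way for the multiplier to exceed 1.0.

```python
def decay_factor(age_s: float, decay_time_s: float = 40.0) -> float:
    """Linear decay: 1.0 at job start, 0.0 once decay_time_s has elapsed."""
    if age_s < 0:
        raise ValueError("age_s must be non-negative")
    return max(0.0, 1.0 - age_s / decay_time_s)


# Expected multiplier for a single job at start, half time and full time.
for age in (0, 20, 40):
    print(f"age {age:>2}s -> factor {decay_factor(age)}")

# Hypothetical ages (in seconds) of several recently started jobs.
# ASSUMPTION: one decayed adjustment is added per such job.
hypothetical_ages = [0, 0, 10, 30]
total_factor = sum(decay_factor(a) for a in hypothetical_ages)
print(f"summed factor for {len(hypothetical_ages)} jobs: {total_factor}")
```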

Best regards
Bernard CHAMBON

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=294479
