[GE users] Trouble with load thresholds

reuti reuti at staff.uni-marburg.de
Mon Mar 8 22:42:48 GMT 2010

On 08.03.2010 at 19:44, opoplawski wrote:

> Using gridengine 6.2u5.  I've got a couple of machines in our grid that
> have a lot of interactive use, so I limit grid access with a load
> threshold of np_load_avg = 1, a suspend threshold of 1.3 or 1.02, and a
> load adjustment for np_load_avg of 1.
> However, my 8 core machines are getting woefully underused.
> Two different cases:
> hobbes, suspend threshold of 1.3.  top shows load average has been
> around 3.7-4.3.  I generally only see one or two jobs at a time ever
> get run on it.  qstat -j shows:
>   queue instance "all.q@hobbes.cora.nwra.com" dropped because it is
>   overloaded: np_load_avg=1.003750 (= 0.541250 + 1.0 * 3.700000 with
>   nproc=8) >= 1
> I would have expected about 3-4 jobs on it.  I can't make any sense of
> what the above line is supposed to be telling me.
> josiah, suspend threshold of 1.02.  steady load average about 3.3.
> got 3 jobs on it, but qstat alternates with:
>   queue instance "compute.q@josiah.cora.nwra.com" dropped because it is
>   overloaded: np_load_avg=1.016250 (= 0.425000 + 1.0 * 4.730000 with
>   nproc=8) >= 1

4.73 is not the actual load, but the sum of the decaying load
adjustments (see below) of all jobs on this exechost for which an
adjustment still applies.

> and
>   queue instance "compute.q@josiah.cora.nwra.com" is in suspend alarm:
>   np_load_avg=1.026250 (= 0.425000 + 1.0 * 4.810000 with nproc=8) >=
>   1.02

Same here for 4.81, it should decrease over time.
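
To make the three qstat -j messages above easier to read, here is a small
sketch that recombines their terms. It assumes the formula is
"reported np_load_avg + factor * (sum of adjustments) / nproc"; the
division by nproc is inferred from the arithmetic of the quoted numbers,
not stated anywhere in this thread, and the function name is made up for
illustration.

```python
def adjusted_np_load_avg(reported, adjustment_sum, nproc, factor=1.0):
    """Recombine the terms of the qstat -j message.

    Assumed formula (inferred from the quoted numbers):
        reported + factor * adjustment_sum / nproc
    """
    return reported + factor * adjustment_sum / nproc


# (host, reported np_load_avg, sum of adjustments, nproc, value qstat showed)
# All numbers are copied from the messages quoted in this thread.
cases = [
    ("hobbes", 0.541250, 3.70, 8, 1.003750),
    ("josiah", 0.425000, 4.73, 8, 1.016250),
    ("josiah", 0.425000, 4.81, 8, 1.026250),
]

for host, reported, adj_sum, nproc, shown in cases:
    value = adjusted_np_load_avg(reported, adj_sum, nproc)
    print(f"{host}: computed {value:.6f}, qstat showed {shown:.6f}")
```

Read this way, the first term is the real per-core load (0.541250 * 8 is
about 4.3 for hobbes, in line with what top shows), and only the second
term comes from the load adjustment.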

> Some thoughts -
> - These are very short jobs, just a few seconds of cpu time, must be
> playing havoc with load adjustments?  Does load adjustment get removed
> when a job ends?

Please have a look at `man sched_conf` for the exact explanation of
the behavior. As SGE usually uses the 5-minute load average (the
middle value of the 1/5/15-minute averages shown by `uptime`), it puts
some artificial load on the machine for each newly started job. This
avoids oversubscription during the delay before a new job shows up in
the 5-minute average behind np_load_avg, and the artificial load decays
over the specified time. When you have short-running jobs, this should
just reflect the actual load.
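
As an illustration of the decay, here is a minimal sketch. It assumes
the adjustment decays linearly to zero over the decay time, which is how
`man sched_conf` describes it as far as I can tell; the decay time of
450 seconds, the job ages, and all names are assumptions chosen for the
example. The per-job adjustment of 1 is the value from the original
post.

```python
def remaining_adjustment(initial, age_s, decay_time_s):
    """Load adjustment still charged for a job started age_s seconds ago.

    Assumes a linear decay from `initial` down to 0 over decay_time_s.
    """
    if decay_time_s <= 0 or age_s >= decay_time_s:
        return 0.0
    return initial * (1.0 - max(age_s, 0.0) / decay_time_s)


DECAY_S = 450.0        # assumed decay time (0:7:30), check your sched_conf
JOB_ADJUSTMENT = 1.0   # np_load_avg adjustment from the original post

# Hypothetical ages (in seconds) of recently dispatched jobs on one host.
job_ages_s = [5, 20, 45, 90, 180, 400]

total = sum(remaining_adjustment(JOB_ADJUSTMENT, a, DECAY_S) for a in job_ages_s)
print(f"sum of decaying adjustments: {total:.2f}")
```

The sum printed here plays the role of the 3.7, 4.73 and 4.81 in the
qstat -j messages: it grows with every dispatch and shrinks as the
individual adjustments age.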

> - Why are load adjustments used to suspend jobs?  I think that should
> only use the actual load of the machine.

They aren't. If they were, you could check this with `qstat -f` and
`qstat -explain A`. The adjusted load is only used to allow or
disallow the scheduling of jobs. As the adjusted load value should
reflect the real load after the decay time, the scheduler is looking
ahead: if the load really reached the estimated value, a job would be
suspended in the near future. Hence the job isn't scheduled.
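
The look-ahead can be pictured as comparing the adjusted load against
both thresholds before dispatching. This is only a sketch: the function
name, the returned labels, and the order of the checks are assumptions
for illustration, not SGE's actual code. The thresholds and load values
are the ones reported in this thread.

```python
def scheduling_verdict(adjusted_load, load_threshold, suspend_threshold):
    """Classify a queue instance for dispatch using the adjusted load.

    Hypothetical helper; the labels mirror the wording of qstat -j.
    """
    if adjusted_load >= suspend_threshold:
        return "suspend alarm"   # dispatching would soon lead to a suspend
    if adjusted_load >= load_threshold:
        return "overloaded"      # queue instance dropped for this run
    return "eligible"


# Adjusted np_load_avg values and thresholds as quoted in the thread.
print("hobbes:", scheduling_verdict(1.003750, 1.0, 1.3))
print("josiah:", scheduling_verdict(1.016250, 1.0, 1.02))
print("josiah:", scheduling_verdict(1.026250, 1.0, 1.02))
```

In all three cases the queue instance is skipped for scheduling, even
though the real load alone (0.54 and 0.425 per core) is below both
thresholds.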

So much for the theory, but I also notice some jobs being pushed into
T state without the queue itself being in suspend alarm state A, which
is confusing. Even a single job can push itself into suspend state,
then resume, and so on. Can you please file a bug?

-- Reuti

> -- 
> Orion Poplawski
> Technical Manager                     303-415-9701 x222
> NWRA/CoRA Division                    FAX: 303-415-9702
> 3380 Mitchell Lane                  orion at cora.nwra.com
> Boulder, CO 80301              http://www.cora.nwra.com
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=247546
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

