[GE users] nodes overloaded: processes placed on already full nodes

steve_s elcortogm at googlemail.com
Wed Dec 15 16:23:06 GMT 2010


On Dec 15 16:28 +0100, reuti wrote:
> Am 15.12.2010 um 16:13 schrieb templedf:
> 
> > This is a known issue.  When scheduling parallel jobs on 6.2 through 
> > 6.2u5, the scheduler ignores host load.
> 
> Yep.
> 
> >  This often results in jobs piling up 
> > on a few nodes while other nodes are idle.

OK, good to know. We're running 6.2u3 here.

I'm not sure I understand this correctly: even if the load is ignored,
doesn't SGE still keep track of the slots it has already given away on
each node? I always thought that this is how jobs get placed in the
first place (policies and all that aside, but those should have nothing
to do with load or slots in this context).
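
For what it's worth, qstat -f shows exactly this per-queue-instance
accounting in its resv/used/tot. column (hostnames and numbers below
are of course made up):

    $ qstat -f
    queuename              qtype resv/used/tot. load_avg arch       states
    ----------------------------------------------------------------------
    all.q@node01           BIP   0/8/8          15.93    lx24-amd64
    all.q@node02           BIP   0/0/8          0.02     lx24-amd64

If the same hosts are attached to more than one queue, each queue
instance gets its own slot count, which might explain how more than 8
processes end up on one machine.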

Given that SGE knows, e.g., np_load_avg on each node, I thought we
could work around the problem by setting np_load_avg to requestable=YES
and then submitting with something like

    $ qsub -hard -l 'np_load_avg < 0.3' ...

but this gives me 
    
    "Unable to run job: denied: missing value for request "np_load_avg".
     Exiting."

whereas using "=" or ">" works. I guess the reason is what is stated in
complex(5):
    
    ">=, >, <=, < operators can only be overridden, when the new value
     is more restrictive than the old one."
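
For reference, this is roughly what the entry looks like in our complex
configuration after setting requestable=YES (qconf -sc); the relop
column holds the ">=" the man page is talking about:

    $ qconf -sc | egrep '^#name|np_load_avg'
    #name         shortcut     type    relop requestable consumable default  urgency
    np_load_avg   np_load_avg  DOUBLE  >=    YES         NO         0        0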

So I cannot use "<". If that is indeed the case, what can we do about
it? Do we need to define a new complex attribute (say
'np_load_avg_less') along with a load_sensor, or can we hijack
np_load_avg in some other way?
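
In case we do need our own attribute: if I read the load sensor HOWTO
correctly, a minimal sensor is just a loop speaking the begin/end
protocol on stdout. Untested sketch (the attribute name
np_load_avg_less is made up and would need a matching entry added via
qconf -mc):

    #!/bin/sh
    # minimal load sensor sketch: report the 1-minute load average,
    # normalized by core count, under a new complex attribute
    HOST=`hostname`
    while read line; do
        # sge_execd sends "quit" when the sensor should terminate
        [ "$line" = "quit" ] && exit 0
        CORES=`grep -c '^processor' /proc/cpuinfo`
        NP=`awk -v c="$CORES" '{ printf "%.2f", $1 / c }' /proc/loadavg`
        # each report is wrapped in begin/end, one host:attr:value line
        echo "begin"
        echo "$HOST:np_load_avg_less:$NP"
        echo "end"
    done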

> As far as I understood the problem, the nodes are oversubscribed by getting more than 8 processes scheduled.

Exactly.
 
> Did you change the host assignment to certain queues while jobs were still running? Maybe you need to limit the total number of slots per machine to 8 in an RQS or by setting it in each host's complex_values.

No, we didn't change the host assignment. 

Sorry, but what do you mean by RQS? I had not come across that in the
documentation so far.
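
After some digging: if RQS means the resource quota sets from
sge_resource_quota(5) (new in 6.1, if I remember correctly), then I
guess a rule like the following, added via qconf -arqs, would cap the
total slots per host across all queues (name and limit are just my
guess for our 8-core nodes):

    {
       name         max_slots_per_host
       description  never hand out more than 8 slots on any one host
       enabled      TRUE
       limit        hosts {*} to slots=8
    }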

> Another reason for virtual oversubscription: processes in state "D" count as running, so despite the high load everything is actually in order.

Oversubscribed nodes do not always run 16 instead of 8 processes; some
run only 14 or so. Nevertheless, the load is always almost exactly 16.
As far as I can see, the processes on these oversubscribed nodes (those
with > 8 processes) each run at ~50% CPU.

best,
Steve

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=305856
