[GE users] nodes overloaded: processes placed on already full nodes
harry.mangalam at uci.edu
Fri Dec 17 19:11:52 GMT 2010
I may be either missing info or context, but we had this problem with
6.2 with overlapping Qs and it was resolved by explicitly specifying
the threshold for the Qs by setting np_load_avg to be just over 1.
$ qconf -sq long |grep load_thresholds
We often get overlapping Q execution hosts registering their
displeasure by entering an overload state, but only by a few
percentage points (1 compute process per core plus a few % due to
Almost all our Qs are overlapping due to competing requirements /
hardware and this seems to address that part of it fine (though I'd
much prefer to keep them separate for simplicity's sake).
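
For anyone unsure what that threshold is measuring, here is a rough sketch in Python of the arithmetic, not anything SGE itself runs. np_load_avg is the host load average divided by the number of processors. The function names, the 1.1 value (standing in for "just over 1") and the ">=" comparison are my assumptions for illustration; check queue_conf(5) and complex(5) for the real semantics on your version.

```python
# Hypothetical helpers illustrating the np_load_avg arithmetic.
# None of these names come from SGE; they exist only for this sketch.

def np_load_avg(load_avg: float, num_proc: int) -> float:
    """Load average normalized by the number of processors."""
    if num_proc <= 0:
        raise ValueError("num_proc must be positive")
    return load_avg / num_proc


def exceeds_threshold(load_avg: float, num_proc: int, threshold: float) -> bool:
    """True when the normalized load reaches the threshold (assumed '>=')."""
    return np_load_avg(load_avg, num_proc) >= threshold


def cpu_share_per_process(num_proc: int, num_busy_processes: int) -> float:
    """Approximate CPU fraction each CPU-bound process gets on a host."""
    if num_busy_processes <= 0:
        raise ValueError("num_busy_processes must be positive")
    return min(1.0, num_proc / num_busy_processes)


# Assumed value standing in for "just over 1"; the real setting is site-specific.
THRESHOLD = 1.1

if __name__ == "__main__":
    # One process per core plus a few percent of overhead on an 8-core host.
    print(np_load_avg(8.2, 8), exceeds_threshold(8.2, 8, THRESHOLD))
    # The oversubscribed case from the quoted message: load 16 on 8 cores.
    print(np_load_avg(16.0, 8), exceeds_threshold(16.0, 8, THRESHOLD))
    print(cpu_share_per_process(8, 16))
```

With a threshold just over 1, a host running one process per core plus a little overhead stays under it, while the 8-core nodes reporting a load of about 16 in the quoted message sit at twice the per-core load, which matches the ~50% CPU per process seen there.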
On Wednesday 15 December 2010 08:23:06 steve_s wrote:
> On Dec 15 16:28 +0100, reuti wrote:
> > Am 15.12.2010 um 16:13 schrieb templedf:
> > > This is a known issue. When scheduling parallel jobs with 6.2
> > > to 6.2u5, the scheduler ignores host load.
> > Yep.
> > > This often results in jobs piling up on a few nodes while
> > > other nodes are idle.
> OK, good to know. We're running 6.2u3 here.
> I'm not sure if I get this right: Even if the load is ignored,
> doesn't SGE keep track of already given-away slots on each node? I
> always thought that this is the way jobs are scheduled in the
> first place (besides policies and all that, but that should have
> nothing to do with load or slots in this context).
> Given that SGE knows e.g. np_load_avg on each node, I thought we
> could circumvent the problem by setting np_load_avg to
> requestable=YES and then something like
> $ qsub -hard -l 'np_load_avg < 0.3' ...
> but this gives me
> "Unable to run job: denied: missing value for request
> "np_load_avg". Exiting."
> whereas using "=" or ">" works. I guess the reason is what is
> stated in complex(5):
> ">=, >, <=, < operators can only be overridden, when the new
> value is more restrictive than the old one."
> So, I cannot use "<". If that is the case, what can we do about it?
> Do we need to define a new complex attribute (say
> 'np_load_avg_less') along with a load_sensor or can we hijack
> np_load_avg in another way?
> > As far as I understood the problem, the nodes are oversubscribed
> > by getting more than 8 processes scheduled.
> > Did you change the host assignment to certain queues, while jobs
> > were still running? Maybe you need to limit the total number of
> > slots per machine to 8 in an RQS or set it in each host's
> > complex_values.
> No, we didn't change the host assignment.
> Sorry, but what do you mean by RQS? Did not see that in the
> documentation so far.
> > Another reason for virtual oversubscription: processes in state
> > "D" count as running, so despite the high load, all may be in
> > good order.
> Oversubscribed nodes do not always run 16 instead of 8 processes,
> some only 14 or so. Nevertheless, the load is always almost
> exactly 16. As far as I can see, processes on these oversubscribed
> nodes (with > 8 processes) run with ~50% CPU load each.
> To unsubscribe from this discussion, e-mail:
> [users-unsubscribe at gridengine.sunsource.net].
Harry Mangalam - Research Computing, NACS, Rm 225 MSTB, UC Irvine
[ZOT 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
MSTB=Bldg 415 (G-5 on <http://today.uci.edu/pdf/UCI_09_map_campus.pdf>)
Lat/Long: 33.642025,-117.844414 (paste into google maps)
Like the autumn leaves / Our rights flutter to the ground /
So too, our trousers. <http://goo.gl/boJcT>