[GE users] Does SGE know the real memory consumption of a host at scheduling time?

Goncalo Borges goncalo at lip.pt
Wed Jan 18 10:10:59 GMT 2006


Hi everybody,
Thanks for all the suggestions you gave me.
I will try to implement them. I guess I will come back to the mailing
list with any difficulties in the exact implementation.

I know that the behaviour I want exists in other systems, such as Condor,
and if it also works in SGE, it will be a nice way to convince my colleagues
to use SGE in our cluster.

Cheers
	Goncalo




On Tue, 17 Jan 2006, Reuti wrote:

> Hi Goncalo,
> 
> what you are asking for is some kind of magic. If the user doesn't know the
> estimated memory consumption for his job during its lifetime in the cluster,
> how could SGE guess it? SGE can of course track the current memory
> consumption, but if this changes over time the second job B might start, and
> later on A and B each want to use the whole memory of the machine, which will
> result in heavy swapping. You can try of course:
> 
> Am 17.01.2006 um 21:07 schrieb Goncalo Borges:
> 
> > 
> > Hi there,
> > 
> > I have the following question:
> > 
> > Let's imagine that I have an execution host associated to 2 different
> > queues. Suppose that there is one job (job A) already running in queue A
> > of this exec host and that it is using all the available memory.
> > 
> > Imagine now that another user (who knows nothing about Job A)
> > wants to submit a job (job B) in the same exec host but in a
> > different queue (queue B).
> > 
> > 1) Does SGE automatically detect that the memory consumption is very
> > high, and thus refuse to execute job B (although queue B is free)?
> > 
> 
> use one of the memory values:
> 
> mem_free
> virtual_free
> 
> as load_thresholds, and if the amount falls below a minimum that you specify,
> the queue will be put in alarm state, i.e. disabled, and won't accept any new
> jobs until enough memory is free again. But here already, the specified amount
> is your assumption that this will be enough for the second job to run in. And
> if A uses more memory over time...
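
To make the load_thresholds idea above concrete, here is a minimal Python sketch of the comparison it implies. This is a toy model and not SGE code: the function names parse_mem and queue_in_alarm are made up for illustration, and the unit handling is simplified (K, M and G are all treated as binary multipliers regardless of case).

```python
# Toy model of a load threshold on mem_free (illustrative only, not SGE code).

def parse_mem(value: str) -> int:
    """Convert strings such as '512M' or '2G' to bytes.

    Simplification: K/M/G are treated as 1024-based, upper or lower case.
    """
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    suffix = value[-1].upper()
    if suffix in units:
        return int(float(value[:-1]) * units[suffix])
    return int(value)


def queue_in_alarm(reported_mem_free: str, threshold: str) -> bool:
    """Return True when free memory has dropped to the threshold or below."""
    return parse_mem(reported_mem_free) <= parse_mem(threshold)


# Job A has consumed most of the memory: queue B stops accepting jobs.
print(queue_in_alarm("300M", "500M"))
# Plenty of memory left: queue B keeps accepting jobs.
print(queue_in_alarm("3G", "500M"))
```

The threshold (500M here) is exactly the assumption Reuti mentions: it only helps if it really is enough for job B, and it says nothing about how much job A will use later.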
> 
> But instead of using load_thresholds, you could make mem_free or virtual_free
> consumable (maybe with a default value for standard jobs) and attach it to
> each host with a value of around (built-in memory - 100M) in the
> complex_values. This is what I did in my clusters. So the users have to specify
> an amount of memory for their job (or get the default granted). But SGE will now
> use the computed virtual_free value *or* the value from the built-in
> load-sensor - whichever is lower.
> 
> And if they use a little bit more memory than estimated, their jobs won't be
> killed, as would be the case with h_vmem. But at least this
> oversubscription will be taken into account for scheduling new jobs to this
> host.
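
The consumable approach described above can be sketched as simple bookkeeping. The following Python model is only an illustration of the "whichever is lower" rule, under the assumption that the scheduler compares each request against min(capacity minus granted requests, measured free memory). The class name Host and its fields are hypothetical and do not correspond to any SGE data structure.

```python
# Toy model of a consumable virtual_free combined with a measured load value.
from dataclasses import dataclass, field

MB = 1024 ** 2


@dataclass
class Host:
    capacity: int        # value set in complex_values, e.g. built-in memory - 100M
    measured_free: int   # free memory as reported by the load sensor
    granted: list = field(default_factory=list)  # requests of running jobs

    def available(self) -> int:
        """Computed remainder or measured value, whichever is lower."""
        computed = self.capacity - sum(self.granted)
        return min(computed, self.measured_free)

    def try_start(self, request: int) -> bool:
        """Admit a job only if its request fits into the available amount."""
        if request <= self.available():
            self.granted.append(request)
            return True
        return False


host = Host(capacity=3996 * MB, measured_free=3996 * MB)
host.try_start(1000 * MB)         # job A requests 1000M and is started
host.measured_free = 900 * MB     # ...but A actually uses far more than requested
print(host.available() // MB)     # measured value wins over the computed 2996M
print(host.try_start(1500 * MB))  # job B does not fit and stays pending
```

Job A is not killed for exceeding its estimate, but the measured value keeps job B off the host until memory is free again.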
> 
> HTH - Reuti
> 
> > 2) If this is not the case, do you know how I can enable this
> > behaviour?
> > 
> > I know that there is the possibility that the user submitting job B can
> > request a given amount of memory. If this memory is not available, then
> > the job will not execute. However, this procedure takes for granted
> > that the user knows the memory that his job will need, a
> > non-trivial assumption. Therefore, I would prefer some kind of automatic
> > procedure.
> > 
> > Maybe it can be implemented using a load_sensor ?!
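
For readers wondering what a custom load sensor looks like: it is a long-running program that waits for a line on stdin and answers with a block of host:name:value lines between "begin" and "end", and exits when it reads "quit". Below is a minimal Python sketch of that loop. The complex name mem_avail_custom, the host name node01 and the fixed sample value are placeholders for illustration; a real sensor would measure memory itself and be wired to sys.stdin and sys.stdout. Note that mem_free and virtual_free are already reported by the built-in sensor, so a custom one is only needed for a value SGE does not provide.

```python
# Sketch of the load sensor request/response loop (names are placeholders).
import io


def report(host: str, complex_name: str, value: str) -> str:
    """Format one load report block."""
    return f"begin\n{host}:{complex_name}:{value}\nend\n"


def serve(stdin, stdout, host: str, sample) -> None:
    """Answer every input line with a report until 'quit' is received."""
    for line in stdin:
        if line.strip() == "quit":
            break
        stdout.write(report(host, "mem_avail_custom", sample()))
        stdout.flush()  # the reader must see the block immediately


# Simulated session: two report requests, then shutdown.
fake_in = io.StringIO("\n\nquit\n")
fake_out = io.StringIO()
serve(fake_in, fake_out, "node01", lambda: "512M")
print(fake_out.getvalue(), end="")
```

A sensor like this only reports a number. Whether jobs are held back still depends on using that value in load_thresholds or as a consumable, as discussed in the reply.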
> > 
> > Any help?
> > 
> > Thanks in advance
> > Cheers
> >      Goncalo
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > For additional commands, e-mail: users-help at gridengine.sunsource.net
> 
> 
