[GE users] Reserving memory question.

mlelstv mlelstv at serpens.de
Sat Dec 5 10:00:40 GMT 2009


On Fri, Dec 04, 2009 at 11:40:57PM -0800, gutnik wrote:

> Great. That ... seems to do just what I wanted. Why is it necessary
> to specify the complex? SGE clearly already knows how much memory
> is available.

There is a load value "mem_free" which happens to represent the currently
free memory on a specific host.

When you ask for a resource "mem_free" then this is checked
against the load value.


There might also be a complex variable with the same name "mem_free"
configured for a host.

If you set this variable then the resource request for "mem_free"
is checked against the minimum of the load value and the complex
variable.

Since the complex variable is also defined as consumable, its value
is reduced by the requested value for as long as the specific job is
running.

Example:

    qconf -se HOST

shows you the original (configured) value of the complex
variable "mem_free" and the load value "mem_free". E.g.

hostname              node12345
load_scaling          NONE
complex_values        slots=5,mem_free=15.7G
load_values           arch=lx24-amd64,num_proc=4,mem_total=16052.765625M, \
                      swap_total=2055.179688M,virtual_total=18107.945312M, \
                      load_avg=0.000000,load_short=0.000000, \
                      load_medium=0.000000,load_long=0.000000, \
                      mem_free=15895.773438M,swap_free=2042.644531M, \

and
    qhost -F mem_free HOST

shows you the computed minimum of the load value and the
reduced complex variable:

node12345               lx24-amd64      4  0.00   15.7G  158.0M    2.0G   12.5M
    Host Resource(s):      hl:mem_free=15.521G

When I start a job with a request like
    echo "sleep 300" | qsub -l mem_free=8G -j y -o /dev/null
then the computed minimum changes:

node12345               lx24-amd64      4  0.00   15.7G  158.1M    2.0G   12.5M
    Host Resource(s):      hc:mem_free=7.700G

to the configured value minus the requested value (15.7G - 8G = 7.7G).
And when the job completes it changes back again:

node12345               lx24-amd64      4  0.00   15.7G  157.5M    2.0G   12.5M
    Host Resource(s):      hl:mem_free=15.523G

You see that the reported Host Resource is smaller than the configured
value because some memory is used by the system and the load value
is therefore smaller.
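
The arithmetic behind the three qhost outputs can be sketched in a few
lines of shell. This is only a model of the rule described here, not
anything SGE ships: the function name effective_mem_free is made up for
illustration, all values are taken to be in gigabytes, and the "hl:" /
"hc:" prefixes are printed on the assumption that they mark which side
(load value or consumable) produced the minimum, as in the output above.

```shell
# Hypothetical helper, not part of SGE: models how the reported
# host resource is derived. All values are in gigabytes.
#   $1 = load value "mem_free" reported by the host
#   $2 = configured complex value (complex_values mem_free=...)
#   $3 = sum of the mem_free requests of jobs running on the host
effective_mem_free() {
    awk -v load="$1" -v conf="$2" -v used="$3" 'BEGIN {
        remaining = conf - used    # consumable left after running jobs
        # The reported value is the smaller of the two.
        if (load < remaining)
            printf "hl:mem_free=%.3fG\n", load
        else
            printf "hc:mem_free=%.3fG\n", remaining
    }'
}

effective_mem_free 15.521 15.7 0   # idle host
effective_mem_free 15.521 15.7 8   # while the 8G job is running
```

With the numbers from the example, the idle case is limited by the load
value and the running case by the reduced complex value.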

I don't think there is a way to report the reduced value of the complex
alone. But then nobody cares about this value; only the reported minimum is
used as the 'Host Resource' and compared to the requested resource.

Some people might configure the complex variable "mem_free" to be
slightly less than "mem_total". As you can see from the example,
there is some fuzz caused by memory used by the system that could
be accounted for by such a configuration.
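
As a rough sketch of such a configuration, the snippet below derives a
"mem_free" value from "mem_total" minus a safety margin and prints the
qconf command that would set it, without running it. The helper name
suggest_mem_free and the 512M margin are assumptions chosen only for
illustration; pick a margin that fits your hosts. It also assumes that
"mem_free" has already been made consumable in the complex configuration
(qconf -mc).

```shell
# Hypothetical helper: prints (does not execute) a qconf command that
# sets the complex value mem_free on one execution host.
#   $1 = host name
#   $2 = mem_total in megabytes (from the load_values in "qconf -se HOST")
#   $3 = margin in megabytes to leave for the system (an assumption)
suggest_mem_free() {
    awk -v host="$1" -v total="$2" -v margin="$3" 'BEGIN {
        # %d truncates to whole megabytes.
        printf "qconf -mattr exechost complex_values mem_free=%dM %s\n",
               total - margin, host
    }'
}

suggest_mem_free node12345 16052.765625 512
```

Review the printed command before running it on the qmaster host.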


Now, why does SGE not automatically provide a complex variable
and initialize it this way? Probably because it only knows that
a load value and a complex variable of the same name should be
treated this way.
It doesn't know that "mem_free" and "mem_total" are related,
and as described above, some people prefer slightly different
values.



Greetings,
-- 
                                Michael van Elst
Internet: mlelstv at serpens.de
                                "A potential Snark may lurk in every tree."

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=231645
