[GE users] virtual_free vs. [sh]_vmem SGE 6

Andreas Haas Andreas.Haas at Sun.COM
Tue Dec 7 14:12:59 GMT 2004

Hi Ian,

On Mon, 6 Dec 2004, Jan Y. Brueckner wrote:

> Hello,
> I'm wondering why this is happening:
> the setup is: dual processor host(s) w. 4G RAM
> queue with 2 slots, s_vmem limit at 2G (h_vmem slightly above)
> -> the limit is taken per slot; jobs with -l s_vmem > 2G do not get
> scheduled, and 2 jobs can run in parallel. This is the behavior wanted!


> Now, since there will be concurrent queues in the future to overcommit
> cputime while using free memory (virtual_free reports 6 GB and swapping
> should be avoided), one would set up a consumable for virtual_free and
> specify it with virtual_free=4G for each host or for the global target.
> The problem that arises with this configuration is that qsub -l
> vf=3G .... now works and the job is scheduled to the queue that imposes
> an s_vmem=2G limit. This seems somewhat dangerous to me, since the job
> certainly gets killed if it reaches the hard limit of 2.05G.
> Shouldn't there be some connection between vf and s_vmem?
> Naturally I can't define a limit for virtual_free in the queue, and
> setting the complex vf=2G for the queue would limit the whole queue to
> 2G, not just each slot.

I agree, and in fact I have already experimented somewhat with that idea
in the past. Please see the following code snippet in daemons/execd/load_avg.c:

#if 0
      /* report total RAM plus swap as the s_vmem/h_vmem load values,
         i.e. identical to "virtual_free" */
      if (!getenv("SGE_MAP_LOADVALUE")) {
         sge_add_double2load_report(lpp, "s_vmem",
                     mem_info.mem_total + mem_info.swap_total,
                     uti_state_get_qualified_hostname(), "M");
         sge_add_double2load_report(lpp, "h_vmem",
                     mem_info.mem_total + mem_info.swap_total,
                     uti_state_get_qualified_hostname(), "M");
      }
#endif

This was aimed at exactly that. In addition, I recommend using each host's
'virtual_total' value as the host-based capacity for h_vmem/s_vmem.
For that purpose, h_vmem/s_vmem must be defined as consumables.
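A minimal sketch of that configuration, assuming the standard SGE 6
complex(5) column layout and a host with 4G RAM plus 2G swap (so a
virtual_total of roughly 6G); the names and values are illustrative only:

```shell
# 1) Make s_vmem/h_vmem consumable, e.g. via "qconf -mc":
#
#    #name    shortcut  type    relop  requestable  consumable  default  urgency
#    s_vmem   s_vmem    MEMORY  <=     YES          YES         0        0
#    h_vmem   h_vmem    MEMORY  <=     YES          YES         0        0
#
# 2) Give each execution host a capacity equal to its virtual_total,
#    e.g. in "qconf -me <hostname>":
#
#    complex_values   s_vmem=6G,h_vmem=6G
```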

I'd be excited to hear about your experiences with such a set-up, as that
would help us decide whether it should be adopted in our regular builds,
and whether reporting virtual_free as s_vmem/h_vmem is suitable as an
install default. To make this a sound feature, certainly a little more
than just un-commenting the above code snippet is required. At this
stage, however, we primarily lack experience and proponents.
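With h_vmem/s_vmem set up as consumables in this way, the dangerous case
from the question above should be caught at scheduling time rather than
at run time; a hypothetical example:

```shell
# The request is checked both against the host consumable capacity and
# against the queue's s_vmem/h_vmem limits; since 3G exceeds the queue's
# ~2G limit, the job stays pending instead of being dispatched to that
# queue and killed later at the hard limit.
qsub -l h_vmem=3G job.sh
```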


To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net
