[GE users] vf, h_vmem or both?
neil at futurity.co.uk
Wed Jun 10 16:24:59 BST 2009
Many thanks for this Hristo.
Having real life results is always a great help.
I'll give it a try shortly.
From: icaci [mailto:hristo at mc.phys.uni-sofia.bg]
Sent: 10 June 2009 10:40
To: users at gridengine.sunsource.net
Subject: Re: [GE users] vf, h_vmem or both?
We use h_vmem for scheduling in our cluster installation. We've set h_vmem in each execution node's complex_values attribute, given it a default value of 2G in SGE's complex table, and made it a consumable. It works fine, and SGE kills jobs that run over the user-specified h_vmem value, but there are some gotchas:
- h_vmem limits virtual memory consumption, which for certain tasks can be far more than the physical memory actually used. Users therefore have to request more memory than they really need (especially for OpenMPI jobs that use shared memory segments for IPC), so we've set h_vmem on each exec host to 150% of its physical RAM. Since our nodes do not have swap partitions, the infamous Linux OOM killer kicks in from time to time...
- Some jobs allocate large amounts of RAM at the beginning and then release it. Since h_vmem is a consumable resource, some nodes end up with free RAM and CPU slots but no jobs can run there: the available h_vmem does not increase when RAM is freed, only when jobs finish.
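To make the second gotcha concrete, here is a toy Python model of how a consumable is booked on one exec host. It is only a sketch of the bookkeeping as described above, not SGE code: the ExecHost class, its method names, and the 8 GiB node are all made up for illustration.

```python
# Toy model of how a consumable such as h_vmem is booked on one exec host.
# All names and numbers here are illustrative, not SGE internals.

GIB = 1024 ** 3


class ExecHost:
    def __init__(self, ram_bytes, overcommit=1.5):
        # complex_values h_vmem set to 150% of physical RAM, as described above
        self.capacity = int(ram_bytes * overcommit)
        self.booked = {}  # job id -> requested h_vmem in bytes

    def available(self):
        # Remaining consumable: capacity minus the sum of all requests.
        return self.capacity - sum(self.booked.values())

    def try_start(self, job_id, h_vmem_request):
        # The request is compared with the remaining consumable, not with
        # the memory the running jobs are actually using at this moment.
        if h_vmem_request > self.available():
            return False
        self.booked[job_id] = h_vmem_request
        return True

    def finish(self, job_id):
        # The booking is returned only when the job ends.
        del self.booked[job_id]


host = ExecHost(ram_bytes=8 * GIB)        # 8 GiB RAM -> 12 GiB of h_vmem
print(host.try_start("job1", 10 * GIB))   # True
# job1 may have released most of its RAM by now, but the booking stays:
print(host.try_start("job2", 4 * GIB))    # False, only 2 GiB left to book
host.finish("job1")
print(host.try_start("job2", 4 * GIB))    # True
```

The point is that available() depends only on what jobs requested, so a job that frees its memory early still blocks the node until it exits.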
Hope that helps,
On 09.06.2009, at 13:43, futuritymmx wrote:
We have a consumable complex called virtual_free / vf which we use to make sure that jobs don't use more memory than a machine has available. The users specify the maximum amount of memory they expect their jobs to use.
However, some users underestimate the amount of memory they use and others have jobs with memory leaks. We'd like to start using h_vmem to kill jobs that consume more memory than expected.
I was wondering if h_vmem is automatically used by the grid engine and scheduler to work out where jobs should be run, so that too many memory-hungry jobs are not placed on the same machine? If not, will our users still need to specify the vf complex as well as the h_vmem complex?
Thanks for any help you can give.