[GE users] Using h_vmem to prevent out of control memory usage

mhanby mhanby at uab.edu
Mon Mar 29 19:01:06 BST 2010

Howdy (GE 6.2u5 on CentOS 5 x86_64),

I'm testing h_vmem as a way to keep jobs from getting out of control with their memory usage.

I ran my test job without h_vmem and it completed successfully, reporting
"Max vmem         = 1.920G".

I repeated the same job, this time adding "-l h_vmem=2G", and it fails very quickly:
+ java -jar test.jar

Error occurred during initialization of VM
Could not reserve enough space for object heap
Could not create the Java virtual machine.

I also tried "-l h_vmem=3G" and "-l h_vmem=3G,h_stack=32M", and they also fail very quickly with a similar error.

If I specify "-l h_vmem=10G,h_stack=32M" then the job runs successfully.
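
For reference, the runs above amount to roughly the following sketch. The wrapper script name test.sh is hypothetical (the real job script just runs "java -jar test.jar"), and DRY_RUN is a helper variable added for this sketch so that the qsub commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of the test matrix described above.
# ASSUMPTION: test.sh is a hypothetical wrapper that runs "java -jar test.jar".
JOB_SCRIPT=${JOB_SCRIPT:-test.sh}
# DRY_RUN=1 (the default) only prints each qsub command instead of submitting.
DRY_RUN=${DRY_RUN:-1}

submit() {
    # $1 is the resource list handed to "qsub -l"; empty means no limit requested.
    if [ -n "$1" ]; then
        set -- qsub -l "$1" "$JOB_SCRIPT"
    else
        set -- qsub "$JOB_SCRIPT"
    fi
    if [ "$DRY_RUN" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}

submit ""                          # completes, reports Max vmem = 1.920G
submit "h_vmem=2G"                 # fails: could not reserve object heap
submit "h_vmem=3G"                 # fails with a similar error
submit "h_vmem=3G,h_stack=32M"     # fails with a similar error
submit "h_vmem=10G,h_stack=32M"    # runs successfully
```

Run as-is, this only echoes the five command lines, which makes it easy to confirm exactly what would be submitted.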

While the job is running, every metric I look at shows memory staying below 2 GB.
qhost shows a MEMUSE of 990.8M:
compute-1-3             lx26-amd64      8  9.08   15.7G  990.8M  996.2M     0.0

qstat -j usage shows:
usage    1:                 cpu=00:38:59, mem=1470.58717 GBs, io=0.00000, vmem=1.262G, maxvmem=1.490G

Shouldn't my job run just fine with h_vmem=2G based on these results? If it matters, I haven't made any memory resource consumable yet; I want to better understand how this will affect the users first:
$ qconf -sc|grep mem
h_vmem              h_vmem     MEMORY      <=    YES         NO         0        0
mem_free            mf         MEMORY      <=    YES         NO         0        0
mem_total           mt         MEMORY      <=    YES         NO         0        0
mem_used            mu         MEMORY      >=    YES         NO         0        0
s_vmem              s_vmem     MEMORY      <=    YES         NO         0        0

Mike Hanby
mhanby at uab.edu
Information Systems Specialist II
IT HPCS / Research Computing

