[GE users] Grid Engine vs. Java Jobs

Andreas Haupt ahaupt at ifh.de
Fri Nov 25 09:11:32 GMT 2005


I have a problem with Java jobs running under Grid Engine on Scientific 
Linux 3. Grid Engine seems to add up the memory usage of all Java threads 
and treat the sum as the job's memory usage. That is not correct, because 
all threads of a process share the same memory.

As a result, Grid Engine kills the job:

11/25/2005 08:26:57|execd|globe1|W|job 791447 exceeds job hard limit
"h_vmem" of queue "globe-short.q at globe1.ifh.de" (2071642112.00000 >
limit:1887436800.00000) - sending SIGKILL
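
To put the two figures in that warning into more familiar units, here is a minimal Python sketch. It assumes both numbers are byte counts; the function name parse_limit_warning is made up for illustration and is not part of Grid Engine.

```python
import re

# The execd warning from above, joined onto a single line.
LOG = ('11/25/2005 08:26:57|execd|globe1|W|job 791447 exceeds job hard limit '
       '"h_vmem" of queue "globe-short.q at globe1.ifh.de" '
       '(2071642112.00000 > limit:1887436800.00000) - sending SIGKILL')

MIB = 1024 * 1024


def parse_limit_warning(line):
    """Return (measured, limit) taken from the '(x > limit:y)' part of the line.

    Hypothetical helper; the values are assumed to be bytes.
    """
    m = re.search(r'\((\d+(?:\.\d+)?) > limit:(\d+(?:\.\d+)?)\)', line)
    if m is None:
        raise ValueError('not a limit warning: %r' % line)
    return float(m.group(1)), float(m.group(2))


measured, limit = parse_limit_warning(LOG)
print('measured: %.1f MiB' % (measured / MIB))
print('limit:    %.1f MiB' % (limit / MIB))
print('over by:  %.1f MiB' % ((measured - limit) / MIB))
```

Under that assumption the limit works out to exactly 1800 MiB, and the measured value exceeds it by roughly 176 MiB.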

But the job does not use more than 2G of memory. Its actual memory usage 
is "only" about 700M, as you can see here:

ahaupt   21072 47.0  1.2 694312 24820 pts/1  R    09:46   0:01 
/opt/products/java/1.5.0/bin/java -DX509_USER_PROXY=/afs/ifh.de/user/a/ahaupt/.globus/proxy-cert 
-DRGMA_HOME=/opt/glite PrimaryProducerExample ahaupt21066 at globe1.ifh.de
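
As a rough cross-check, the following Python sketch pulls the two memory columns out of that process listing and compares them with the 2071642112 figure from the execd warning. It assumes the line follows the usual "ps aux" column layout (virtual size and resident size in KiB as the fifth and sixth fields) and that the execd figure is in bytes. The listing and the warning were not taken at the same moment, so the ratio is only indicative.

```python
# The process listing from above, joined onto a single line.
PS_LINE = ('ahaupt   21072 47.0  1.2 694312 24820 pts/1  R    09:46   0:01 '
           '/opt/products/java/1.5.0/bin/java '
           '-DX509_USER_PROXY=/afs/ifh.de/user/a/ahaupt/.globus/proxy-cert '
           '-DRGMA_HOME=/opt/glite PrimaryProducerExample '
           'ahaupt21066 at globe1.ifh.de')

# Value reported by execd in the kill message (assumed to be bytes).
REPORTED_BYTES = 2071642112


def parse_ps_memory(line):
    """Return (vsz_kib, rss_kib), assuming 'ps aux' column order."""
    fields = line.split(None, 10)  # the 11th field is the whole command
    return int(fields[4]), int(fields[5])


vsz_kib, rss_kib = parse_ps_memory(PS_LINE)
ratio = REPORTED_BYTES / (vsz_kib * 1024.0)

print('virtual size:  %.1f MiB' % (vsz_kib / 1024.0))
print('resident size: %.1f MiB' % (rss_kib / 1024.0))
print('reported / virtual size: %.2f' % ratio)
```

A ratio close to 3 would fit the suspicion that the virtual size of one process is being counted once per thread, but this sketch does not prove it.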

As a workaround you can tell Java to limit its memory consumption, like this:

java -Xms64m -Xmx128m

but this cannot be the final solution. Even then, Grid Engine seems to have 
trouble determining the job's maximum memory usage. The following lines 
are in chronological order and show the usage of the same job (qstat 
-j <jobid> | grep usage):

usage    1:                 cpu=00:00:22, mem=12.30469 GBs, io=0.00000, 
vmem=N/A, maxvmem=1.758G

usage    1:                 cpu=00:05:09, mem=170.50781 GBs, io=0.00000, 
vmem=N/A, maxvmem=469.355M

AFAIK maxvmem holds the highest memory usage the job has ever reached. Why 
does it shrink over time?
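
The inconsistency can be checked mechanically. This Python sketch parses the two usage lines and flags any sample where maxvmem is lower than in the previous one. It assumes the K/M/G suffixes are 1024-based multipliers; maxvmem_bytes is a made-up helper name, not a Grid Engine tool.

```python
import re

# The two "usage" lines from above, each joined onto a single line.
USAGE = [
    'usage    1:                 cpu=00:00:22, mem=12.30469 GBs, '
    'io=0.00000, vmem=N/A, maxvmem=1.758G',
    'usage    1:                 cpu=00:05:09, mem=170.50781 GBs, '
    'io=0.00000, vmem=N/A, maxvmem=469.355M',
]

# Assumption: upper-case suffixes are binary multiples.
UNITS = {'K': 1024, 'M': 1024 ** 2, 'G': 1024 ** 3}
MIB = 1024 ** 2


def maxvmem_bytes(line):
    """Extract the maxvmem value of one usage line, converted to bytes."""
    m = re.search(r'maxvmem=([\d.]+)([KMG])', line)
    if m is None:
        raise ValueError('no maxvmem value in: %r' % line)
    return float(m.group(1)) * UNITS[m.group(2)]


samples = [maxvmem_bytes(line) for line in USAGE]

# A true maximum can never decrease from one sample to the next.
drops = [(a, b) for a, b in zip(samples, samples[1:]) if b < a]
for earlier, later in drops:
    print('maxvmem dropped from %.0f MiB to %.0f MiB'
          % (earlier / MIB, later / MIB))
```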

My real problem is that I plan to open our local GE farm to the LHC 
computing grid. Jobs from all over the world will run on our farm. I have 
no influence on what those jobs will look like or on how they can work 
around Grid Engine bugs. The only solution would be to say goodbye to Grid 
Engine and use a batch system that can handle Java jobs.


| Andreas Haupt                      | E-Mail:  andreas.haupt at desy.de
|  DESY Zeuthen                      | WWW:     http://www.desy.de/~ahaupt
|  Platanenallee 6                   | Phone:   +49/33762/7-7359
|  D-15738 Zeuthen                   | Fax:     +49/33762/7-7216
