[GE users] Grid Engine vs. Java Jobs

Reuti reuti at staff.uni-marburg.de
Sat Nov 26 00:01:06 GMT 2005


Hi Andreas,

On 25.11.2005 at 10:11, Andreas Haupt wrote:

> Hello,
>
> I have a problem with Java jobs running under Grid Engine on
> Scientific Linux 3. Grid Engine seems to sum the memory usage of
> all Java threads and count the total as the job's memory usage. But
> that's not correct, as all threads share the same memory.
>
> This results in grid engine killing the job:
>
> 11/25/2005 08:26:57|execd|globe1|W|job 791447 exceeds job hard limit
> "h_vmem" of queue "globe-short.q at globe1.ifh.de" (2071642112.00000 >
> limit:1887436800.00000) - sending SIGKILL
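
For scale, the two byte counts in that warning can be converted to MiB. A minimal Python sketch; the helper name to_mib is only illustrative and not part of Grid Engine:

```python
MIB = 1024 * 1024  # bytes per mebibyte


def to_mib(n_bytes: float) -> float:
    """Convert a byte count to MiB."""
    return n_bytes / MIB


# Values copied from the execd warning quoted above.
reported = 2071642112
limit = 1887436800

print(f"limit    = {to_mib(limit):.2f} MiB")             # 1800.00 MiB
print(f"reported = {to_mib(reported):.2f} MiB")          # 1975.67 MiB
print(f"excess   = {to_mib(reported - limit):.2f} MiB")  # 175.67 MiB
```

So the queue's h_vmem limit is exactly 1800 MiB, and the value execd measured was about 176 MiB above it.
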
>
> But it doesn't use over 2G of main memory. The actual memory usage is
> "only" about 700M, as you can see here:
>
> USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
> ahaupt   21072 47.0  1.2 694312 24820 pts/1  R    09:46   0:01 /opt/products/java/1.5.0/bin/java -DX509_USER_PROXY=/afs/ifh.de/user/a/ahaupt/.globus/proxy-cert -DRGMA_HOME=/opt/glite PrimaryProducerExample ahaupt21066 at globe1.ifh.de
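
The "about 700M" figure corresponds to the VSZ column of that ps line, not to RSS. A small Python sketch of the conversion, assuming Linux ps reports both columns in KiB; the variable and function names are illustrative only:

```python
# Leading fields of the ps line quoted above (COMMAND column omitted).
ps_fields = "ahaupt 21072 47.0 1.2 694312 24820 pts/1 R 09:46 0:01".split()

# Column order follows the ps header: USER PID %CPU %MEM VSZ RSS ...
vsz_kib = int(ps_fields[4])
rss_kib = int(ps_fields[5])


def kib_to_mib(kib: int) -> float:
    """Convert KiB to MiB."""
    return kib / 1024


print(f"VSZ = {kib_to_mib(vsz_kib):.2f} MiB")  # 678.04 MiB
print(f"RSS = {kib_to_mib(rss_kib):.2f} MiB")  # 24.24 MiB
```

The virtual size is roughly 678 MiB, while only about 24 MiB was resident when ps ran.
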
>
> As a workaround you can tell java to limit its memory consumption  
> like this:
>
> java -Xms64m -Xmx128m
>
> but this cannot be the final solution. Even then, Grid Engine seems
> to have problems determining the job's maximum memory usage. The
> following lines are in chronological order and show the usage of the
> same job (qstat -j <jobid> | grep usage):
>
> usage    1:                 cpu=00:00:22, mem=12.30469 GBs, io=0.00000, vmem=N/A, maxvmem=1.758G
>
> usage    1:                 cpu=00:05:09, mem=170.50781 GBs, io=0.00000, vmem=N/A, maxvmem=469.355M
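
To compare such usage lines mechanically, they can be split into key/value pairs. A rough Python sketch; parse_usage and size_to_bytes are hypothetical helpers, and treating the K/M/G suffixes as multiples of 1024 is an assumption:

```python
import re

# Assumed binary multiples for the size suffixes.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_usage(line: str) -> dict:
    """Split a 'usage' line into a {name: raw string value} dict."""
    return dict(re.findall(r"(\w+)=([^,\s]+)", line))


def size_to_bytes(text: str) -> float:
    """Convert values such as '469.355M' to bytes; plain numbers pass through."""
    if text and text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


# The two lines quoted above, in the order they were reported.
first = "usage 1: cpu=00:00:22, mem=12.30469 GBs, io=0.00000, vmem=N/A, maxvmem=1.758G"
second = "usage 1: cpu=00:05:09, mem=170.50781 GBs, io=0.00000, vmem=N/A, maxvmem=469.355M"

before = size_to_bytes(parse_usage(first)["maxvmem"])
after = size_to_bytes(parse_usage(second)["maxvmem"])
print("maxvmem decreased:", after < before)  # True
```

The check confirms that the reported maxvmem dropped between the two samples even though the cpu time went up.
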

Which version of SGE are you using, and on which platform (why is
vmem=N/A for you)? What does

qacct -j 791447

tell you about the job?

Cheers - Reuti

> AFAIK maxvmem holds the highest memory usage the job has ever
> reached. Why does it shrink over time?
>
> My real problem is that I plan to open our local GE farm to the LHC
> computing grid. Jobs from all over the world will run on our farm.
> I have no influence on what the jobs will look like or whether they
> work around Grid Engine bugs. The only solution would be to say
> goodbye to Grid Engine and use a batch system that can handle Java
> jobs correctly.
>
> Greetings
> Andreas
>
> -- 
> | Andreas Haupt                      | E-Mail:  andreas.haupt at desy.de
> |  DESY Zeuthen                      | WWW:     http://www.desy.de/~ahaupt
> |  Platanenallee 6                   | Phone:   +49/33762/7-7359
> |  D-15738 Zeuthen                   | Fax:     +49/33762/7-7216
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net

