[GE users] jobs killed - job ... exceeds job hard limit "h_vmem" of queue ...

reuti reuti at staff.uni-marburg.de
Wed Aug 12 14:01:17 BST 2009


Hi,

On 12.08.2009 at 02:56, vinc17 wrote:

> Jobs are often killed with an error like:
>
>   job 15476 exceeds job hard limit "h_vmem" of queue
>   "cas at volla.lip.ens-lyon.fr" (540127232.00000 > limit: 536870912.00000)

Is it intended that the job uses just a little more than the limit,
since you request 540 MB in Maple? And is the limit set with a
lowercase (k, m, g: base 1000) or an uppercase (K, M, G: base 1024)
suffix in the queue configuration or in the qsub command?
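
To make the difference between the two suffix conventions concrete, here is a minimal sketch in Python. The helper name `to_bytes` is made up for illustration, and "512M" is only an assumption about how the limit might have been written: it is simply the value that yields the 536870912 bytes shown in the error message.

```python
# Lowercase suffixes multiply by powers of 1000, uppercase by powers of 1024.
MULTIPLIERS = {
    "k": 1000, "m": 1000**2, "g": 1000**3,
    "K": 1024, "M": 1024**2, "G": 1024**3,
}

def to_bytes(spec):
    """Convert a memory value such as '512M' or '540m' to bytes (hypothetical helper)."""
    if spec and spec[-1] in MULTIPLIERS:
        return int(spec[:-1]) * MULTIPLIERS[spec[-1]]
    return int(spec)

limit = to_bytes("512M")      # assumed spelling of the limit: 536870912 bytes
requested = to_bytes("540m")  # 540 MB read as base 1000: 540000000 bytes
usage = 540127232             # usage reported in the error message

print(limit, requested, usage - limit)
```

With these numbers the job is only 3256320 bytes (3180 KiB) over the limit, and a 540 MB request in base 1000 is already larger than a 512M limit in base 1024.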

-- Reuti


> However, at that point such a job uses very little memory.
>
> For instance, here's an excerpt of the log of one of the jobs
> (I've included ps output in the logs, for all the processes I
> own):
>
> ------------------------------------------------------------------------
>   RSS    SZ    SZ    VSZ CMD
>  6848  4464 33165 132660 perl -S /var/spool/sge/milip/volla/job_scripts/15478 -s=tomate --log --logtkill -r=1 -c=3 -l=4 -C --pgen=8 -o=-dmaple
>  6848  4464 33165 132660 perl -S /var/spool/sge/milip/volla/job_scripts/15505 -s=tomate --log --logtkill -r=1 -c=3 -l=4 -C --pgen=8 -o=-dmaple
>  6848  4464 33165 132660 perl -S /var/spool/sge/milip/volla/job_scripts/15517 -s=tomate --log --logtkill -r=1 -c=3 -l=4 -C --pgen=8 -o=-dmaple
>   496   552  1335   5340 ./tst-volla-1249985804-17760
>  6848  4464 33165 132660 perl -S /var/spool/sge/milip/volla/job_scripts/15518 -s=tomate --log --logtkill -r=1 -c=3 -l=4 -C --pgen=8 -o=-dmaple
>   500   552  1335   5340 ./tst-volla-1249996409-19447
>   500   552  1335   5340 ./tst-volla-1249935899-13082
>  5312  3628 27449 109796 perl ./test32f -b6790 -l4 -c -m3 -dmaple -n32 def/def-p1961 111010100001100000000000000000000000000000000000000000 0 1099511627776 tmp/tst-volla-1249998089-19796 300
>   848   604 22574  90296 ps -o rss,size,sz,vsz,cmd
>
> [2009-08-11 13:47:37] Maple: startmaple (@maple = maple -q -s)
> [2009-08-11 13:47:37] Maple: Maple started, pid = 19948
>   RSS    SZ    SZ    VSZ CMD
>  6848  4464 33165 132660 perl -S /var/spool/sge/milip/volla/job_scripts/15478 -s=tomate --log --logtkill -r=1 -c=3 -l=4 -C --pgen=8 -o=-dmaple
>  6848  4464 33165 132660 perl -S /var/spool/sge/milip/volla/job_scripts/15505 -s=tomate --log --logtkill -r=1 -c=3 -l=4 -C --pgen=8 -o=-dmaple
>  6848  4464 33165 132660 perl -S /var/spool/sge/milip/volla/job_scripts/15517 -s=tomate --log --logtkill -r=1 -c=3 -l=4 -C --pgen=8 -o=-dmaple
>   496   552  1335   5340 ./tst-volla-1249985804-17760
>  6848  4464 33165 132660 perl -S /var/spool/sge/milip/volla/job_scripts/15518 -s=tomate --log --logtkill -r=1 -c=3 -l=4 -C --pgen=8 -o=-dmaple
>   500   552  1335   5340 ./tst-volla-1249996409-19447
>   500   552  1335   5340 ./tst-volla-1249935899-13082
>  5488  3760 27482 109928 perl ./test32f -b6790 -l4 -c -m3 -dmaple -n32 def/def-p1961 111010100001100000000000000000000000000000000000000000 0 1099511627776 tmp/tst-volla-1249998089-19796 300
>  1332   544  2221   8884 /opt/maple/bin.X86_64_LINUX/cmaple -I /opt/maple/lib/include -q -s
>  2748  1728 25374 101496 /opt/maple/bin.X86_64_LINUX/mserver -kpipe 4 -I /opt/maple/lib/include -q -s
>  1176   492   659   2636 /opt/maple/bin.X86_64_LINUX/mfsd 7 10
>   852   604 22574  90296 ps -o rss,size,sz,vsz,cmd
>
> Maple 10.06, X86 64 LINUX, Oct 2 2006 Build ID 255401
> ------------------------------------------------------------------------
>
> But each time a job is killed (by SIGUSR2), the log stops at the line
> "Maple: Maple started, pid = ...", at which point there still seems to
> be enough memory.
> Does anyone know what's wrong?
>
> -- 
> Vincent Lefèvre <vincent at vinc17.org> - Web: <http://www.vinc17.org/>
> 100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
> Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=211922
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>
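
As a rough cross-check of the figures in the quoted ps output, the sketch below adds up the VSZ column (which ps reports in KiB) for the processes that appear to belong to the killed job in the second listing, and compares the total with the limit from the error message. Which rows belong to that job is an assumption made for illustration (one job-script perl, test32f, and the three Maple processes), as is the idea that the limit is checked against the summed virtual size of the job's processes.

```python
# VSZ values in KiB, copied from the second ps listing.
# The choice of rows is an assumption: the listing shows every process the
# user owns, not only those of the killed job.
vsz_kib = {
    "perl -S job script": 132660,
    "perl ./test32f": 109928,
    "cmaple": 8884,
    "mserver": 101496,
    "mfsd": 2636,
}

limit_bytes = 536870912  # limit from the error message
total_bytes = sum(vsz_kib.values()) * 1024

print(total_bytes, limit_bytes, total_bytes < limit_bytes)
```

Under these assumptions the total is 364138496 bytes, well below the limit, which matches the observation that the job looks small just after Maple starts. The snapshot says nothing about how much memory is allocated between that point and the kill.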

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=211991

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


