[GE users] h_vmem, virtual_free

Heywood, Todd heywood at cshl.edu
Fri Feb 23 17:55:53 GMT 2007


Hi Reuti,

Thanks for the reply. Answers:

a) I have not made any changes to h_vmem in the queues, i.e. it is set
to "INFINITY".

b) Thus h_data and h_stack are also "INFINITY". I assume h_stack doesn't
need to be limited, since h_vmem is unlimited?

c) and d) "ulimit -Hs" gives "unlimited" on the node, and also when the
command is submitted to the node via a qsub script. Also, "ulimit -Ha"
gives the same results outside and within SGE on the same node.

So, I'm still puzzled... thanks for any ideas.

Todd

P.S. Why is virtual_free a host attribute, while h_vmem and s_vmem are
queue attributes?



-----Original Message-----
From: Reuti [mailto:reuti at staff.uni-marburg.de] 
Sent: Friday, February 23, 2007 12:23 PM
To: users at gridengine.sunsource.net
Subject: Re: [GE users] h_vmem, virtual_free

Hi,

On 23.02.2007 at 17:15, Heywood, Todd wrote:

> Hi,
>
> I have a puzzle, where a certain program (vmatch) runs fine outside
> of SGE on a 2GB node. However, it does not run when submitted as a
> job to SGE, and using strace shows it runs into memory allocation
> errors. I have resource virtual_free defined as requestable and
> consumable, and h_vmem is requestable but NOT consumable. Using "-l
> virtual_free=1.9G" on a 2GB node, or "-l virtual_free=3.8G" for a
> 4GB node works as expected in that the job runs only when there is
> enough memory available and SGE subtracts the requested amount from
> virtual_free when the job starts running. However the job still
> gets the memory allocation error.
>
> However, if I use "-l h_vmem=4G" and submit the job to either a 4GB
> or 2GB (!) node, the job runs fine with no errors.
>
> This makes no sense to me, especially when the job runs on a 2GB
> node with h_vmem=4G specified. Can anyone explain?
>
> Here's the qhost output for a 4GB node. I'm not sure why h_vmem
> isn't reported (my global execution host reporting variables are
> defined to be: cpu, h_vmem, mem_free, np_load_avg, s_vmem,
> virtual_free).
h_vmem is a queue attribute, so qstat -F should show it.

a) Is any h_vmem defined in the queues? That value will be applied if
the user doesn't request one.

b) Some programs need h_stack limited to an even lower value, but only
when h_vmem is set to something other than unlimited. Note that SGE
also sets h_data and h_stack to the same value as h_vmem, unless
they are defined with a lower value than h_vmem.

c) What do "ulimit -Ha" and "ulimit -Hs" show on the node?

d) You could also run the commands from c) in a jobscript to check the
limits actually in effect for this job.
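
As a rough illustration of d), here is a minimal sketch of such a jobscript. It assumes bash is available on the execution host; the file name limits.sh and the function name report_limits are made up for this example. It only prints the hard limits the process sees, so the output from a direct run on the node can be compared with the output of the same script run as a job.

```shell
#!/bin/bash
# limits.sh (hypothetical name): print the hard resource limits seen by
# this process, so a direct run and an SGE job run can be compared.

report_limits() {
    # uname -n prints the node name, to confirm where the script ran
    echo "host: $(uname -n)"
    # Hard limits for stack, virtual memory and data segment; bash reports
    # sizes in kbytes, or "unlimited" when no limit is set
    echo "hard stack limit (ulimit -Hs): $(ulimit -Hs)"
    echo "hard virtual memory limit (ulimit -Hv): $(ulimit -Hv)"
    echo "hard data segment limit (ulimit -Hd): $(ulimit -Hd)"
}

report_limits
```

One way to use it would be to run it once directly on the node, once with "qsub limits.sh", and once with "qsub -l h_vmem=4G limits.sh", then compare the three outputs.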

-- Reuti

>
> [root@bhmnode2 tmp]# qhost -F -h blade1
> HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
> -------------------------------------------------------------------------------
> global                  -               -     -       -       -       -       -
> blade1                  lx24-amd64      4  0.17    3.9G  242.6M    1.0G   21.2M
>    hl:arch=lx24-amd64
>    hl:num_proc=4.000000
>    hl:mem_total=3.861G
>    hl:swap_total=1.004G
>    hl:virtual_total=4.865G
>    hl:load_avg=0.170000
>    hl:load_short=0.000000
>    hl:load_medium=0.170000
>    hl:load_long=0.230000
>    hl:mem_free=3.624G
>    hl:swap_free=1006.340M
>    hc:virtual_free=3.800G
>    hl:mem_used=242.598M
>    hl:swap_used=21.246M
>    hl:virtual_used=263.844M
>    hl:cpu=0.000000
>    hl:tmpfree=59.128G
>    hl:tmptot=64.702G
>    hl:tmpused=2.287G
>    hl:np_load_avg=0.042500
>    hl:np_load_short=0.000000
>    hl:np_load_medium=0.042500
>    hl:np_load_long=0.057500
>
> Thanks for any ideas!
>
> Todd Heywood
>

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net