[GE users] h_vmem, virtual_free

Reuti reuti at staff.uni-marburg.de
Sat Feb 24 10:47:03 GMT 2007


On 23.02.2007 at 18:55, Heywood, Todd wrote:

> Hi Reuti,
>
> Thanks for the reply. Answers:
>
> a) I have not made any changes to h_vmem in the queues, i.e. it is set
> to "INFINITY".
>
> b) Thus h_data and h_stack are also "INFINITY". h_stack doesn't need
> to be limited, since h_vmem is unlimited?
>
> c) and d) "ulimit -Hs" gives "unlimited" on the node, and also when the
> command is submitted to the node via a qsub script. Also, "ulimit -Ha"
> gives the same results outside and within SGE on the same node.
>
> So, I'm still puzzled... thanks for any ideas.
>
> Todd
>
> p.s. why is virtual_free a host attribute, but h_vmem/s_vmem a queue
> attribute?

Some explanations are in the man pages of "complex" and "queue_conf" (near the end).

-- Reuti


>
> -----Original Message-----
> From: Reuti [mailto:reuti at staff.uni-marburg.de]
> Sent: Friday, February 23, 2007 12:23 PM
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] h_vmem, virtual_free
>
> Hi,
>
> On 23.02.2007 at 17:15, Heywood, Todd wrote:
>
>> Hi,
>>
>> I have a puzzle, where a certain program (vmatch) runs fine outside
>> of SGE on a 2GB node. However, it does not run when submitted as a
>> job to SGE, and using strace shows it runs into memory allocation
>> errors. I have resource virtual_free defined as requestable and
>> consumable, and h_vmem is requestable but NOT consumable. Using "-l
>> virtual_free=1.9G" on a 2GB node, or "-l virtual_free=3.8G" for a
>> 4GB node works as expected in that the job runs only when there is
>> enough memory available and SGE subtracts the requested amount from
>> virtual_free when the job starts running. However, the job still
>> gets the memory allocation error.
>>
>> However, if I use "-l h_vmem=4G" and submit the job to either a 4GB
>> or 2GB (!) node, the job runs fine with no errors.
>>
>> This makes no sense to me, especially when the job runs on a 2GB
>> node with h_vmem=4G specified. Can anyone explain?
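[Editorial sketch of the mechanism under discussion, not of vmatch itself; the 10 MB cap is an arbitrary illustration. Under a hard virtual-memory limit, a large allocation fails in just the way strace reports for the job.]

```shell
# Illustrative only: run a command under a deliberately tiny hard
# virtual-memory limit (10240 KB = 10 MB) set in a subshell, and
# watch a ~50 MB allocation fail.
result=$(
  ulimit -v 10240 2>/dev/null   # cap the address space, in KB
  python3 -c 'bytearray(50 * 1024 * 1024)' 2>/dev/null \
    && echo ok || echo alloc_failed
)
echo "$result"
```

On Linux this prints "alloc_failed"; outside the subshell the limit no longer applies.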
>>
>> Here's the qhost output for a 4GB node. I'm not sure why h_vmem
>> isn't reported (my global execution host reporting variables are
>> defined to be: cpu, h_vmem, mem_free, np_load_avg, s_vmem,
>> virtual_free).
> h_vmem is a queue attribute, so qstat -F should show it.
>
> a) is there any h_vmem defined in the queues, which will be taken if
> the user doesn't request it?
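[Editorial note: a queue-level default for these limits could be checked with something like the following; "all.q" is only an example queue name.]

```
qconf -sq all.q | egrep 'h_vmem|h_data|h_stack'
```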
>
> b) some programs need h_stack limited to an even lower value, but
> only when h_vmem is other than unlimited. Note that SGE will also set
> h_data and h_stack to the same value as h_vmem, unless they are
> defined with a lower value than h_vmem.
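[Editorial note: if (b) applies, the usual approach is to request a lower stack limit alongside h_vmem; the values and "job.sh" below are placeholders.]

```
qsub -l h_vmem=1.9G -l h_stack=128M job.sh
```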
>
> c) what are "ulimit -Ha" and "ulimit -Hs" showing on the node?
>
> d) you could also use c) in a jobscript to check the defined limit
> for this job.
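[Editorial sketch: a minimal jobscript for (c)/(d) might look like this; it only prints the hard limits the job actually runs under, for comparison with the values seen outside SGE.]

```shell
#!/bin/sh
# Print the hard resource limits in effect for this process,
# so they can be compared with "ulimit -Ha" run outside SGE.
echo "hard limits inside the job:"
ulimit -Ha
echo "hard stack limit: $(ulimit -Hs)"
```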
>
> -- Reuti
>
>>
>> [root at bhmnode2 tmp]# qhost -F -h blade1
>>
>> HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
>> -------------------------------------------------------------------------------
>> global                  -               -     -       -       -       -       -
>> blade1                  lx24-amd64      4  0.17    3.9G  242.6M    1.0G   21.2M
>>    hl:arch=lx24-amd64
>>    hl:num_proc=4.000000
>>    hl:mem_total=3.861G
>>    hl:swap_total=1.004G
>>    hl:virtual_total=4.865G
>>    hl:load_avg=0.170000
>>    hl:load_short=0.000000
>>    hl:load_medium=0.170000
>>    hl:load_long=0.230000
>>    hl:mem_free=3.624G
>>    hl:swap_free=1006.340M
>>    hc:virtual_free=3.800G
>>    hl:mem_used=242.598M
>>    hl:swap_used=21.246M
>>    hl:virtual_used=263.844M
>>    hl:cpu=0.000000
>>    hl:tmpfree=59.128G
>>    hl:tmptot=64.702G
>>    hl:tmpused=2.287G
>>    hl:np_load_avg=0.042500
>>    hl:np_load_short=0.000000
>>    hl:np_load_medium=0.042500
>>    hl:np_load_long=0.057500
>>
>> Thanks for any ideas!
>>
>>
>>
>> Todd Heywood
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
