[GE users] Issue seen in 6.2U4 : memory values reported by SGE too low compared to top output on linux systems

hawson beckerjes at mail.nih.gov
Thu Aug 12 21:23:13 BST 2010


On Thu, Aug 12, 2010 at 04:19:26PM -0400, rayson wrote:
>Hi Jenny,
>
>What is the online job usage? You can get that from the output of
>"qstat -j <job>" while the job is running.

Just to chime in on this: we've seen memory issues using 6.2u5 as well.
I don't have any hard numbers handy, but the values reported by qstat
and qacct are roughly half the amounts reported by 'top', 'ps', and
similar tools.

This is on an lx26-amd64 box running a custom-compiled version of SGE.
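
To illustrate the modulo 2^32 wrap that jenny describes in the quoted
message below, here is a minimal Python sketch. It assumes the job's true
virtual size is the calloc() size plus a fixed overhead of about 13.074 MiB
(the figure from the 4 GiB run), and that the byte count is truncated to
32 bits somewhere along the way. Neither assumption has been checked against
the SGE source, and the names wrap32 and reported_vmem are made up for this
example.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Assumption: fixed per-process overhead, taken from the maxvmem of the
# 4 GiB run (13.074M) in the quoted qacct output.
OVERHEAD_BYTES = int(13.074 * MIB)


def wrap32(nbytes: int) -> int:
    """Truncate a byte count to an unsigned 32-bit value (modulo 2**32)."""
    return nbytes & 0xFFFFFFFF


def reported_vmem(alloc_gib: int) -> int:
    """Hypothetical value a 32-bit counter would show for a calloc of alloc_gib GiB."""
    return wrap32(alloc_gib * GIB + OVERHEAD_BYTES)


if __name__ == "__main__":
    for gib in (3, 7, 11, 4):
        r = reported_vmem(gib)
        print(f"calloc {gib:2d} GiB -> {r / GIB:.3f}G ({r / MIB:.3f}M)")
```

Under these assumptions the 3, 7 and 11 GiB cases all collapse to the same
value, and the 4 GiB case leaves only the overhead, which is the pattern in
the qacct numbers quoted below.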

>
>Rayson
>
>
>
>On Thu, Aug 12, 2010 at 7:21 AM, reuti <reuti at staff.uni-marburg.de> wrote:
>> Hi,
>>
>> Am 12.08.2010 um 04:27 schrieb jenny:
>>
>>> I can confirm that both the vmem and maxvmem values shown by qstat (and qacct) are reported modulo 2^32 bytes, at least on lx24-amd64. Here is the qacct output for a simple C program that calloc()s blocks of memory of various sizes:
>>>
>>> calloc 3 GiB:
>>> cpu          3.540
>>> mem          7.374
>>> maxvmem      3.013G
>>>
>>> calloc 7 GiB:
>>> cpu          8.120
>>> mem          22.271
>>> maxvmem      3.013G
>>>
>>> calloc 11 GiB:
>>> cpu          12.710
>>> mem          34.406
>>> maxvmem      3.013G
>>>
>>> calloc 4 GiB:
>>> cpu          4.730
>>> mem          0.012
>>> maxvmem      13.074M
>>>
>>> It looks like a bug in SGE to me: the vmem value is converted to 32 bits somewhere along the path (probably as early as in the shepherd). That results in an incorrect value for the time integral in "mem" as well.
>>>
>>> Has anybody else met the same problem?
>>
>> Is this a copy/paste of this post?
>>
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=247921
>>
>> -- Reuti
>>
>>
>>
>>>
>>> 2010-08-11
>>> Jenny_Lu
>>> lulh at genomics.org.cn
>>> Tel:075525273811
>>> Mobile:15986782583  62583
>>
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=273960
>>
>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>>
>
>------------------------------------------------------
>http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=274080

-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=274081
