[GE users] Re: Re: [GE users] Issue seen in 6.2U4 : memory values reported by SGE too low compared to top output on linux systems

jenny lulh at genomics.org.cn
Mon Aug 16 10:19:31 BST 2010



Is there any solution for this problem?
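
To illustrate why the figures quoted below look like a wraparound at 2^32 bytes, here is a minimal Python sketch. It is not SGE code. It assumes a fixed per-process overhead of about 13.074 MiB (taken from the maxvmem of the 4 GiB run), and the helper names truncated_vmem and fmt are made up for this illustration; fmt only approximates how qacct prints sizes.

```python
# Sketch: what a 64-bit byte count looks like after truncation to 32 bits.
GIB = 1 << 30
MIB = 1 << 20
WRAP = 1 << 32  # 2^32 bytes = 4 GiB

# Assumption: constant overhead inferred from the 4 GiB run (maxvmem 13.074M).
OVERHEAD = round(13.074 * MIB)


def truncated_vmem(true_bytes):
    """Hypothetical helper: keep only the low 32 bits of a byte count."""
    return true_bytes % WRAP


def fmt(n):
    """Hypothetical helper: print G at or above 1 GiB, otherwise M."""
    if n >= GIB:
        return f"{n / GIB:.3f}G"
    return f"{n / MIB:.3f}M"


for gib in (3, 7, 11, 4):
    true_bytes = gib * GIB + OVERHEAD
    print(f"calloc {gib:>2} GiB -> {fmt(truncated_vmem(true_bytes))}")
```

Under these assumptions the 3, 7 and 11 GiB runs all collapse to the same value, and the 4 GiB run leaves only the overhead, which matches the pattern in the qacct output.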


2010-08-16
________________________________
Jenny_Lu
lulh at genomics.org.cn
Tel:075525273811
________________________________
From: laotsao
Sent: 2010-08-13  04:35:24
To: users
Cc: users at gridengine.sunsource.net
Subject: Re: [GE users] Issue seen in 6.2U4 : memory values reported by SGE too low compared to top output on linux systems
Has anyone tested u6?
Sent from my iPad
On Aug 12, 2010, at 4:23 PM, hawson <beckerjes at mail.nih.gov> wrote:
> On Thu, Aug 12, 2010 at 04:19:26PM -0400, rayson wrote:
>> Hi Jenny,
>>
>> What is the online job usage? You can get that from the output of
>> "qstat -j <job_id>" while the job is running.
>
> Just to chime in on this:  we've seen memory issues using 6.2u5 as well.
> I don't have any hard numbers handy, but the values reported by qstat
> and qacct are roughly half the amount reported via 'top', 'ps', and
> similar tools.
>
> This is on an lx26-amd64 box, custom-compiled version of SGE.
>
>>
>> Rayson
>>
>>
>>
>> On Thu, Aug 12, 2010 at 7:21 AM, reuti <reuti at staff.uni-marburg.de> wrote:
>>> Hi,
>>>
>>> On 12.08.2010 at 04:27, jenny wrote:
>>>
>>>> I can confirm that both the vmem and maxvmem values shown by qstat (and qacct) are reported modulo 2^32 bytes, at least on lx24-amd64. Here is the qacct output for a simple C program that calloc()s blocks of memory of various sizes:
>>>>
>>>> calloc 3 GiB:
>>>> cpu          3.540
>>>> mem          7.374
>>>> maxvmem      3.013G
>>>>
>>>> calloc 7 GiB:
>>>> cpu          8.120
>>>> mem          22.271
>>>> maxvmem      3.013G
>>>>
>>>> calloc 11 GiB:
>>>> cpu          12.710
>>>> mem          34.406
>>>> maxvmem      3.013G
>>>>
>>>> calloc 4 GiB:
>>>> cpu          4.730
>>>> mem          0.012
>>>> maxvmem      13.074M
>>>>
>>>> It looks like a bug in SGE to me: the vmem value is truncated to 32 bits somewhere along the path (probably as early as in the shepherd). That also makes the time integral reported in "mem" incorrect.
>>>>
>>>> Has anybody else met the same problem?
>>>
>>> is this a copy/paste of this post?
>>>
>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=247921
>>>
>>> -- Reuti
>>>
>>>
>>>
>>>>
>>>> 2010-08-11
>>>> Jenny_Lu
>>>> lulh at genomics.org.cn
>>>> Tel:075525273811
>>>> Mobile:15986782583  62583
>>>
>>> ------------------------------------------------------
>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=273960
>>>
>>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>>>
>>
>
> --
> Jesse Becker
> NHGRI Linux support (Digicon Contractor)


