[GE users] Issue seen in 6.2U4 : memory values reported by SGE too low compared to top output on linux systems

laotsao laotsao at gmail.com
Thu Aug 12 21:46:12 BST 2010


Has anyone tested u6?



On Aug 12, 2010, at 4:23 PM, hawson <beckerjes at mail.nih.gov> wrote:

> On Thu, Aug 12, 2010 at 04:19:26PM -0400, rayson wrote:
>> Hi Jenny,
>> 
>> What is the online job usage? You can get that from the output of
>> "qstat -j <job>" while the job is running.
> 
> Just to chime in on this:  we've seen memory issues using 6.2u5 as well.
> I don't have any hard numbers handy, but the values reported by qstat
> and qacct are roughly half the amount reported by 'top', 'ps', and
> similar tools.
> 
> This is on an lx26-amd64 box, custom-compiled version of SGE.
> 
>> 
>> Rayson
>> 
>> 
>> 
>> On Thu, Aug 12, 2010 at 7:21 AM, reuti <reuti at staff.uni-marburg.de> wrote:
>>> Hi,
>>> 
>>> On 12.08.2010 at 04:27, jenny wrote:
>>> 
>>>> I can confirm that both the vmem and maxvmem values shown by qstat (and qacct) are reported modulo 2^32 bytes, at least on lx24-amd64. Here is the qacct output for a simple C program that calloc()s blocks of memory of various sizes:
>>>> 
>>>> calloc 3 GiB:
>>>> cpu          3.540
>>>> mem          7.374
>>>> maxvmem      3.013G
>>>> 
>>>> calloc 7 GiB:
>>>> cpu          8.120
>>>> mem          22.271
>>>> maxvmem      3.013G
>>>> 
>>>> calloc 11 GiB:
>>>> cpu          12.710
>>>> mem          34.406
>>>> maxvmem      3.013G
>>>> 
>>>> calloc 4 GiB:
>>>> cpu          4.730
>>>> mem          0.012
>>>> maxvmem      13.074M
>>>> 
>>>> It looks like a bug in SGE to me: the vmem value is converted to 32 bits somewhere along the path (probably as early as in the shepherd). That results in an incorrect value for the time integral in "mem" as well.
>>>> 
>>>> Has anybody else met the same problem?
>>> 
>>> Is this a copy/paste of this post?
>>> 
>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=247921
>>> 
>>> -- Reuti
>>> 
>>> 
>>> 
>>>> 
>>>> 2010-08-11
>>>> Jenny_Lu
>>>> lulh at genomics.org.cn
>>>> Tel:075525273811
>>>> Mobile:15986782583  62583
>>> 
>>> ------------------------------------------------------
>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=273960
>>> 
>>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>>> 
>> 
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=274080
>> 
> 
> -- 
> Jesse Becker
> NHGRI Linux support (Digicon Contractor)
> 
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=274081
> 

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=274083



