[GE users] Re: Re: [GE users] Issue seen in 6.2U4 : memory values reported by SGE too low compared to top output on linux systems

m0zes adam.tygart at gmail.com
Fri Aug 13 16:00:15 BST 2010



The bug still exists in 6.2u5. It was originally reported for u5.

--
Adam Tygart
Beocat Sysadmin

On Fri, Aug 13, 2010 at 09:58, ron <ron_chen_123 at yahoo.com> wrote:
>
> Do both SGE 6.2u5 and SGE 6.2u6 have this problem?
>
> If SGE 6.2u5 still has this bug, then I will see if I can get this fixed.
>
>  -Ron
>
>
> --- On Fri, 8/13/10, jenny <lulh at genomics.org.cn> wrote:
>
> Take the following job as an example: qstat reports its memory usage (vmem) as 3.887G, but top on the server shows it using more than 100g.
>
>
> # qstat -j 143871
> ==============================================================
> job_number:                 143871
> exec_file:                  job_scripts/143871
> submission_time:            Wed Aug 11 11:05:05 2010
> hard resource_list:         virtual_free=400G
> usage    1:                 cpu=8:20:05:15, mem=1940633.01964 GBs, io=1422.26572, vmem=3.887G, maxvmem=4.064G
>
>
> Mem:  1055302704k total, 833846592k used, 221456112k free,   110800k buffers
> Swap: 104864276k total,    21452k used, 104842824k free, 514290156k cached
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 16441 b  15   0  103g 101g  364 S 1571.4 10.1   2172:50 grape63mer
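
A rough consistency check on the two numbers above, as a sketch only. It assumes that the reported vmem has wrapped modulo 2^32 bytes (4 GiB) and that both the "G" in qstat and the "g" in top mean 2^30 bytes. The function names are made up for illustration and are not part of SGE or top.

```python
WRAP_GIB = 4.0  # 2**32 bytes expressed in GiB (assumed wrap point)

def unwrap_candidates(reported_gib, upper_gib):
    """Yield every true size (in GiB) that would wrap to reported_gib."""
    size = reported_gib
    while size <= upper_gib:
        yield size
        size += WRAP_GIB

def closest_candidate(reported_gib, observed_gib, upper_gib=1024.0):
    """Pick the unwrapped size nearest to what top shows."""
    return min(unwrap_candidates(reported_gib, upper_gib),
               key=lambda size: abs(size - observed_gib))

# vmem=3.887G from qstat, VIRT=103g from top
best = closest_candidate(3.887, 103.0)
print(f"closest size that wraps to 3.887G: {best:.3f}G")
```

The nearest candidate is 3.887G plus 25 wraps of 4 GiB, which is in the same range as the 103g VIRT that top displays. That does not prove the cause, but the two outputs are at least compatible with a 32-bit wrap.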
>
> 2010-08-13
> ________________________________
> Jenny_Lu
> lulh at genomics.org.cn
> Tel:075525273811
> Mobile:15986782583  62583
> ________________________________
> From: rayson
> Sent: 2010-08-13  04:20:06
> To: users
> Subject: Re: [GE users] Issue seen in 6.2U4 : memory values reported by SGE too low compared to top output on linux systems
> Hi Jenny,
> What is the online job usage? You can get that from the output of
> "qstat -j <jobid>" while the job is running.
> Rayson
> On Thu, Aug 12, 2010 at 7:21 AM, reuti <reuti at staff.uni-marburg.de> wrote:
> > Hi,
> >
> > On 12.08.2010 at 04:27, jenny wrote:
> >
> >> I can confirm that both the vmem and maxvmem values shown by qstat (and qacct) are reported modulo 2^32 bytes, at least on lx24-amd64. Here is the qacct output for a simple C program that calloc()s blocks of memory of various sizes:
> >>
> >> calloc 3 GiB:
> >> cpu          3.540
> >> mem          7.374
> >> maxvmem      3.013G
> >>
> >> calloc 7 GiB:
> >> cpu          8.120
> >> mem          22.271
> >> maxvmem      3.013G
> >>
> >> calloc 11 GiB:
> >> cpu          12.710
> >> mem          34.406
> >> maxvmem      3.013G
> >>
> >> calloc 4 GiB:
> >> cpu          4.730
> >> mem          0.012
> >> maxvmem      13.074M
> >>
> >> It looks like a bug in SGE to me: the vmem value is converted to 32 bits somewhere along the path (probably as early as in the shepherd). That results in an incorrect value for the time integral in "mem" as well.
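
The qacct figures quoted above can be reproduced with plain modulo arithmetic. This is a minimal sketch, assuming the byte count is truncated to an unsigned 32-bit value somewhere in the reporting path. It is not SGE code: `wrap_u32`, `reported_vmem` and `OVERHEAD` are hypothetical names, and the baseline overhead is simply taken from the 4 GiB run, where only 13.074M was reported.

```python
GIB = 1 << 30
MIB = 1 << 20

def wrap_u32(nbytes: int) -> int:
    """Truncate a byte count to an unsigned 32-bit value (modulo 2**32)."""
    return nbytes % (1 << 32)

def reported_vmem(alloc_gib: int, overhead_bytes: int) -> int:
    """Bytes that would be reported if the true size wrapped at 32 bits."""
    return wrap_u32(alloc_gib * GIB + overhead_bytes)

# Assumed baseline: the 13.074M seen for the 4 GiB run, where the
# allocation itself wraps to exactly zero.
OVERHEAD = int(13.074 * MIB)

for alloc in (3, 7, 11, 4):
    got = reported_vmem(alloc, OVERHEAD)
    print(f"calloc {alloc:2d} GiB -> reported {got / GIB:.3f}G ({got / MIB:.3f}M)")
```

Under that assumption the 3, 7 and 11 GiB runs all come out at the same value of about 3.013G, and the 4 GiB run comes out at the baseline alone, which matches the pattern in the qacct output.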
> >>
> >> Has anybody else met the same problem?
> >
> > Is this a copy/paste of this post?
> >
> > http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=247921
> >
> > -- Reuti
> >
> >
> >
> >>
> >> 2010-08-11
> >> Jenny_Lu
> >> lulh at genomics.org.cn
> >> Tel:075525273811
> >> Mobile:15986782583  62583
> >
> > ------------------------------------------------------
> > http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=273960
> >
> > To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
> >
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=274080

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=274304



