[GE users] Resource usage for OpenMP jobs

Ron Chen ron_chen_123 at yahoo.com
Sun Mar 26 18:35:15 BST 2006


Hi Jean-Paul,

My impression is that this is a Linux kernel bug rather than an
SGE bug, and that it has already been fixed!

top shows the CPU time (4227:13, in minutes:seconds):
28264 ron  25   0 28156  508  388 R 99.9  0.0   4227:13 a.out

Now for the /proc filesystem: the 14th field in "stat" is the
user-mode CPU time (utime) in clock ticks (1/100 second here):

> cat /proc/28264/stat
28264 (a.out) R 28013 28264 28013 0 -1 0 144 0 0 0 25371022 2559
0 0 25 0 4 0 124260482 28831744 127 18446744073709551615 4194304
4196244 548682068960 18446744073709551615 4195831 0 0 0 0
18446744073709551615 0 0 17 0 0 0
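
To make the field counting concrete, here is a minimal sketch of pulling
fields 14 (utime) and 15 (stime) out of a stat line. The function name
parse_stat_cpu_ticks is hypothetical, and the sample is the first part of
the output above. It splits after the last ")" because the command name
in field 2 may itself contain spaces.

```python
def parse_stat_cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from one /proc/<pid>/stat line."""
    # Field 2 (the command name) is wrapped in parentheses and may contain
    # spaces, so cut the line after the last ')' before splitting.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field 14 is rest[11], field 15 is rest[12].
    return int(rest[11]), int(rest[12])


# First fields of the /proc/28264/stat output quoted in this message.
sample = ("28264 (a.out) R 28013 28264 28013 0 -1 0 144 0 0 0 "
          "25371022 2559 0 0 25 0 4 0 124260482 28831744 127")

utime, stime = parse_stat_cpu_ticks(sample)
print("utime ticks:", utime, "stime ticks:", stime)
```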

At the thread level:
> cat /proc/28264/task/*/stat
28264 (a.out) R 28013 28264 28013 0 -1 0 144 0 0 0 6365072 649 0
0 25 0 4 0 124260482 28831744 127 18446744073709551615 4194304
4196244 548682068960 18446744073709551615 4195831 0 0 0 0 0 0 0
17 0 0 0
28265 (a.out) R 28013 28264 28013 0 -1 64 14 0 0 0 6351246 629 0
0 25 0 4 0 124260482 28831744 127 18446744073709551615 4194304
4196244 548682068960 18446744073709551615 4195831 0 0 0 0 0 0 0
-1 1 0 0
28266 (a.out) R 28013 28264 28013 0 -1 64 5 0 0 0 6332587 643 0
0 25 0 4 0 124260482 28831744 127 18446744073709551615 4194304
4196244 548682068960 18446744073709551615 4195831 0 0 0 0 0 0 0
-1 1 0 0
28267 (a.out) R 28013 28264 28013 0 -1 64 2 0 0 0 6335575 638 0
0 25 0 4 0 124260482 28831744 127 18446744073709551615 4194304
4196244 548682068960 18446744073709551615 4195831 0 0 0 0 0 0 0
-1 0 0 0

So if we look at /proc/28264/stat: 25371022/60/100 = 4228.503
minutes, which agrees with top.

And if we add up the CPU time of all the threads:
6365072+6351246+6332587+6335575 = 25384480
=> 25384480/60/100 = 4230.747 minutes
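
The arithmetic above can be re-checked with a few lines of Python. This
is only a sketch: the tick rate of 100 per second is the value assumed
in this message, and the helper name ticks_to_minutes is made up for
illustration.

```python
TICKS_PER_SECOND = 100  # assumption: 1/100 second ticks, as stated above


def ticks_to_minutes(ticks, hz=TICKS_PER_SECOND):
    """Convert a clock-tick count to minutes of CPU time."""
    return ticks / hz / 60


# utime (field 14) of the process and of each thread, copied from the
# /proc output in this message.
process_utime = 25371022
thread_utimes = [6365072, 6351246, 6332587, 6335575]

thread_total = sum(thread_utimes)
print("sum of thread ticks:", thread_total)  # 25384480
print("process-level minutes:", round(ticks_to_minutes(process_utime), 3))
print("thread-sum minutes:", round(ticks_to_minutes(thread_total), 3))
print("difference in ticks:", thread_total - process_utime)
```

The two totals differ by well under 0.1%, which is consistent with the
four stat files being read at slightly different moments while the job
is still running.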

Note that SGE currently scans only the first level of /proc (i.e.
the process level). I've attached a diff for issue 2009 that scans
the CPU usage of the threads. However, since SGE's current approach
also works, I don't think we need to check in the code now.
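
For anyone who wants to try the thread-level summation without applying
the diff, here is a rough Python sketch of the idea. It is not the SGE
patch itself; the function name sum_thread_cpu_ticks and the proc_root
parameter are hypothetical, the latter only so the function can be
pointed at a copy of /proc for testing.

```python
import glob
import os


def sum_thread_cpu_ticks(pid, proc_root="/proc"):
    """Sum utime and stime (in clock ticks) over all threads of a process."""
    total_utime = 0
    total_stime = 0
    pattern = os.path.join(proc_root, str(pid), "task", "*", "stat")
    for path in glob.glob(pattern):
        try:
            with open(path) as f:
                line = f.read()
        except OSError:
            # The thread may have exited between listing and reading.
            continue
        # Skip past the parenthesised command name, then index from
        # field 3: field 14 (utime) is [11], field 15 (stime) is [12].
        fields = line[line.rindex(")") + 1:].split()
        total_utime += int(fields[11])
        total_stime += int(fields[12])
    return total_utime, total_stime
```

On a Linux box, calling sum_thread_cpu_ticks(28264) for a running job
and comparing the result with field 14 of /proc/28264/stat should show
whether the two views agree.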

Can you verify on your system and see if /proc is giving you
similar results in terms of the CPU usage summation?

Or is there something special about the OpenMP job that is
preventing /proc from collecting/reporting the CPU time correctly?

 -Ron



--- Jean-Paul Minet <minet at cism.ucl.ac.be> wrote:
> you are right,  here is the output for one finished smp job:
> 
> qsub_time    Mon Mar  6 18:36:24 2006
> start_time   Mon Mar  6 21:10:42 2006
> end_time     Mon Mar 13 09:38:59 2006
> granted_pe   smp
> slots        2
> failed       0
> exit_status  0
> ru_wallclock 563297
> ru_utime     869097
> ru_stime     63187
> 
> thanks for your input
> 
> Jean-Paul
> 
> > If this is done correctly, then at least fair share usage is working
> > as expected... only that you won't see the CPU time used by the slave
> > threads until the end.
> > 
> > Rayson
> > 
> > 
>

