[GE users] PE Accounting

reuti reuti at staff.uni-marburg.de
Fri Nov 13 15:00:37 GMT 2009


On 13.11.2009 at 15:48, templedf wrote:

> Can someone point out what I'm missing here?  I have a tightly
> integrated parallel job.  It runs for about 11 minutes.  Just before it
> ends, I run a qstat -j, and I get the following usage line:
>
> usage    1:               cpu=00:40:53, mem=652.32980 GBs, io=0.00000, vmem=23.023M, maxvmem=50.125M
>
> As far as I can tell, it's working as intended.  I have 40 CPU-minutes
> in 11 minutes of execution, which corresponds to the 4 slots on which
> the job is running.  BUT, after the job ends, I run qacct -j, and it
> shows me:
>
> cpu          9.040
> mem          0.369
> io           0.000
> iow          0.000
> maxvmem      50.125M
>
> It got the maxvmem right, but the CPU time is clearly only for the
> master task.  Even better, a little further up in the qacct output we see:
>
> ru_wallclock 726
> ru_utime     1356.589
> ru_stime     104.770
>
> which says that the job consumed twice as much CPU time as wallclock
> time.  Huh?

Is it a job with 2 threads on the master node of the job? The ru_*
values are what the kernel sees on its own, while "cpu" is accounted
via the additional GID.
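
As a rough sanity check of the figures quoted above, here is a minimal Python sketch that divides each CPU-time value by the wallclock time. It assumes all qacct values are in seconds; the helper name busy_cores is made up for illustration.

```python
# Figures copied from the qstat -j and qacct -j output quoted above.
# Assumption: every qacct value is in seconds.
wallclock = 726                    # ru_wallclock
qstat_cpu = 40 * 60 + 53           # cpu=00:40:53 from qstat -j
rusage_cpu = 1356.589 + 104.770    # ru_utime + ru_stime from qacct -j
qacct_cpu = 9.040                  # "cpu" line from qacct -j


def busy_cores(cpu_seconds, wall_seconds):
    """Average number of cores kept busy over the run (hypothetical helper)."""
    return cpu_seconds / wall_seconds


for label, cpu in [("qstat cpu", qstat_cpu),
                   ("ru_utime+ru_stime", rusage_cpu),
                   ("qacct cpu", qacct_cpu)]:
    print(f"{label:18s} {cpu:9.3f} s  ~{busy_cores(cpu, wallclock):.2f} cores")
```

The ru_* sum comes out at roughly two busy cores, which would fit two threads on the master node, while the qstat value comes out at a bit more than three, which is closer to what 4 slots would give.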

>
> Why do I have three different CPU time values that don't agree with each
> other?  Am I just misunderstanding the numbers?

Which version of SGE are you using, and which startup method (builtin
or traditional rsh)?

How many records do you have in the qacct output for the job - one or
more?

What is the setting of "accounting_summary" in the PE?
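
To count the records, a small Python sketch like the following could be used. It assumes that each record in the qacct -j output starts with a line made up only of "=" characters; the function name and the sample text are hypothetical and heavily shortened.

```python
def count_records(qacct_text):
    """Count accounting records in qacct -j output.

    Assumption: every record begins with a separator line that
    consists only of '=' characters.
    """
    return sum(
        1
        for line in qacct_text.splitlines()
        if line.strip() and set(line.strip()) == {"="}
    )


# Hypothetical, shortened sample; real output has many more fields.
sample = """\
==============================================================
qname        all.q
hostname     node01
==============================================================
qname        all.q
hostname     node02
"""

print(count_records(sample))  # 2
```

For a real job, the text could be captured with subprocess.run(["qacct", "-j", job_id], capture_output=True, text=True).stdout, where job_id is a placeholder. More than one record would suggest that the slave tasks are being accounted separately from the master task.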

-- Reuti

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=226688
