[GE users] CPU time = Wallclock time?

reuti reuti at staff.uni-marburg.de
Wed Nov 5 17:22:24 GMT 2008


Hi,

On 05.11.2008 at 15:57, Wolfgang Friebel wrote:

> On Wed, 5 Nov 2008, orlandorichards wrote:
>
> I have noticed that behaviour as well. However, I believe that this
> need not be a bug, but could be a "feature" ;-)
>
> If I want to do the cpu accounting properly, I have to sum up
> utime+stime. Unfortunately there really seems to be a bug (in 6.0u9,
> which we are currently using): these parameters are not set if the job
> was killed by SGE for exceeding some limit.
>
> The recorded cpu value, however, seems to be the parameter that is used
> for job scheduling. You can give relative weights with the parameter
> usage_weight_list: cpu=wcpu,mem=wmem,io=wio (see man sched_conf) when
> using the sharetree algorithm.

I don't see this. Even with cpu=0,mem=1,io=0 I get the correct times
in qacct for cpu (admittedly, this is on 6.2).
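
To make the weighting concrete, here is a rough Python sketch of how I
read the description of usage_weight_list in man sched_conf: the three
values are multiplied by their weights and added up. The function names
are made up for illustration and this is not SGE code. The input values
are taken from the qacct record quoted further down.

```python
def parse_usage_weights(spec):
    """Parse a string such as 'cpu=0,mem=1,io=0' into a dict of floats."""
    weights = {}
    for item in spec.split(","):
        key, value = item.split("=")
        weights[key.strip()] = float(value)
    return weights


def combined_usage(cpu, mem, io, weights):
    """Weighted sum of the three usage values (assumed reading of sched_conf)."""
    return weights["cpu"] * cpu + weights["mem"] * mem + weights["io"] * io


weights = parse_usage_weights("cpu=0,mem=1,io=0")
# cpu=20, mem=40.020, io=0.000 as in the sample accounting record
usage = combined_usage(20, 40.020, 0.0, weights)
print(usage)
```

With these weights only the mem value contributes to the combined
usage. The cpu field itself is not changed by the weights, which matches
what I see in qacct.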

> As SGE cannot know the cpu/wallclock ratio for a given job, SGE has to
> calculate the tickets (before job start) based on the time requested
> (cpu or wallclock; it should not make a difference, as the ratio is
> unknown).

It knows the ratio. It knows when the job was started and where it
stands now. The recorded CPU is the integral of the job's %cpu over
that timeframe. The processes belonging to a job are detected by the
additional group id assigned to each job. This holds unless you set
ACCT_RESERVED_USAGE=TRUE (or, for now, because of the bug,
SHARETREE_RESERVED_USAGE=TRUE); see below.
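
As an illustration of "integral of the %cpu", here is a minimal Python
sketch. It assumes the %cpu of a job is sampled at known times and
integrates the samples with the trapezoidal rule. The sampling scheme
and the function name are my assumptions for the example; this is not
how the execd is necessarily implemented.

```python
def integrate_cpu(samples):
    """Integrate (timestamp_seconds, cpu_percent) samples into CPU seconds.

    The samples must be sorted by timestamp. The trapezoidal rule is used
    between neighbouring samples; 100 %cpu for 1 s counts as 1 CPU second.
    """
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += (t1 - t0) * (p0 + p1) / 2.0 / 100.0
    return total


# A job that keeps one core fully busy for 20 s
busy = [(0, 100.0), (10, 100.0), (20, 100.0)]
# A job that only sleeps for 20 s
idle = [(0, 0.0), (10, 0.0), (20, 0.0)]

cpu_busy = integrate_cpu(busy)
cpu_idle = integrate_cpu(idle)
print(cpu_busy, cpu_idle)
```

For the sleeping job the integral is zero although the wallclock time is
20 s, which is what one would expect to see in the cpu field when no
reserved usage is in effect.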

> That makes me believe that the parameter cpu is correctly reporting the
> used wallclock time when SHARETREE_RESERVED_USAGE=TRUE was given, as
> effectively the wallclock time was used in the share calculation.

This is simply all that SGE can do. Imagine you oversubscribe a node
with several queues and different nice values and/or a functional
policy. As nice values are relative in Linux, SGE would have to compute
the time the job would have got if it had put some load on the granted
cores in order to arrive at a truly "correct" ACCT_RESERVED_USAGE.

> This cpu parameter is even multiplied by a load scaling factor.
> This could be another bug in the code, as we observed that the factor
> is not related to the host where the job was running but is just one of
> the configured load scaling factors (the first one?)

Can you please post an example? You mean the load scaling in the  
exechost configuration - right?


> Whether the observed behaviour is really a bug or not can probably be
> answered only by the author of the code. The documentation at least
> leaves room for interpretation:
>     cpu    The cpu time usage in seconds.
> Could that be interpreted as "The cpu" "time usage", i.e. how long a
> cpu was blocked instead of how long the cpu was busy?

I wouldn't call it blocked. It's rather how long the job was on the
system, whether or not it put any load on it. This is the purpose of
ACCT_RESERVED_USAGE=TRUE. Maybe a reference to these two execd_params
should be added to man accounting.

As long as ACCT_RESERVED_USAGE=FALSE (once the bug is corrected), cpu
is the integrated time. Otherwise it is the time on a node multiplied by
the number of granted slots (supposing the node has more than one core),
e.g. for a threaded application without any qrsh calls.
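
To check which of the two cases a given accounting record shows, a
small Python sketch like the following could be used. It parses the
"key value" lines of qacct -j output and compares cpu with
ru_utime + ru_stime on the one hand and with ru_wallclock * slots on the
other. The function names, the labels and the 1 s tolerance are my own
choices for illustration, and treating ru_utime + ru_stime as a stand-in
for the integrated cpu is only an approximation.

```python
def parse_qacct(text):
    """Turn the 'key value' lines of qacct -j output into a dict of strings."""
    record = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        # Separator lines ('=====') and blank lines have fewer than two fields
        if len(parts) == 2:
            record[parts[0]] = parts[1].strip()
    return record


def classify_cpu(record, tolerance=1.0):
    """Guess whether the cpu field holds integrated or reserved usage."""
    cpu = float(record["cpu"])
    used = float(record["ru_utime"]) + float(record["ru_stime"])
    reserved = float(record["ru_wallclock"]) * int(record["slots"])
    near_used = abs(cpu - used) <= tolerance
    near_reserved = abs(cpu - reserved) <= tolerance
    if near_used and not near_reserved:
        return "integral"
    if near_reserved and not near_used:
        return "reserved"
    return "ambiguous"


# Excerpt of the "sleep 20" record quoted below
sample = """\
slots        1
ru_wallclock 20
ru_utime     0
ru_stime     0
cpu          20
"""
result = classify_cpu(parse_qacct(sample))
print(result)
```

For a job that really keeps its slots busy both values are close to
each other, so the sketch deliberately answers "ambiguous" in that case;
an idle job such as "sleep 20" is the easier test.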

-- Reuti


> -- 
> Wolfgang Friebel                   Deutsches Elektronen-Synchrotron DESY
> Phone/Fax:  +49 33762 77372/216    Platanenallee 6
> Mail: Wolfgang.Friebel AT desy.de  D-15738 Zeuthen  Germany
>
>> Reuti wrote:
>>> Hi,
>>>
>>> On 22.10.2008 at 12:53, Orlando Richards wrote:
>>>
>>>> We seem to have a problem with CPU time always being accounted as
>>>> equal to Wallclock time (or sometimes 1s higher), even if the job
>>>> is just a "sleep 20s" job. The UTIME and STIME report correctly
>>>> though.
>>>>
>>>> We're running SGE 6.1u4.
>>>>
>>>> We have
>>>> execd_params                 SHARETREE_RESERVED_USAGE=TRUE \
>>>>                              ACCT_RESERVED_USAGE=FALSE
>>>>
>>>> so would expect the CPU time to be recorded as roughly
>>>> UTIME + STIME, but this is not the case.
>>>>
>>>> I tried setting SHARETREE_RESERVED_USAGE to FALSE as well, to see
>>>> if it made any difference, and suddenly we get the expected
>>>> behaviour (CPU time = 0, wallclock = 20).
>>>>
>>>> Does anyone know if this is expected behaviour?
>>>
>>> Something is really broken (I checked in 6.2). The parameters seem
>>> to operate in such a way that SHARETREE_RESERVED_USAGE refers to the
>>> accounting file. Whether ACCT_RESERVED_USAGE in turn operates on the
>>> sharetree I didn't check.
>>>
>>> Changing SHARETREE_RESERVED_USAGE between TRUE and FALSE consistently
>>> shows a changed behaviour in the accounting record. This even works
>>> as expected for parallel jobs then.
>>>
>>>> Is there anything we can do to correct it?
>>>
>>> Fixing the source ;-) So an issue should be filed for it.
>>>
>>> -- Reuti
>>>
>>>
>>>> Sample qacct -j JOBID output for a 20s sleep job:
>>>>
>>>>
>>>> ==============================================================
>>>> qname        ecdf
>>>> hostname     node005.beowulf.cluster
>>>> group        is_iti_ug
>>>> owner        orichard
>>>> project      ecdf_baseline
>>>> department   defaultdepartment
>>>> jobname      simple.sh
>>>> jobnumber    1445888
>>>> taskid       undefined
>>>> account      sge
>>>> priority     5
>>>> qsub_time    Wed Oct 22 11:51:42 2008
>>>> start_time   Wed Oct 22 11:52:18 2008
>>>> end_time     Wed Oct 22 11:52:38 2008
>>>> granted_pe   NONE
>>>> slots        1
>>>> failed       0
>>>> exit_status  0
>>>> ru_wallclock 20
>>>> ru_utime     0
>>>> ru_stime     0
>>>> ru_maxrss    0
>>>> ru_ixrss     0
>>>> ru_ismrss    0
>>>> ru_idrss     0
>>>> ru_isrss     0
>>>> ru_minflt    1622
>>>> ru_majflt    0
>>>> ru_nswap     0
>>>> ru_inblock   0
>>>> ru_oublock   0
>>>> ru_msgsnd    0
>>>> ru_msgrcv    0
>>>> ru_nsignals  0
>>>> ru_nvcsw     30
>>>> ru_nivcsw    4
>>>> cpu          20
>>>> mem          40.020
>>>> io           0.000
>>>> iow          0.000
>>>> maxvmem      103.973M
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>             --
>>>>    Dr Orlando Richards
>>>>   Information Services
>>>> IT Infrastructure Division
>>>>        Unix Section
>>>>     Tel: 0131 650 4994
>>>>
>>>> The University of Edinburgh is a charitable body, registered in
>>>> Scotland, with registration number SC005336.
>>>>
>>>> ------------------------------------------------------------------- 
>>>> --
>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>> For additional commands, e-mail: users- 
>>>> help at gridengine.sunsource.net
>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=88106
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=88131

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


