[GE users] Arco tool results differ from qacct

Jana Olivova Jana.Olivova at Sun.COM
Mon May 21 16:40:39 BST 2007


Hi,

I don't see anything wrong with the query. You can also use the 
predefined Accounting per Department query, which does the same.

I checked my setup with a MySQL database and I get the same results from
both ARCo and qacct. I don't have any sensible data in my Postgres db,
because I was using the same grid with 3 different databases. So the
only month I can compare is this one:

qacct -b 200705010000 -e 200705312359
Total System Usage
    WALLCLOCK         UTIME         STIME           CPU             MEMORY                 IO                IOW
================================================================================================================
       889909             2            36           415              0.275              0.000              0.000

ARCo Accounting per Department

2007-05-01
department         cpu         mem                io
defaultdepartment  415.155821  0.275125999999997  0.0
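
To make the comparison concrete, here is a minimal Python sketch that checks whether the two sets of May figures agree once the precision qacct prints is taken into account. The numbers are copied by hand from the output above; the names QACCT, ARCO, PRINTED_DECIMALS and agrees_at_printed_precision are made up for this illustration and are not part of Grid Engine or ARCo.

```python
# Values copied by hand from the qacct and ARCo output above (May 2007).
QACCT = {"cpu": 415.0, "mem": 0.275, "io": 0.000}
ARCO = {"cpu": 415.155821, "mem": 0.275125999999997, "io": 0.0}

# Number of decimal places qacct shows for each column in the output above.
PRINTED_DECIMALS = {"cpu": 0, "mem": 3, "io": 3}


def agrees_at_printed_precision(qacct_value, arco_value, decimals):
    """Return True if the values differ by less than one unit of the last
    digit qacct prints. This tolerates both rounding and truncation."""
    return abs(arco_value - qacct_value) < 10 ** -decimals


def compare(qacct, arco, printed_decimals):
    """Compare every column and return a dict of column name -> bool."""
    return {
        column: agrees_at_printed_precision(
            qacct[column], arco[column], printed_decimals[column]
        )
        for column in qacct
    }


result = compare(QACCT, ARCO, PRINTED_DECIMALS)
for column, ok in result.items():
    print(f"{column}: {'match' if ok else 'MISMATCH'}")
```

The tolerance is deliberately loose (one unit of the last printed digit), because it is an assumption here whether qacct rounds or truncates when printing.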


One explanation for this, of course, would be that the same database
is used by more than one grid (which would inflate the ARCo totals)
and/or, for February, where ARCo reports less than qacct, that reporting
was not enabled the whole time. Not sure if that is a likely scenario
for you.

Regards,

Jana

John Mc-Nicholas XJ (GU/ETL) wrote:
> Hi Jana/Daniel
>
> In this case I use the sge_job_usage table, but I have also used the
> accounting database.
> Does qacct group jobs according to their start time? I've done the
> same in the SQL query, so this SQL should total up the memory
> (gigabyte seconds) for all the jobs started within each month.
>
>
> SQL:
> SELECT date_trunc('month', ju_start_time) AS month,
>        SUM(ju_mem) AS "mem"
> FROM sge_job_usage
> WHERE ju_start_time > (current_timestamp - interval '1 year')
> GROUP BY month
> ORDER BY month;
>
> Resulting table:
>
> month                  mem
> 2007-02-01 00:00:00.0   532138.750717
> 2007-03-01 00:00:00.0  5274933.144317
> 2007-04-01 00:00:00.0  6884688.555405
> 2007-05-01 00:00:00.0  2789895.540273
>
> Here are the results from the qacct command. Compare the MEMORY column
> with the table above. The results differ by a significant amount. A
> query on ju_cpu shows a similar discrepancy.
>
> qacct:
> johnick at seasub1[~]# qacct -b 200702010000 -e 200702312359
> Total System Usage
>     WALLCLOCK         UTIME         STIME           CPU             MEMORY                 IO                IOW
> ================================================================================================================
>       2433584        289462        131581        854446         567582.583              0.000              0.000
> johnick at seasub1[~]# qacct -b 200703010000 -e 200703312359
> Total System Usage
>     WALLCLOCK         UTIME         STIME           CPU             MEMORY                 IO                IOW
> ================================================================================================================
>       4753132       1041297         53389       2957120        3923641.991              0.000              0.000
> johnick at seasub1[~]# qacct -b 200704010000 -e 200704312359
> Total System Usage
>     WALLCLOCK         UTIME         STIME           CPU             MEMORY                 IO                IOW
> ================================================================================================================
>       6118415       2063020        140069       4094226        5743492.079              0.000              0.000
> johnick at seasub1[~]# qacct -b 200705010000 -e 200705312359
> Total System Usage
>     WALLCLOCK         UTIME         STIME           CPU             MEMORY                 IO                IOW
> ================================================================================================================
>       2746486        983188        156462       1761848        2388992.294              0.000              0.000
>
>
>
> -----Original Message-----
> From: Jana.Olivova at Sun.COM [mailto:Jana.Olivova at Sun.COM] 
> Sent: 18 May 2007 18:58
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Arco tool results differ from qacct
>
> I am having trouble replicating the issue, though. I keep running jobs
> (using Maintrunk GE) and the numbers keep matching.
>
> Jana
>
> Daniel Templeton wrote:
>> It may be worth noting that qacct and ARCo use different source data
>> files.  qacct uses the accounting file, and ARCo uses the reporting
>> file.  It is not inconceivable that there could be an issue such that
>> the qmaster might write different data to the two files in some cases.
>>
>> Just a thought.
>>
>> Daniel
>>
>> Jana Olivova wrote:
>>     
>>> Hi John,
>>>
>>> I could check on the Arco side. I have checked my data and they are 
>>> both the same, except the rounding that appears in qacct. I do have, 
>>> however, very small sample of data. Frankly, I am not sure what would
>>>       
>
>   
>>> cause this. Arco only inserts the data that is given to it by the 
>>> qmaster, in the reporting file.
>>>
>>> Can you tell me what sql query did you use to obtain the data in ARCo
>>>       
>
>   
>>> and what database are you using?
>>>
>>> Jana Olivova
>>>
>>> John Mc-Nicholas XJ (GU/ETL) wrote:
>>>       
>>>> Hi All
>>>>
>>>> I am basically having the same problem that Todd Heywood had earlier
>>>>         
>
>   
>>>> in the year.
>>>> He gave up on Arco tool in the end , I hope I haven't got to do the 
>>>> same.
>>>>
>>>>         
>>>>> / Heywood, Todd wrote:/ >/> How does ACRo report time and memory? I
>>>>>           
>>>> assumed it would be the same as/ >/> for qacct, for which it is 
>>>> seconds and Gbytes (according to "man/ >/> accounting"). But qacct 
>>>> and ACRo are reporting different numbers. Unit/ >/> conversions 
>>>> don't account for the diffs/
>>>>
>>>> The Arco Tool produces nice graphs and the SQL works fine but when I
>>>>         
>
>   
>>>> compare to the output of QACCT , it is a completely different set of
>>>>         
>
>   
>>>> results.
>>>>
>>>> There is some correlation between the data. For example, Aprils 
>>>> usage is the highest in both sets of results & The users with the 
>>>> most usage also correspond in both sets of data.
>>>> But the actual data seems to be randomly out by an order of 20-30%.
>>>>
>>>> I'm specifically trying to extract grid jobs memory (Gigabyte
>>>> seconds) per month
>>>> For example the data for April
>>>> qacct -b 200704010000 -e 200704312359 MEMORY 5743492.079
>>>>
>>>> But the output in arco gives.........
>>>> 6324866.240448
>>>>
>>>> Is this a bug in ARCO/GRID ?
>>>> What would cause this behaviour?
>>>>
>>>> The only strange thing I've noticed is that I have 2 dbwriter 
>>>> process instead of 1 & 5 postmaster instead of 3.
>>>>
>>>>
>>>> sgeadm    1430  1422  0   May 10 ?  0:00 /bin/sh /grid/dbwriter/util/dbwriter.sh
>>>> sgeadm    1422     1  0   May 10 ?  0:00 /bin/sh /grid/dbwriter/util/dbwriter.sh
>>>> postgres  1402  1401  0   May 10 ?  0:00 /usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/database -S
>>>> postgres  1403  1402  0   May 10 ?  0:01 /usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/database -S
>>>> postgres  1401     1  0   May 10 ?  0:04 /usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/database -S
>>>> postgres 13303  1401  0 16:29:34 ?  0:00 /usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/database -S
>>>> postgres  9719  1401  0 14:31:33 ?  0:20 /usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/database -S
>>>>
>>>> If you've any ideas please get back to me & I'll give you more 
>>>> detailed info.
>>>>
>>>> Best Regards
>>>>
>>>> John
>>>> */ John Mc Nicholas /*
>>>>
>>>> * STE/SEA Support Engineer *
>>>> * BETE Test Plants UK *
>>>> E
>>>>
>>>> Phone: +44 (0) 1483 305458
>>>> Email: john.xj.mc-nicholas at ericsson.com
>>>> Address: Ericsson, Midleton Gate, Guildford Business Park, 
>>>> Guildford, Surrey, GU2 8SG , UK
>>>>
>>> ---------------------------------------------------------------------
>>> ---
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>> For additional commands, e-mail: users-help at gridengine.sunsource.net

