[GE users] Sun Grid Engine 6.2 - ARCO dbwriter issue

Karen Magee magee at mayo.edu
Fri Sep 5 21:38:39 BST 2008


See inline ..
-----------------
On Fri, Sep 05, 2008 at 07:00:17PM +0200, Jana Olivova wrote:
> >>Is there anything useful in the dbwriter log file? 
> >>($SGE_ROOT/$SGE_CELL/spool/dbwriter/dbwriter.log), any errors, exceptions?
> >>
> >>    
> >unfortunately, no...just what looks to me to be a normal successful startup...
> >
> >04/09/2008 
> >11:14:06|dnode0-bkp.mayo.edu|.ReportingDBWriter.initLogging|I|Starting up 
> >dbwriter (Version 6.2) ---------------------------
> >04/09/2008 
> >11:14:06|dnode0-bkp.mayo.edu|r.ReportingDBWriter.initialize|I|Connection 
> >to db jdbc:mysql://rcfclusterdb.mayo.edu:3306/arco
> >04/09/2008 
> >11:14:06|dnode0-bkp.mayo.edu|r.ReportingDBWriter.initialize|I|Found 
> >database model version 8
> >04/09/2008 
> >11:14:07|dnode0-bkp.mayo.edu|er.file.FileParser.processFile|I|Renaming 
> >reporting  to reporting.processing
> >04/09/2008 11:14:07|dnode0-bkp.mayo.edu|iter.file.FileParser.parseFile|W|0 
> >lines marked as erroneous, these will be skipped
> >04/09/2008 
> >11:14:07|dnode0-bkp.mayo.edu|tingDBWriter.getDbWriterConfig|I|calculation 
> >file /home/sge6_2/dbwriter/database/mysql/dbwriter.xml has changed, reread 
> >it
> >04/09/2008 
> >11:14:13|dnode0-bkp.mayo.edu|ngDBWriter$StatisticThread.run|I|Next 
> >statistic calculation will be done at 9/4/08 12:14 PM
> >04/09/2008 
> >11:14:31|dnode0-bkp.mayo.edu|rtingDBWriter.logEventDuration|I|calculating 
> >derived values took 0 hours 0 minutes
> >  
> Is this the end of the log, or is there more and you had just copied a 
> snippet? There should also be lines in the log file about the deletion 
> time: 'deleting outdated values took X hours X minutes'. Are these 
> messages there? You can also check the Performance Query in the ARCo 
> web console, which shows the same information. Are there any 
> lines in the log that say 'processed X   lines in X minutes'?

..that is the entire log ...
The dbwriter query yields no fields with 'processed X   lines in X minutes'

> >  
> >>How have you figured out that it is taking a long time to remove the 
> >>old  records?
> >>
> >>    
> >just a guess from looking at the size (count) of the sge_host_values table.
> >It's decreasing...and the select command that PHP MySQL sees has that table
> >in it..After running overnight, we've seen a drop of 287,000 records in
> >the sge_host_values record count...but I haven't seen any of the "new"
> >data from jobs that have run in the last week or so show up yet..
> >  
> This is not an indication that anything is wrong. In the deletion rules 
> file the deletion for some host_values is set to 7 days. So, if the 
> dbwriter was not running for some 25 days, it would delete a 
> lot of records at once after restart. See: 
> http://wikis.sun.com/display/GridEngine/Derived+Values+and+Deletion+Rules. 
> Is there no new data being inserted in *any* tables? Check the sge_job table.
> >  

Nothing is going into the sge_job table - but I shouldn't expect it if
it's still doing the cleanup before it starts looking at the reporting file...

right?

> >-rw-r--r--  1 sgeadmin sgeadmin  27109033 Sep  5 09:50 reporting
> >-rw-r--r--  1 sgeadmin sgeadmin 360673112 Sep  4 11:13 reporting.processing
> >
> >[root at dnode0 common]# wc -l reporting*
> >   235247 reporting
> >  3138846 reporting.processing
> >
> >I'm concerned that I'm in a vicious cycle with the reporting file growing 
> >and
> >the reporting.processing file not finishing up.....and when it does we'll
> >be back to the same thing again..
> >  
> Hmm, if the log that you have shown me is the whole log, then it looks 
> like dbwriter is stuck somewhere. Try stopping the dbwriter, increase 
> the Debug level in the dbwriter.conf file, and start again, see if there is 
> anything else in the log.
> 
I've restarted with debugging...it's working on stuff...this query over and
over...about 1.5 minutes apiece..

05/09/2008 15:33:07|dnode0-bkp.mayo.edu|riter.db.Database.executeQuery|D|Execute sql: SELECT hv_id FROM sge_host_values WHERE hv_time_end < {ts '2008-08-15 11:00:00.0'} AND hv_variable IN ('np_load_avg', 'cpu', 'mem_free', 'virtual_free') limit 500
05/09/2008 15:34:35|dnode0-bkp.mayo.edu|iter.db.Database.executeUpdate|D|Execute sql: DELETE FROM sge_host_values WHERE hv_id IN (115390991,115390993,115390997,115391000,115391004,115391006,115391010,115391012,115391016,115391020,115391024,115391028,115391032,115391037,115391041,115391043,115391047,115391050,115391054,115391057,115391061,115391063,115391067,115391070,115391074,115391076,115391080,115391082,115391086,115391088,115391092,115391097,115391101,115391103,115391107,115391109,115391113,115391115,115391119,115391121,115391125,115391127,115391131,115391135,115391139,115391141,115391145,115391147,115391151,115391153,115391157,115391160,115391164,115391166,115391170,115391174,115391178,115391180,115391184,115391187,115391191,115391194,115391198,115391200,115391204,115391206,115391210,115391213,115391217,115391219,115391223,115391225,115391229,115391233,115391237,115391240,115391244,115391246,115391250,115391253,115391257,115391260,115391264,115391266,115391270,115391272,115391276,115391278,115391282,115391284,115391288,115391290,115391294,115391297,115391301,115391303,115391307,115391309,115391313,115391316,115391320,115391322,115391326,115391328,115391332,115391334,115391338,115391340,115391344,115391346,115391350,115391353,115391355,115391357,115391361,115391363,115391367,115391369,115391373,115391375,115391379,115391381,115391385,115391387,115391391,115391393,115391397,115391399,115391403,115391405,115391409,115391412,115391416,115391418,115391422,115391424,115391428,115391430,115391434,115391437,115391441,115391443,115391447,115391449,115391453,115391455,115391459,115391461,115391465,115391467,115391471,115391473,115391477,115391479,115391483,115391485,115391489,115391491,115391495,115391497,115391501,115391503,115391507,115391510,115391514,115391516,115391520,115391526,115391530,115391534,115391538,115391541,115391545,115391547,115391551,115391554,115391558,115391561,115391565,115391567,115391571,115391574,115391578,115391580,115391584,115391586,115391590,
115391592,115391596,115391601,115391605,115391607,115391611,115391613,115391617,115391619,115391623,115391625,115391629,115391631,115391635,115391639,115391643,115391645,115391649,115391651,115391655,115391657,115391661,115391664,115391668,115391671,115391675,115391677,115391681,115391683,115391687,115391691,115391695,115391697,115391701,115391703,115391707,115391710,115391714,115391716,115391720,115391723,115391727,115391729,115391733,115391737,115391741,115391744,115391748,115391750,115391754,115391756,115391760,115391763,115391767,115391770,115391774,115391776,115391780,115391782,115391786,115391789,115391793,115391795,115391799,115391801,115391805,115391807,115391811,115391813,115391817,115391820,115391824,115391826,115391830,115391832,115391836,115391838,115391842,115391844,115391848,115391850,115391854,115391858,115391860,115391861,115391865,115391867,115391871,115391873,115391877,115391879,115391883,115391885,115391889,115391891,115391895,115391897,115391901,115391903,115391907,115391909,115391913,115391916,115391920,115391922,115391926,115391928,115391932,115391934,115391938,115391941,115391945,115391947,115391951,115391953,115391957,115391959,115391963,115391965,115391969,115391971,115391975,115391977,115391981,115391983,115391987,115391989,115391993,115391995,115391999,115392001,115392005,115392007,115392011,115392014,115392018,115392020,115392024,115392030,115392034,115392038,115392042,115392045,115392049,115392051,115392055,115392059,115392063,115392065,115392069,115392071,115392075,115392077,115392081,115392084,115392088,115392090,115392094,115392096,115392100,115392104,115392108,115392111,115392115,115392117,115392121,115392123,115392127,115392130,115392134,115392136,115392140,115392143,115392147,115392149,115392153,115392155,115392159,115392161,115392165,115392169,115392173,115392175,115392179,115392181,115392185,115392187,115392191,115392195,115392199,115392201,115392205,115392207,115392211,115392214,115392218,115392220,115392224,115392227,115392231,
115392233,115392237,115392241,115392245,115392248,115392252,115392255,115392259,115392261,115392265,115392267,115392271,115392274,115392278,115392280,115392284,115392286,115392290,115392293,115392297,115392299,115392303,115392305,115392309,115392311,115392315,115392317,115392321,115392324,115392328,115392330,115392334,115392336,115392340,115392342,115392346,115392348,115392352,115392354,115392358,115392362,115392364,115392365,115392369,115392371,115392375,115392377,115392381,115392383,115392387,115392389,115392393,115392395,115392399,115392401,115392405,115392407,115392411,115392413,115392417,115392419,115392423,115392425,115392429,115392432,115392436,115392438,115392442,115392445,115392449,115392451,115392455,115392457,115392461,115392463,115392467,115392469,115392473,115392475,115392479,115392481,115392485,115392487,115392491,115392493,115392497,115392499,115392503,115392505,115392509,115392512,115392516,115392518,115392522,115392524,115392528,115392534,115392538,115392543,115392547,115392549,115392553,115392555,115392559,115392563,115392567,115392570,115392574,115392576,115392580,115392582,115392586,115392588,0)
05/09/2008 15:34:35|dnode0-bkp.mayo.edu|ng.dbwriter.db.Database.commit|D|Thread derived commits Connection 2 (null at jdbc:mysql://rcfclusterdb.mayo.edu:3306/arco)
05/09/2008 15:34:35|dnode0-bkp.mayo.edu|riter.db.Database.executeQuery|D|Execute sql: SELECT hv_id FROM sge_host_values WHERE hv_time_end < {ts '2008-08-15 11:00:00.0'} AND hv_variable IN ('np_load_avg', 'cpu', 'mem_free', 'virtual_free') limit 500
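
For a rough sense of the deletion rate, here is a minimal sketch in Python that
works only from the timestamps in the log excerpt above. It assumes the log
timestamps are dd/mm/yyyy and that each cycle removes one full batch of 500 ids
(the "limit 500" in the SELECT). The helper names parse_timestamp and
rows_per_hour are made up for illustration and are not part of dbwriter. Note
that the DELETE, the commit and the next SELECT are all stamped 15:34:35, so
nearly all of the ~1.5 minutes appears to go to the SELECT rather than the
DELETE.

```python
from datetime import datetime

BATCH_SIZE = 500  # assumption: taken from the "limit 500" in the logged SELECT


def parse_timestamp(log_line):
    """Return the leading 'dd/mm/yyyy HH:MM:SS' timestamp of a dbwriter log line."""
    stamp = log_line.split("|", 1)[0].strip()
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S")


def rows_per_hour(select_line, delete_line, batch_size=BATCH_SIZE):
    """Estimate deletion throughput from one SELECT/DELETE pair of log lines."""
    elapsed = parse_timestamp(delete_line) - parse_timestamp(select_line)
    return batch_size * 3600.0 / elapsed.total_seconds()


# The two log lines from the excerpt, with the SQL text shortened.
select_line = ("05/09/2008 15:33:07|dnode0-bkp.mayo.edu|"
               "riter.db.Database.executeQuery|D|Execute sql: SELECT hv_id ...")
delete_line = ("05/09/2008 15:34:35|dnode0-bkp.mayo.edu|"
               "iter.db.Database.executeUpdate|D|Execute sql: DELETE ...")

rate = rows_per_hour(select_line, delete_line)
print(f"about {rate:.0f} rows per hour")
```

With one batch every 88 seconds this works out to roughly 20,000 rows per hour,
which would be in the same ballpark as the 287,000-record drop seen overnight
if that ran for around 14 hours.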





-- 
-------
Karen Magee                 Unix Systems Coordinator - RCF
Mayo Clinic                 Internet: magee at mayo.edu
200 1st St SW               Phone: (507) 284-1806
Rochester, MN 55905
