[GE users] Monitoring Softwares...

Mark Westwood mark.westwood at ohmsurveys.com
Fri Mar 18 17:05:48 GMT 2005


Sriram

FWIW ...

We simply load the accounting file into a MySQL database. We currently do 
this once a month, but only because I produce the management reports 
monthly; there's no reason why we couldn't do it more or less often.

Once the accounting data is in MySQL it's very easy to get out the sort 
of information you want.  I have some SQL scripts which prepare tables 
of data for the management reports, and I use Matlab for the graphics, 
but Excel would do just as well.
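
To give a flavour of the loading step, here is a minimal sketch in Python. 
It is not the scripts described above: it uses the sqlite3 module from the 
standard library in place of MySQL so that it runs on its own, and the 
table and function names (jobs, parse_line, load_accounting) are made up 
for illustration. It assumes the colon-separated record layout documented 
in accounting(5) and keeps only the leading fields; check the field order 
and the time units against the man page for your version.

```python
import sqlite3

# Leading fields of an accounting record, in the order given by
# accounting(5). "group" is stored as "grp" because GROUP is an SQL keyword.
FIELDS = ["qname", "hostname", "grp", "owner", "job_name", "job_number",
          "account", "priority", "submission_time", "start_time",
          "end_time", "failed", "exit_status", "ru_wallclock"]

# Assumption: times are whole seconds since the epoch.
CONVERT = {"submission_time": int, "start_time": int,
           "end_time": int, "ru_wallclock": float}


def parse_line(line):
    """Return a tuple of the leading fields, or None for lines to skip."""
    if line.startswith("#") or not line.strip():
        return None
    parts = line.rstrip("\n").split(":")
    if len(parts) < len(FIELDS):
        return None
    return tuple(CONVERT.get(name, str)(value)
                 for name, value in zip(FIELDS, parts))


def load_accounting(conn, lines):
    """Insert every parseable record into a 'jobs' table; return the count."""
    conn.execute("CREATE TABLE IF NOT EXISTS jobs (%s)" % ", ".join(FIELDS))
    rows = [row for row in map(parse_line, lines) if row is not None]
    placeholders = ", ".join("?" * len(FIELDS))
    conn.executemany("INSERT INTO jobs VALUES (%s)" % placeholders, rows)
    conn.commit()
    return len(rows)
```

With the data loaded, a per-user summary is a single query, for example 
SELECT owner, SUM(ru_wallclock) FROM jobs GROUP BY owner. Passing an open 
file object for the accounting file as "lines" works, since the function 
only iterates over it.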

I wrote a shell script to read through the jobs in any period and 
determine how many CPUs were in use at any time.  That was the hardest 
part of the set up.  The 'logic' is easy enough, but it doesn't 
translate very easily into SQL.  The script does something like:

- select all jobs whose execution time overlaps with the period of 
interest which might be a day, a month, a week, even an hour;

- decide the sampling interval; for monthly reports the sampling 
interval is hourly, but the script can sample second-by-second;

- for each sampling interval, for each job (this is a horrible hack but 
it works) if the job started before the end of the sampling interval and 
its end time is later than the start of the interval, add one to the 
total number of CPUs in use for that interval;

- keep looping until the analysis is complete, file the data and plot a 
graph of hour-by-hour usage for the month.
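
The steps above can be sketched in a few lines of Python. This is an 
illustration under stated assumptions rather than the shell script itself: 
the function name cpus_in_use and its arguments are hypothetical, times 
are taken to be seconds since the epoch, and each job is counted as one 
CPU, as in the description.

```python
def cpus_in_use(jobs, period_start, period_end, step):
    """Count busy CPUs per sampling interval.

    jobs is an iterable of (start_time, end_time) pairs in epoch seconds.
    Returns a list of (interval_start, cpu_count) tuples.
    Assumption: every job occupies exactly one CPU.
    """
    # Keep only the jobs whose run time overlaps the period of interest.
    selected = [(start, end) for start, end in jobs
                if start < period_end and end > period_start]

    samples = []
    t = period_start
    while t < period_end:
        t_next = min(t + step, period_end)
        # A job counts towards this interval if it started before the
        # interval ended and finished after the interval began.
        busy = sum(1 for start, end in selected
                   if start < t_next and end > t)
        samples.append((t, busy))
        t = t_next
    return samples
```

Calling cpus_in_use(jobs, month_start, month_end, 3600) gives the 
hour-by-hour series for a month. Note that this counts every job that ran 
at any point during an interval, so with coarse intervals it reports the 
number of jobs seen in that hour rather than an average load.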

Hope this is some use to you.  I guess you could load the data directly 
into Excel or Matlab or whatever your favourite analysis package is, but 
a database gives you a lot of flexibility.  I have tables in the 
database for clients and projects so that my reports can show how much 
usage we're making of the cluster for each client, each user, that sort 
of thing.
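
As a rough illustration of the per-client reporting, here is a sketch in 
Python using sqlite3 so that it runs without a server. The schema is 
invented for the example: a projects table mapping a project name to a 
client, and a jobs table carrying owner, project and ru_wallclock. It is 
an assumption that each job record carries a project name that can be 
joined on; the real tables may be organised quite differently.

```python
import sqlite3


def create_schema(conn):
    """Create the two hypothetical tables used by the report."""
    conn.execute("CREATE TABLE projects (name TEXT PRIMARY KEY, client TEXT)")
    conn.execute(
        "CREATE TABLE jobs (owner TEXT, project TEXT, ru_wallclock INTEGER)")


def usage_by_client(conn):
    """Return (client, owner, total wallclock seconds) rows, sorted."""
    return conn.execute("""
        SELECT p.client, j.owner, SUM(j.ru_wallclock)
        FROM jobs AS j
        JOIN projects AS p ON p.name = j.project
        GROUP BY p.client, j.owner
        ORDER BY p.client, j.owner
    """).fetchall()
```

Jobs whose project has no entry in the projects table are dropped by the 
inner join; a LEFT JOIN would keep them with an empty client.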

Regards
Mark

Sriram Sitaraman wrote:
> Hi
> 
> 	Seems like this question has come up a few times with no real
> good solution. Is there an SGE-related monitoring system that
> consolidates some important values like
> 
> 	Machine load
> 	Machine CPU %
> 	Mem_Total
> 	Mem_Free
> 
> 	Jobs Submitted
> 	CPU/ Per User
> 	Jobs Pending
> 	Average Turn around Time / Average Wait Time/ Average Run Time 
> 	Job based timings
> 	
> 	Idle Jobs
> 	Job Jobs
> 
> We have been working on an interface, but managing the accounting file is
> very hard, as it grows very fast. Also we are on version 6.0. Currently
> some of the systems out there seem to be more cluster centric, but not
> related to SGE. 
> 
> Any help..
> 
> Sriram
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net

-- 
Mark Westwood
Parallel Programmer
OHM Ltd
The Technology Centre
Offshore Technology Park
Claymore Drive
Aberdeen
AB23 8GD
United Kingdom

+44 (0)870 429 6586
www.ohmsurveys.com
