[GE users] Accounting

Fred L Youhanaie fly at anydata.co.uk
Tue Apr 19 10:24:02 BST 2005

Hi Shaila,

A few years ago I did something similar for our PostgreSQL database. 
Because the raw data is in CPUs per job format you will need to do some 

The way I did it was to fetch the original data from the database 
table(s) using perl (DBI) and then convert it to a time series. For each 
job record of the form <stime,etime,cpus>, a number of new records would 
be generated: <stime,cpus> <stime+t,cpus> <stime+2t,cpus> ..., continuing 
until etime. Here stime/etime are the job start_time and end_time, cpus 
is the number of slots allocated to that job, and t is some suitable 
time period, e.g. 600s.

You would need to make sure that stime/etime for all jobs are aligned to 
the same time period boundary; for example, for t=600 we would have 0, 
10, 20, 30, 40 and 50 minutes past the hour for all jobs. You would also 
need to take care of the case where the entire job is within a single 
period, e.g. 9:03-9:08. For this reason I allowed the cpu count to have 
fractions.
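
As a rough illustration of the slicing described above, here is a 
minimal Python sketch (not the original perl script). It assumes times 
are integer epoch seconds, and it takes one possible reading of 
"fractions": each slice's cpu count is scaled by the share of the period 
the job actually occupied. The name slice_job is made up for this 
example.

```python
def slice_job(stime, etime, cpus, t=600):
    """Split one job into (period_start, cpus) slices aligned to multiples of t.

    Assumption: stime/etime are integer epoch seconds. A slice that the job
    only partly covers gets a proportional (fractional) cpu count.
    """
    slices = []
    if etime <= stime:
        return slices                      # zero-length or bad record: nothing to emit
    period = (stime // t) * t              # align the start down to a period boundary
    while period < etime:
        # seconds of this period during which the job was running
        overlap = min(etime, period + t) - max(stime, period)
        slices.append((period, cpus * overlap / t))
        period += t
    return slices
```

With t=600, a 4-cpu job running 9:03-9:08 falls entirely inside the 9:00 
period and occupies half of it, so it contributes a single slice with a 
cpu count of 2.0.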

Once you have these slices for all the jobs, it is just a matter of 
taking the collection of <time,cpus> records and summing up the cpus 
where the time value is the same.
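
The summing step can be sketched in a few lines of Python, assuming the 
slices have already been collected into one list of (time, cpus) pairs. 
The function name sum_slices is hypothetical. The same result could be 
had in the database itself with a GROUP BY on the time column.

```python
from collections import defaultdict

def sum_slices(slices):
    """Add up cpus for all slices that share the same time value.

    `slices` is any iterable of (time, cpus) pairs; the result is a list of
    (time, total_cpus) pairs sorted by time, ready for plotting.
    """
    totals = defaultdict(float)
    for time, cpus in slices:
        totals[time] += cpus
    return sorted(totals.items())
```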

If you replace <stime,etime,cpus> with <stime,etime,1>, you can plot the 
number of active jobs against time. And <qtime,stime,1>, where qtime is 
submit_time, will let you plot the number of waiting jobs against time. 
So, it is worthwhile spending some time on the first part of the script, 
because once you have that you can plot all sorts of things against time.
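
To show how one slicing routine can serve all three plots, here is a 
self-contained Python sketch. It assumes each job is a tuple 
(qtime, stime, etime, cpus) of integer epoch seconds plus a slot count. 
That layout and all function names are assumptions made for the example, 
not the format of the accounting file.

```python
from collections import defaultdict

def to_series(intervals, t=600):
    """Turn (start, end, weight) intervals into a sorted [(period_start, total)] list."""
    totals = defaultdict(float)
    for start, end, weight in intervals:
        if end <= start:
            continue                       # e.g. a job that never waited
        period = (start // t) * t          # align down to a period boundary
        while period < end:
            overlap = min(end, period + t) - max(start, period)
            totals[period] += weight * overlap / t
            period += t
    return sorted(totals.items())

# Each job is assumed to be (qtime, stime, etime, cpus).
def cpus_in_use(jobs, t=600):
    return to_series([(s, e, c) for q, s, e, c in jobs], t)

def active_jobs(jobs, t=600):
    return to_series([(s, e, 1) for q, s, e, c in jobs], t)

def waiting_jobs(jobs, t=600):
    return to_series([(q, s, 1) for q, s, e, c in jobs], t)
```

Only the interval and the weight change between the three views, which 
is why the time-slicing part is worth getting right first.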

I no longer have access to my old scripts so I cannot give you a copy, 
sorry :(

Good luck


Shaila Parashar wrote:
> Hi
> We have a cluster consisting of 4 nodes running SGE 6.0u3. The number 
> of CPUs on these nodes are 24, 12, 8 and 4. We have been running this 
> cluster since Aug 2004 and are now in need of some statistics. I read 
> the mailing lists and did get some ideas about the statistics. I also 
> imported the accounting file into MySQL. I did manage to get statistics 
> on the jobs, such as average CPU time used, average waiting time, etc. 
> But we need statistics similar to the following:
> Number of CPUs used versus time. Also number of CPUs used on each of 
> the hosts vs time.
> Basically we need plots vs time.
> I wanted to know if this is possible from the SGE accounting files and 
> if so how ?
> I would appreciate it if you can suggest any other values that we 
> can plot against time.
> Any ideas/suggestions on how to get these values (if possible) will be 
> really appreciated.
> Thanks
> Shaila
