[GE users] Accounting

Shaila Parashar shaila at engr.colostate.edu
Thu Apr 21 17:34:30 BST 2005


Right now I have installed Ganglia, and we are also using the SRS 
software to monitor the server. Since these nodes are only used for 
grid jobs, I am assuming that these two might help. We are still looking 
into them. If they don't do what we want, then the next resort will be 
similar to what you suggested.
Thanks a lot for your suggestion.

Shaila


Fred L Youhanaie wrote:

>
> Hi Shaila,
>
> A few years ago I did something similar for our PostgreSQL database. 
> Because the raw data is in CPUs-per-job format, you will need to do 
> some manipulation.
>
> The way I did it was to get the original data from the database 
> table(s) using perl (DBI) and then convert it to a time series: 
> for each job record of the form <stime,etime,cpus>, a number of 
> new records would be generated, <stime,cpus> <stime+t,cpus> 
> <stime+2t,cpus> ..., where stime/etime are the job start_time and 
> end_time, cpus is the number of slots allocated to that job and t is 
> some suitable time period, e.g. 600s.
>
> You would need to make sure that stime/etime for all jobs are aligned 
> to the same time period boundary; for example, for t=600 we would have 
> 0, 10, 20, 30, 40 and 50 past the hour for all jobs. You would also need 
> to take care of the case where the entire job is within a single 
> period, e.g. 9:03-9:08. For this reason I allowed the cpu count to have 
> fractions.
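
A minimal sketch of the slicing step described above, in Python rather
than the perl mentioned. It assumes stime/etime are integer epoch
seconds and takes one possible reading of the fractional cpu counts:
each slice is weighted by how much of the period the job actually
covered. The name slice_job is made up for illustration.

```python
from typing import Iterator, Tuple


def slice_job(stime: int, etime: int, cpus: float,
              t: int = 600) -> Iterator[Tuple[int, float]]:
    """Yield (period_start, cpus) records for one job.

    period_start is always a multiple of t, so slices from different
    jobs line up. cpus is scaled by the fraction of the period the job
    occupied (an assumed interpretation of "fractions").
    """
    if etime <= stime:
        return  # nothing to slice for zero-length or bad records
    period = (stime // t) * t  # align down to the period boundary
    while period < etime:
        overlap = min(etime, period + t) - max(stime, period)
        yield period, cpus * overlap / t
        period += t


if __name__ == "__main__":
    # A 4-slot job running 9:03-9:08, expressed as seconds since midnight.
    for record in slice_job(9 * 3600 + 180, 9 * 3600 + 480, 4):
        print(record)
```

With this weighting, a job that fits entirely inside one period
contributes a single fractional record instead of a full count.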
>
> Once you have these slices for all the jobs, it is just a matter 
> of taking the collection of <time,cpus> for all the jobs and summing 
> up the cpus where the time value is the same.
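
The summing step could look roughly like this in Python. It assumes the
slices are already available as (time, cpus) pairs with aligned time
values; sum_slices and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def sum_slices(slices: Iterable[Tuple[int, float]]) -> Dict[int, float]:
    """Add up the cpus of all slices that share the same time value."""
    totals: Dict[int, float] = defaultdict(float)
    for time, cpus in slices:
        totals[time] += cpus
    # Sort by time so the result can be fed straight to a plotting tool.
    return dict(sorted(totals.items()))


if __name__ == "__main__":
    # Made-up slices from two jobs that overlap in the first period.
    sample = [(0, 2.0), (600, 2.0), (0, 1.5)]
    for time, cpus in sum_slices(sample).items():
        print(time, cpus)
```

Keying the totals on (hostname, time) instead of time alone should give
the per-host figures, provided the host is carried along in each slice.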
>
> If you replace <stime,etime,cpus> with <stime,etime,1>, you can plot 
> the number of active jobs against time. And <qtime,stime,1>, where 
> qtime is submit_time, will let you plot the number of waiting jobs 
> against time. So, it is worthwhile spending some time on the first 
> part of the script, because once you have that you can plot all sorts 
> of things against time.
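
To illustrate the substitution, here is a small Python sketch that turns
job records into the three kinds of <start,end,weight> triples mentioned
above. The Job type and the function names are assumptions made for this
example; the field comments map them to the names used in the text.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple

Interval = Tuple[int, int, int]  # (start, end, weight)


class Job(NamedTuple):
    qtime: int  # submit_time
    stime: int  # start_time
    etime: int  # end_time
    cpus: int   # slots allocated to the job


def cpu_intervals(jobs: Iterable[Job]) -> Iterator[Interval]:
    """<stime,etime,cpus>: CPUs in use against time."""
    return ((j.stime, j.etime, j.cpus) for j in jobs)


def active_intervals(jobs: Iterable[Job]) -> Iterator[Interval]:
    """<stime,etime,1>: number of running jobs against time."""
    return ((j.stime, j.etime, 1) for j in jobs)


def waiting_intervals(jobs: Iterable[Job]) -> Iterator[Interval]:
    """<qtime,stime,1>: number of queued jobs against time."""
    return ((j.qtime, j.stime, 1) for j in jobs)


if __name__ == "__main__":
    jobs = [Job(qtime=0, stime=600, etime=1800, cpus=4)]
    print(list(cpu_intervals(jobs)))
    print(list(active_intervals(jobs)))
    print(list(waiting_intervals(jobs)))
```

Each list of triples can then go through the same slicing and summing
code, which is why only the first part of the script needs real effort.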
>
> I no longer have access to my old scripts so I cannot give you a copy, 
> sorry :(
>
> Good luck
>
> f.
>
>
>
>
> Shaila Parashar wrote:
>
>> Hi
>>
>> We have a cluster consisting of 4 nodes running SGE 6.0u3. The 
>> nodes have 24, 12, 8 and 4 CPUs. We have been 
>> running this cluster since Aug 2004 and are now in need of some 
>> statistics. I read the mailing lists and did get some ideas about the 
>> statistics. I also imported the accounting file into MySQL. I did 
>> manage to get statistics on the jobs, such as average CPU time used, 
>> average waiting time, etc. But we need statistics similar to the 
>> following:
>>
>> Number of CPUs used versus time. Also number of CPUs used on each of 
>> the hosts vs time.
>>
>> Basically we need plots vs time.
>>
>> I wanted to know if this is possible from the SGE accounting files 
>> and, if so, how?
>>
>> I would appreciate it if you can suggest any other values that we 
>> can plot against time.
>>
>> Any ideas/suggestions on how to get these values (if possible) will 
>> be really appreciated.
>>
>> Thanks
>>
>> Shaila
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net


-- 
*****************************************************************
Shaila Parashar			e-mail:shaila at engr.colostate.edu
UNIX System Administrator	tel:- (970)-491-6555
Engineering Network Services
Colorado State University
Fort Collins, CO 80523-1301
******************************************************************
" Smile is a curve that sets things straight. " 



