[GE users] SGE 6.2u5 Email Summary CPU time

reuti reuti at staff.uni-marburg.de
Fri Feb 26 12:00:12 GMT 2010


On 25.02.2010 at 20:35, mhanby wrote:

> Howdy,
> I ran a test running a 4 slot roughly 3 hour duration job on 6.2u5.  
> The OpenMPI pe has " accounting_summary TRUE ".
> The qacct output for the job correctly reports the cpu time as
> "42300.975", which is 11.75 hours, or roughly 2:56:00 hours per slot.
> The job email, however, reports CPU as 02:54:20 for the job.
> This should read something like 11:45:00
> The email results for max vmem, user time, system time etc... also  
> don't match up with the actual results listed in qacct
> I guess the email still only reflects the master process for the job.

Correct, but you can file an enhancement issue for it.

The emails are sent from the nodes. But all qrsh tasks which ended
earlier have already reported their usage to the qmaster, which writes
the accounting record a few seconds after the job has finished. Hence
this final notification email would have to be sent from the qmaster.

Rough idea for the qmaster as a workaround:

tail -F --follow=name -n 0 $SGE_ROOT/default/common/accounting 2>/dev/null | my_mail_script &

and my_mail_script is an endless loop which extracts <job_id> and
<user> from each entry, then issues a `qacct -j <job_id>` and sends
the well-formatted output to that user (well, you can't use -M any
more then).
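
A minimal sketch of what such a my_mail_script could look like, untested
against a live cluster. It assumes the colon-separated record layout from
accounting(5), with the owner in field 4 and the job number in field 6,
and that mail(1) on the qmaster can deliver to the plain user name. The
function names and the "run" argument are made up for illustration.

```shell
#!/bin/sh
# Hypothetical my_mail_script: reads accounting records on stdin and
# mails the qacct summary for each record to the job owner.

# Assumed field positions per accounting(5): owner = 4, job number = 6.
job_owner() { printf '%s\n' "$1" | cut -d: -f4; }
job_id()    { printf '%s\n' "$1" | cut -d: -f6; }

mail_loop() {
    while IFS= read -r line; do
        case "$line" in
            '#'*|'') continue ;;   # skip comment and blank lines
        esac
        owner=$(job_owner "$line")
        id=$(job_id "$line")
        # Ignore records that could not be parsed.
        [ -n "$owner" ] && [ -n "$id" ] || continue
        # Assumption: "$owner" is a deliverable local mail address.
        qacct -j "$id" 2>&1 | mail -s "Job $id accounting summary" "$owner"
    done
}

# Start the loop only when asked to, so the helper functions can be
# sourced and tried by hand without blocking on stdin.
if [ "${1:-}" = "run" ]; then
    mail_loop
fi
```

With this variant the pipeline above would end in `| my_mail_script run &`.
Note that a job which writes several accounting records (array tasks, or
a PE without accounting_summary) would trigger one mail per record.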

-- Reuti

> =================================
> Mike Hanby
> mhanby at uab.edu
> Information Systems Specialist II
> IT HPCS / Research Computing
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=246100
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

