[GE users] Correct accounting with mpich-mx tight integration

Chris Rudge chris.rudge at astro.le.ac.uk
Thu May 10 17:49:22 BST 2007


Yes, I do mean the time reported by qacct, but I think that the problem
is actually in the way SGE does the accounting and not in the parallel
efficiency of the code.

For example, if I run a 4-CPU MPI job for 1 hour (walltime) and then do
	qacct -j <jobid>

What I'd expect to see is walltime = 1 hour and cputime <= 4 hours
(depending on parallel efficiency).

However, with the tight integration of mpich-mx, what I'd actually see is 5
individual sections in the qacct report for this job. All of them would have
a walltime of 1 hour; four would have a cputime of up to 1 hour, and the
fifth would have a cputime of 0. As far as I can tell, these sections are
accumulated into a total accounting record of
  walltime = 5 hours and cputime <= 4 hours
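
In other words, for the 4-slot, 1-hour job above the five sections add up
like this (all times in seconds):

	1 section  x 3600 wallclock, cputime 0
	4 sections x 3600 wallclock, cputime <= 3600 each
	-------------------------------------------------
	accumulated: 18000 (5 hours) wallclock, <= 14400 (4 hours) cputime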

I fully understand why it reports these times, but I can't see any reason
why reporting this as the walltime could possibly be considered the right
thing to do. I use PBSPro with mpich-gm on another cluster, and the walltime
reported on that system for the same job would be the expected 1 hour.

Regards,
Chris


> 
> you mean the accumulation in the qacct command? The problem with the
> wallclock will hit you in different ways: e.g. Gaussian does not
> compute all steps in parallel, although the slots are reserved for
> you in the cluster. A simple approach to deal with that is to use the
> number of granted slots multiplied by the wallclock time of the
> master job with a small script.
> 
> -- Reuti
> 
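
(For reference, a rough sketch of the kind of "small script" Reuti mentions
might look like the one below. It is only a sketch: it parses the qacct -j
output and charges granted slots x master wallclock instead of summing every
section's wallclock, and it assumes each section prints plain
"ru_wallclock <seconds>" and "slots <n>" lines; check the field names and
units on your own installation before relying on it. Taking the largest
ru_wallclock avoids depending on the order in which qacct prints the
sections.)

#!/bin/sh
# Sketch only: charge (granted slots) x (master wallclock) rather than
# the sum of all per-section wallclocks. The longest ru_wallclock is
# taken as the master's, the largest slots value as the granted count.
JOBID="$1"
qacct -j "$JOBID" | awk '
    $1 == "ru_wallclock" && $2 > wall  { wall  = $2 }
    $1 == "slots"        && $2 > slots { slots = $2 }
    END { printf "walltime to charge: %d s (%d slots x %d s)\n", wall * slots, slots, wall }
'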


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



