[GE users] cputime for parallel jobs does not reflect the number of cpus used

Nicolas Bock nbock at lanl.gov
Thu Mar 2 15:48:45 GMT 2006


On Mar 2, 2006, at 8:42 AM, Craig Tierney wrote:

> Sebastian Stark wrote:
>> On 02.03.2006, at 16:03, Rayson Ho wrote:
>>> Either follow Ron's suggestion [1] or you can use the wallclock time
>>> of the parallel job (wallclock * no. of CPU) for sharetree CPUtime
>>> calculation.
>> Which seems bad to me because wallclock also includes the time a  
>> job was suspended.
>>
>
> This is only an issue if jobs get suspended.  For the system I work  
> with, MPI jobs would die if they were suspended (can't suspend the
> open ports).

I agree with Craig. We don't suspend any jobs here: the cluster is
purely for heavy calculations, so suspending anything wouldn't make
sense. I would also prefer walltime*NCPU as the accounting measure. I
don't care whether a user runs efficient parallel code with a high CPU
percentage. What I care about is how long that user blocks the cluster
with his job.
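
To make the walltime*NCPU idea concrete, here is a minimal sketch in
Python. It is only an illustration, not how Grid Engine computes usage
internally: the function names, the parameters and the sample numbers
are all made up. The optional suspended_s argument is a hypothetical
way to address Sebastian's objection by subtracting time a job spent
suspended, assuming that figure is available at all.

```python
def slot_seconds(wallclock_s, ncpu, suspended_s=0.0):
    """Charge for reserved capacity: (wallclock - suspended) * slots.

    All names here are hypothetical; this is not a Grid Engine API.
    """
    if ncpu < 1:
        raise ValueError("ncpu must be at least 1")
    if not 0 <= suspended_s <= wallclock_s:
        raise ValueError("suspended_s must lie between 0 and wallclock_s")
    return (wallclock_s - suspended_s) * ncpu


def cpu_efficiency(cpu_s, wallclock_s, ncpu):
    """Fraction of the reserved slot time that was spent on the CPU."""
    reserved = slot_seconds(wallclock_s, ncpu)
    return cpu_s / reserved if reserved else 0.0


if __name__ == "__main__":
    # Made-up job: 8 slots held for one hour, 7200 s of CPU time reported.
    print(slot_seconds(3600, 8))          # charged slot-seconds
    print(cpu_efficiency(7200, 3600, 8))  # how busy the slots really were
```

With these numbers the job is charged 28800 slot-seconds even though it
only accumulated 7200 CPU-seconds, which is the point: the charge
follows what was blocked, not what was used.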

nick

