[GE users] Array job accounting bug fix?
jlb at salilab.org
Tue Apr 8 00:48:10 BST 2008
On Tue, 25 Mar 2008 at 2:01pm, Joshua Baker-LePain wrote
> On Fri, 14 Mar 2008 at 7:47am, Andreas Haas wrote
>> Am 13.03.2008 um 22:34 schrieb Joshua Baker-LePain:
>>> Does anyone know whether or not it will fix Issue 2298? This bug is
>>> really wreaking havoc on our fairshare setup (and our qmaster, as folks
>>> avoid the massive fairshare penalties by submitting hundreds/thousands of
>>> jobs instead of using an array job).
>> I looked into #2298, but was not able to find the root-cause for the
>> higher resource utilization on avg from array jobs. What I can say is that
>> already the total runtime according to accounting(5) is higher for array tasks
>> on average. Resource utilization is important, as it is ultimately the key
>> factor in the fairshare policy.
> I finally had the opportunity to do some testing on my production cluster.
> When submitting array jobs, I consistently see utilization numbers ~22%
> higher than the equivalent number of individually submitted jobs. I don't
> consider that slightly higher.
Further observation -- the CPU usage reported by qacct is essentially
equal for array jobs and equivalent numbers of individual jobs. In other
words, 'ltcpu' as reported by sge_share_mon differs from 'CPU' as reported
by qacct. Does that help narrow down where this bug might be?
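
For anyone who wants to repeat the qacct side of this comparison, here is a
minimal sketch of how the per-job CPU totals could be summed. It assumes the
text comes from something like `qacct -j`, with one whitespace-separated
"key value" pair per line and numeric values for the `cpu` field. The function
names `cpu_by_job` and `relative_excess` are made up for illustration and are
not part of any Grid Engine tool.

```python
from collections import defaultdict


def cpu_by_job(qacct_text):
    """Sum the 'cpu' field per job number from qacct-style text.

    Assumption: each accounting record lists 'jobnumber' before 'cpu',
    and array tasks share the job number of their parent array job.
    """
    totals = defaultdict(float)
    job = None
    for line in qacct_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # separator lines and blank lines carry no key/value pair
        key, value = parts
        if key == "jobnumber":
            job = value.strip()
        elif key == "cpu" and job is not None:
            totals[job] += float(value)
    return dict(totals)


def relative_excess(array_usage, individual_usage):
    """Fraction by which array-job usage exceeds individual-job usage."""
    return (array_usage - individual_usage) / individual_usage
```

Feeding the summed CPU of one array job and of the matching set of individual
jobs into `relative_excess` gives a value near zero when the two agree, as they
do here for qacct, whereas a gap like the one seen in the fairshare usage would
come out at roughly 0.22.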
Also, if I switch to the functional share policy, then array jobs are
scheduled with priority equal to that of individually submitted jobs.
Given the heavy usage of array jobs by some groups here (heartily
encouraged by me, of course, for the sake of the queue master) and
the severe penalties they incur for it, should I just plan a migration to
the functional share policy?
QB3 Shared Cluster Sysadmin