[GE users] array jobs mess up fair share ??

Chris Rudge chris.rudge at astro.le.ac.uk
Tue Jul 8 15:45:44 BST 2008


Andreas,
        
I assume that your specific question is about local spooling.
        
No, our spool directory is on the cluster head node. With only 66 nodes
(all with 4 cores) in the cluster and a low turnover of jobs I'd have
thought it unlikely that this causes a problem.
        
Clearly, in the example I gave, the discrepancy was so great that an
additional factor creeping into the cpu time calculation looks more
likely than i/o delays here and there.

I say this as it's not obvious to me why there should be greater i/o
delays for array tasks than for any other jobs in our situation. If we
had hundreds, or thousands, of very short array tasks and much smaller
numbers of longer "normal" jobs then I can see how this could be an
issue.

However, our usage pattern is that there are tens of array job tasks
and each task runs for around 1 day, so the throughput of array tasks is
not significantly greater than that of other jobs. Moreover, if NFS i/o
were a bottleneck for this low level of job throughput then I'm sure
we'd be seeing problems due to this with other routine tasks, as the SGE
spool is on /usr/local.

The fair share usage reported by sge_share_mon was something like 25-30
times what it should be for the 10 array tasks currently running. Note,
further, that no array tasks started or stopped during the 15 second
sampling interval.
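
To put a rough scale on that discrepancy, here is a minimal sketch of the
arithmetic. It assumes each array task uses a single core (the message does
not say so), and the reported figure of 4125 cpu-seconds is a hypothetical
value chosen only to fall inside the 25-30x range; it is not real
sge_share_mon output. The function names are illustrative as well.

```python
def expected_cpu_seconds(tasks, interval_s, cores_per_task=1):
    """Ceiling on cpu time that `tasks` running tasks can accrue in one interval.

    cores_per_task=1 is an assumption, not something stated in the thread.
    """
    return tasks * interval_s * cores_per_task


def inflation_factor(reported_cpu_s, tasks, interval_s, cores_per_task=1):
    """Ratio of the reported cpu time delta to the physically possible ceiling."""
    return reported_cpu_s / expected_cpu_seconds(tasks, interval_s, cores_per_task)


# 10 running array tasks over a 15 second sampling interval.
ceiling = expected_cpu_seconds(tasks=10, interval_s=15)

# Hypothetical reported delta, for illustration only.
factor = inflation_factor(reported_cpu_s=4125, tasks=10, interval_s=15)

print(ceiling, factor)
```

Under these assumptions the 10 tasks cannot accrue more than 150
cpu-seconds in one interval, so a factor of 25-30 would correspond to a
reported delta of roughly 3750-4500 cpu-seconds.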

Regards,
Chris



On Tue, 2008-07-08 at 15:02 +0200, Andreas.Haas at Sun.COM wrote:
> Hi Chris,
> 
> I know this phenomenon, but unfortunately I can't explain it.
> 
> Please find my question on your setup in
> 
>     http://gridengine.sunsource.net/issues/show_bug.cgi?id=2298
> 
> my hope is that you and I could jointly encircle the problem step-by-step.
> 
> Regards,
> Andreas
> 

-- 
Dr Chris Rudge
chris.rudge at astro.le.ac.uk

Research Computing Manager
Dept of Physics & Astronomy
University of Leicester
LE1 7RH

web.  www.ukaff.ac.uk
Tel.  +44 (0)116 2523331
Fax.  +44 (0)116 2231283
Mob.  +44 (0)794 1379420

