[GE users] How Many Resources Is Too Many?

craffi dag at sonsorol.org
Fri May 8 21:51:27 BST 2009

I know of a company that does this with SGE, though I'm not sure
whether they have thousands or hundreds of thousands of complex
entries and resources. I'll ask them if they are willing to speak
about their experiences.


On May 8, 2009, at 4:05 PM, templedf wrote:

> I'm working on a deep integration between Hadoop and SGE, and that
> requires SGE to be able to schedule against the HDFS data.  The most
> effective way to do that that I have come up with is to model the HDFS
> data blocks as boolean resources reported by the execd's.  Effective,
> but not efficient.  The problem is that this approach will result in at
> least one such resource for every file in the HDFS, more for large
> files.  For a large file system, that could mean 1000's, maybe 10's (or
> even 100's) of thousands, of resources, with each host being assigned
> 100's or 1000's.  Based on previous customer experiences, I'd say that's
> a really bad idea, but I thought I'd check to see what experience others
> have had with massive numbers of resources.  Anyone want to share?
> Anyone (Roland) want to suggest what the practical upper bound on number
> of resources should be?
> (Anyone want to suggest an alternative approach?  I have plans B through
> E, but I'm certainly open to input.)
> Thanks,
> Daniel
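
For a rough sense of scale, the resource counts described in the quoted
message can be estimated with a few lines of Python. This is only a
back-of-the-envelope sketch, not anything from the actual integration:
the function names are made up, and the 64 MB block size, replication
factor of 3, and host count are assumptions chosen for illustration.

```python
def count_block_resources(file_sizes, block_size=64 * 1024 * 1024):
    """Estimate the number of boolean resources, one per HDFS block.

    Every file contributes at least one resource; larger files
    contribute one per block. block_size is an assumed value.
    """
    total = 0
    for size in file_sizes:
        blocks = (size + block_size - 1) // block_size  # ceiling division
        total += max(1, blocks)
    return total


def avg_resources_per_host(total_blocks, replication, hosts):
    """Average number of block resources each host would report,
    assuming replicas are spread evenly across hosts (an assumption)."""
    return total_blocks * replication / hosts


if __name__ == "__main__":
    mb = 1024 * 1024
    sizes = [10 * mb, 64 * mb, 200 * mb]  # hypothetical file sizes
    print(count_block_resources(sizes))
    print(avg_resources_per_host(100_000, replication=3, hosts=200))
```

Under these assumptions, 100,000 blocks replicated three times across
200 hosts works out to about 1,500 resources per host, which falls in
the range the quoted message mentions.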

