[GE users] implementation opinion / suggestions

craffi dag at sonsorol.org
Thu Jan 28 11:43:19 GMT 2010

I can speak for a project I'm currently working on ...

jching wrote:
> Hi,
> We are currently in the process of planning for our next sge implementation and wanted to get the community's opinion on local bdb -vs- rpc bdb.  The setup will be ~2000 cores (500 nodes) with a combination of short and long jobs that will run in the queue<insert approximate # of jobs here>.
> After reviewing some of the valuable performance data provided by Mark Dixon in a previous post, it looks like there is a significant performance gain when running local bdb -vs- rpc/bdb but the rpc/bdb option gives us an additional failover option with the shadow master.  We would love to hear any opinions and/or experience people have... we also had a few questions for the large cluster (200+ nodes) community:
> 1. What is your implementation? (Local or Remote BDB w/ Shadow? Type of physical hardware?  Network?)

Classic spooling to a central NFS share; shadow master

No option for local spooling since nodes are diskless

HP dual-socket, quad-core Nehalem blades

Isilon Clustered NAS storage
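For anyone comparing the options mentioned above: the spooling method is chosen in the qmaster bootstrap file. A hedged sketch of what the two setups look like (paths are illustrative, not from this cluster):

```
# $SGE_ROOT/$SGE_CELL/common/bootstrap (illustrative paths)

# Classic spooling to a shared (e.g. NFS) directory:
spooling_method       classic
spooling_lib          libspoolc
spooling_params       /sge/default/common;/sge/default/spool/qmaster

# Local BDB spooling would instead look roughly like:
# spooling_method     berkeleydb
# spooling_lib        libspoolb
# spooling_params     /var/spool/sge/qmaster
```

With classic spooling the spool directory must be visible to any shadow-master candidate, which is why it pairs naturally with a shared NFS volume.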

> 2. How many nodes?

128 nodes / 1024 Nehalem cores*

    *system will double in size next week to 2048 cores

> 3. Types of jobs? (short or long period of runtime)

Mixture of both, but biased towards short jobs. Many tens of thousands of 
jobs per day are the norm; 100K jobs in a day is not unusual.
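As a rough sanity check on what that volume means for the qmaster (a back-of-the-envelope sketch; the 100K figure is from above, the even-spread assumption is mine):

```python
# Back-of-the-envelope sustained scheduling rate for a 100K-jobs/day load.
jobs_per_day = 100_000
seconds_per_day = 24 * 60 * 60  # 86400

# If jobs were spread evenly, this is the rate the qmaster must sustain;
# real submission is bursty, so peaks will be well above this.
sustained_rate = jobs_per_day / seconds_per_day
print(f"{sustained_rate:.2f} jobs/sec sustained")  # ~1.16 jobs/sec
```

Every one of those jobs implies spool writes on submit, dispatch, and completion, which is where the spooling backend's latency starts to matter.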

> 4. Any performance issues?

Some SGE/qmaster performance issues, particularly minor qstat delays 
when there are many thousands of active and pending jobs. This could be 
the qmaster hardware or an artifact of classic spooling; not sure.

I would never use classic spooling on a cluster this size if we did not 
have the Isilon storage gear. Every NFS client uses a different IP 
address to speak to the NFS server, and the load-balancing algorithm 
uses ARP calls to migrate the NFS IP endpoints between Isilon storage 
nodes as needed. The end result is that we can drive 24 GigE ports at 
wire speed while serving a single-namespace NFS volume. Works great.

I would not recommend a 1024- or 2048-core cluster with classic spooling 
and diskless nodes *unless* a really fast shared storage tier is in place.
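For the shadow-master piece mentioned at the top: the failover hosts are listed in a plain file, and takeover timing is tunable via environment variables for sge_shadowd. A sketch with illustrative hostnames and values (not our actual settings):

```
# $SGE_ROOT/$SGE_CELL/common/shadow_masters
# One hostname per line: the primary qmaster first, then shadow candidates.
qmaster01
qmaster02

# Illustrative takeover tuning (environment for sge_shadowd):
#   SGE_CHECK_INTERVAL=60        seconds between heartbeat-file checks
#   SGE_GET_ACTIVE_INTERVAL=240  seconds of silence before takeover is attempted
#   SGE_DELAY_TIME=600           delay before retrying a failed takeover
```

The heartbeat file lives in the shared spool area, which is another reason the shadow setup depends on all candidate hosts mounting the same storage.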

There is a secondary tier of NetApp storage for users, applications and 
result data.

> 5. Do you run DRBD?

Nope. Classic spooling.


To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
