[GE users] Using cycles from a 2nd SGE cluster
richard.hierlmeier at sun.com
Fri Jan 8 13:26:10 GMT 2010
I mostly agree with you if both clusters are geographically separated. You
cannot solve problems caused by physical constraints with a piece of software.
These problems will also occur for a single cluster with geographically
separated compute nodes.
However, for problems like different funding or fundamentally different
configuration of the clusters, I do not see any reason why multi-clustering
should not work.
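To make the overflow idea from the original question concrete, here is a minimal sketch of a submit wrapper, not a tested or recommended setup. It assumes the submit host is registered as a submit host in both clusters, that the second cluster's SGE_ROOT is reachable from it, and that filesystems and UIDs line up well enough for the job to run on either side. The threshold, the remote SGE_ROOT path, the queue name overflow.q and all function names are hypothetical. Only standard qstat and qsub options (-s p, -u, -q) are used.

```python
import os
import subprocess

# Hypothetical site values, not taken from any real installation.
LOCAL_PENDING_LIMIT = 50              # assumed meaning of "the local queue is full"
REMOTE_SGE_ROOT = "/opt/sge_remote"   # assumed SGE_ROOT of the second cluster
REMOTE_QUEUE = "overflow.q"           # assumed low-priority queue on the second cluster


def count_pending(qstat_output):
    """Count job lines in plain `qstat -s p` output.

    qstat prints a column header and a dashed separator before the jobs,
    and prints nothing at all when no job matches.
    A pending array job is counted once, since it appears as a single line.
    """
    lines = [ln for ln in qstat_output.splitlines() if ln.strip()]
    return max(0, len(lines) - 2)


def choose_target(pending, limit=LOCAL_PENDING_LIMIT):
    """Return 'remote' once the local pending count reaches the limit."""
    return "remote" if pending >= limit else "local"


def build_submit(target, script):
    """Build the qsub command line and environment for the chosen cluster."""
    cmd = ["qsub"]
    env = dict(os.environ)
    if target == "remote":
        # The SGE clients locate their qmaster through SGE_ROOT (and SGE_CELL),
        # so overriding it points this one qsub call at the second cluster.
        env["SGE_ROOT"] = REMOTE_SGE_ROOT
        cmd += ["-q", REMOTE_QUEUE]
    cmd.append(script)
    return cmd, env


def submit(script):
    """Query the local pending list, then submit locally or to the second cluster."""
    out = subprocess.run(
        ["qstat", "-s", "p", "-u", "*"],
        capture_output=True, text=True, check=True,
    ).stdout
    cmd, env = build_submit(choose_target(count_pending(out)), script)
    return subprocess.run(cmd, env=env, check=True)
```

Users would call submit("job.sh") in place of qsub. The decision is taken only once, at submit time, so a job that was sent to the local cluster does not move later. This also does nothing about the data staging and namespace problems described in the quoted mail below.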
> I see this a lot in my consulting work - the "multi cluster" request
> usually comes from top level management who've been reading far too much
> about grids and clouds. They just think it would be cool to "unify" the
> various HPC systems in the organization and blindly issue the order to
> look into it.
> I'll give you the cynical industry answer ...
> Yes it's technically possible via these methods:
> (1) transfer queues
> (2) suicidal rampage down the globus/meta-scheduler route
> (3) Sun SDM
> ... but I've personally never seen this really ever be successful in a
> commercial/industry production computing environment that is not
> academic in nature or funded by defense/sovereign nation dollars.
> The only working systems I've seen have been at academic sites with
> *tons* of sysadmin resources or toy/demo/playground setups purpose built
> for demonstration purposes.
> So the technical answer in my opinion is "yes" but the practical answer
> in real world environments is usually "no". It is 100x harder when the
> two systems are geographically separated or have separate filesystem and
> UID/GID namespaces as well. Just an utter nightmare and the level of
> abstraction and wrapping needed to get anything done removes any
> efficiencies gained.
> This is not the answer you want to hear but I'd recommend tackling the
> political problems first to see if they can be addressed.
> In the real world the most practical solution I've seen is that the two
> groups agree to keep operating separate systems but when the next
> upgrade/refresh period rolls around they get together, do some serious
> planning and then roll out a new single unified HPC system that everyone
> is happy to share.
> In other projects the clusters have been relocated or rearchitected to
> either share the same datacenter or at least the same identity server
> and subnets so that future collaboration is easier.
> From an IT or management perspective I also see a lot of cases where
> central IT will build a big new cluster from scratch in order to tease
> or lure the standalone cluster crowd onto their shared system. This can
> be a multi-year task but the end result is that if you build a better
> centralized resource and make it available you'll often be able to
> consolidate and retire the smaller systems without anger or political
> Just my $.02
> rhierlmeier wrote:
>> My sys admin has been trying to configure two independent, linux clusters with static SGE pools, such that when the first cluster batch queue fills, additional jobs will fall over to a low priority queue in the second cluster. Each cluster has its own master node, and it would be a political non-starter to change that. So far, my admin has not succeeded.
>> Is his configuration with static pools workable?
>> If so, we would welcome some guidance in configuring our SGE deployment to do this.
>> We are beginning to wonder whether this is undoable with static pools, and need to switch to a dynamic pool.
>> Input would be most welcome. Thanks! -Joe
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Richard Hierlmeier Phone: ++49 (0)941 3075-223
Software Engineering Fax: ++49 (0)941 3075-222
Sun Microsystems GmbH
Dr.-Leo-Ritter-Str. 7 mailto: richard.hierlmeier at sun.com
D-93049 Regensburg http://www.sun.com/grid
Sitz der Gesellschaft:
Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Thomas Schroeder, Wolfgang Engels, Dr. Roland Boemer
Vorsitzender des Aufsichtsrates: Martin Haering