[GE users] Using cycles from a 2nd SGE cluster

beatrubi beat at 0x1b.ch
Fri Jan 8 16:49:44 GMT 2010


Quoting <dag at sonsorol.org> (08.01.10 14:39):

> Multi-clustering works from the perspective of sysadmins, HPC designers
> and senior management.

For simple serial jobs, multicluster or grid environments may work quite
well. You have to make sure that the data is in place and that user
information is shared in a convenient way. Besides the SDM feature, there is
also the possibility of using transfer queues directly between the different
Grid Engine instances. There are some quite old examples on the web, but I
see no reason why they shouldn't work today.

The mess starts with parallel applications. A cluster typically means a
bunch of compute nodes with a high-speed interconnect. You can't distribute
the children of a parallel job over two independent clusters. You can't move
a node from one cluster to the other.

Another problem may be the data. Jobs processing large amounts of data may
spend more time copying the data from one system to the other than on the
calculation itself. You'll have to take the bandwidth between the systems and
the amount of data used by a typical job into account.
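
As a rough back-of-the-envelope sketch of that trade-off: the helper names
and the numbers below are made up for illustration, and the model assumes
that both clusters compute at the same speed and that copying results back
is negligible. Under those assumptions, sending a job to the remote cluster
only pays off if the copy takes less time than the job would wait in the
local queue.

```python
def transfer_seconds(data_gb, bandwidth_mbit_s):
    """Seconds needed to copy data_gb gigabytes over a bandwidth_mbit_s link."""
    # Decimal units: 1 GB = 8 * 1000 megabit. Protocol overhead is ignored.
    data_megabit = data_gb * 8 * 1000
    return data_megabit / bandwidth_mbit_s


def worth_offloading(data_gb, bandwidth_mbit_s, compute_seconds, local_wait_seconds):
    """True if copy + compute on the remote side beats waiting + compute locally."""
    # Assumption: same compute time on both clusters, no remote queue wait.
    remote_total = transfer_seconds(data_gb, bandwidth_mbit_s) + compute_seconds
    local_total = local_wait_seconds + compute_seconds
    return remote_total < local_total


if __name__ == "__main__":
    # Hypothetical job: 50 GB input, 10 min compute, 30 min local queue wait.
    for link in (100, 1000):
        copy = transfer_seconds(50, link)
        verdict = worth_offloading(50, link, 600, 1800)
        print(f"{link} Mbit/s: copy takes {copy:.0f} s, offload worthwhile: {verdict}")
```

With these made-up numbers the copy dominates on the slower link, so the
job is better off waiting locally.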

Finally, you have to make sure that all the libraries are available. I see a
lot of problems when moving applications between different Linux
distributions. RHEL4->RHEL5 and RHEL4->SLES10 work. RHEL5->SLES10 and
SLES10->RHEL5 typically give you a segfault. Keeping the installations of
all datacenters in sync is the major problem for large grids.
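
One cheap sanity check before sending a binary to another cluster is to run
`ldd` on it there and look for unresolved libraries. The sketch below is only
an illustration (the function names are hypothetical). It catches missing
shared objects, not the ABI mismatches that end in a segfault, so a clean
result does not prove that the binary will run.

```python
import subprocess
from typing import List


def missing_libraries(ldd_output: str) -> List[str]:
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # ldd prints unresolved dependencies as "libfoo.so.1 => not found".
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(path: str) -> List[str]:
    """Run ldd on the given binary and list its unresolved libraries."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True, check=False)
    return missing_libraries(result.stdout)
```

Running `check_binary` on each execution host of the remote cluster (for
example from a small test job) shows which nodes lack a library the
application needs.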

You probably have to look first at your application and the available
environment. As usual, the answer to the initial question is "it depends" :-)


     \|/                           Beat Rubischon <beat at 0x1b.ch>
   ( 0^0 )                             http://www.0x1b.ch/~beat/
My experiences, thoughts and dreams:    http://www.0x1b.ch/blog/

