[GE users] Moving from "proof-of-concept" to "production" installation
reuti at staff.uni-marburg.de
Fri Nov 9 16:39:34 GMT 2007
On 09.11.2007 at 17:04, GARDAIS Ionel wrote:
> we have been testing SGE for a year now to manage our serial and
> parallel jobs.
> As SGE meets most (not to say all) of our requirements, we will
> soon move from a "proof-of-concept" to a "production" installation.
> I mean: we are using a mix of workstations and dedicated computers
> as exec nodes with a single master server, everything in the same
> This is what I consider a "proof-of-concept" or "test" installation.
> The future installation will consist of dedicated nodes only, all
> connected to a dedicated switch, dual-powered (one feed on UPS, the
> other not) and so on.
> Master will be shadowed.
> Nodes will have two network cards.
> I plan to use different networks (not forwarded): one for storage
> access (NetApp) and the other for MPI communication.
> My question is: are there any known issues, common
> misconfigurations or common mistakes that you are aware of in this
> kind of setup?
To keep some parallel libs happy: they should use the first interface
(as SGE should do); NFS can then use the second one.
This way you don't have to tweak any parallel lib to use the second
interface, and you always get the correct pe_hostfile without modification.
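To illustrate why an unmodified pe_hostfile is convenient, here is a minimal sketch of how a PE start script might turn it into a machine list for an MPI launcher. It assumes the usual pe_hostfile layout of one line per host with the fields "hostname slots queue processor-range". The function names and the sample host names (node01, node02) are made up for this example and are not part of SGE.

```python
def parse_pe_hostfile(text):
    """Return a list of (hostname, slots) tuples from pe_hostfile content.

    Assumes each non-empty line looks like:
        hostname slots queue processor-range
    Only the first two fields are used here.
    """
    hosts = []
    for raw in text.splitlines():
        fields = raw.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        hosts.append((fields[0], int(fields[1])))
    return hosts


def expand_to_machinefile(hosts):
    """Repeat each hostname once per granted slot (one name per line)."""
    return [name for name, slots in hosts for _ in range(slots)]


if __name__ == "__main__":
    # Hypothetical content; in a real job it would be read from $PE_HOSTFILE.
    sample = (
        "node01 4 all.q@node01 UNDEFINED\n"
        "node02 2 all.q@node02 UNDEFINED\n"
    )
    print("\n".join(expand_to_machinefile(parse_pe_hostfile(sample))))
```

Because the host names in the file are the ones SGE itself uses (the first interface), they can be handed to the MPI library as they are; no rewriting to "-ib" or "-mpi" style names is needed in this layout.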
If the headnode has its main interface connected to the outside
world, you can use a host_aliases file in $SGE_ROOT/default/common to
instruct SGE to use the second or third interface on this machine for
its operation instead.
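A minimal sketch of such a host_aliases file, assuming the format described in the host_aliases(5) man page: one line per host, with the name SGE should use first, followed by the names that are mapped to it. Both host names below are hypothetical. "headnode-int" stands for the name that resolves to the internal (cluster-side) interface, and "headnode.example.com" for the name bound to the outside-facing interface.

```text
headnode-int headnode.example.com
```

With a line like this, SGE should treat the externally resolving name as an alias of the internal one, so the daemons and the exec nodes talk to the headnode over the cluster network. Changes to this file usually only take effect after the SGE daemons are restarted.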