[GE users] Moving from "proof-of-concept" to "production" installation

Reuti reuti at staff.uni-marburg.de
Fri Nov 9 16:39:34 GMT 2007


On 09.11.2007 at 17:04, GARDAIS Ionel wrote:
> we have been testing SGE for a year to manage our serial and
> parallel jobs.
> As SGE meets most (not to say all) of our requirements, we will  
> soon move from a "proof-of-concept" to a "production" installation.
> I mean : we are using a mix of workstations and dedicated computers  
> as exec nodes with a single master server, everything in the same  
> network.
> This is what I consider a "proof-of-concept" or "test" installation.
> The future installation will consist of dedicated nodes only, all
> connected to a dedicated switch, dual-powered (one supply on UPS, the
> other not) and so on.
> Master will be shadowed.
> Nodes will have two network cards.
> I plan to use different networks (not forwarded): one for storage
> access (NetApp) and the other for MPI communication.
> My question is: are there any known issues, common
> misconfigurations or common mistakes that you are aware of in this
> kind of setup?
To keep some parallel libraries happy, they should use the first interface
(as SGE should), and NFS can then use the second one.

This way you don't have to tweak any parallel library to use the second
interface, and you always get the correct pe_hostfile from SGE without
modification.
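
As a rough illustration of why an unmodified pe_hostfile is convenient: each
line of the file that SGE exposes to a parallel job via $PE_HOSTFILE lists a
host name, a slot count, a queue name and a processor range. A minimal sketch
of turning that into a one-host-per-slot machinefile could look like the
following. The function names, the sample host names and the sample queue are
made up for the example, and the machinefile layout your MPI expects may differ.

```python
# Sketch only: convert SGE pe_hostfile content into a one-host-per-slot list.
# Function names and the sample data are hypothetical, not part of SGE.

SAMPLE_PE_HOSTFILE = """\
node01 2 all.q@node01 UNDEFINED
node02 1 all.q@node02 UNDEFINED
"""


def parse_pe_hostfile(text):
    """Return a list of (hostname, slots) tuples from pe_hostfile content."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        entries.append((fields[0], int(fields[1])))
    return entries


def to_machinefile(entries):
    """Repeat each host name once per granted slot, one name per line."""
    return "\n".join(host for host, slots in entries for _ in range(slots))


if __name__ == "__main__":
    print(to_machinefile(parse_pe_hostfile(SAMPLE_PE_HOSTFILE)))
```

If the host names in pe_hostfile already resolve to the interface the parallel
library should use, the output can be handed over as is, with no name rewriting
step in between.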

If the head node has its main interface connected to the outside
world, you can use a host_aliases file in $SGE_ROOT/default/common to
instruct SGE to use the second or third interface on this machine for
its operation instead.
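
A sketch of what such a host_aliases entry might look like, assuming the usual
layout of one host per line, with the name SGE should use first and the other
names of that machine after it. Both host names here are invented for the
example (headnode-int for the internal cluster interface, headnode.example.org
for the outside interface). Please check the host_aliases man page of your
installation before relying on this.

```text
headnode-int headnode.example.org
```

With a line like this, SGE would treat the external name as an alias of the
internal one, so its own operation stays on the cluster-side interface.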


-- Reuti
