[GE users] Installing and Configuring SGE (as execute-host) on Diskless-Cluster Nodes

fx d.love at liverpool.ac.uk
Wed Apr 14 18:03:51 BST 2010



templedf <daniel.templeton at oracle.com> writes:

> Notice only adding the RC script requires modifying the local file 
> system.  So, as long as your cell/common directory is on a shared file 
> system, which is the assumed configuration, then all you really have to 
> worry about are the binaries and getting the RC script set up.  You 
> probably don't want to have the binaries on the shared file system, 
> especially if you're running large parallel jobs.  If you have a 
> smallish cluster, though, it might be OK.

I don't understand the advice I normally see against sharing SGE
across the nodes on what I think of as typical-sized clusters.  Sharing
seems much more manageable, and free of performance problems, in our
sort of situation.
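
If you want to confirm whether a node really is picking up SGE from a
shared file system, here is a minimal sketch, assuming a Linux node where
/proc/mounts is readable.  The function names, the default of /opt/sge
when SGE_ROOT is unset, and the list of file system types treated as
"shared" are my own illustrative choices, not anything SGE provides.

```python
import os

# File system types treated as "shared" here; an illustrative assumption,
# adjust for your site.
NETWORK_FS = {"nfs", "nfs4", "lustre", "gpfs", "cifs", "panfs"}


def fs_type_for(path, mounts_text):
    """Return (mount_point, fs_type) for the mount that contains `path`.

    `mounts_text` is text in /proc/mounts format (device, mount point,
    type, ...).  The longest matching mount point wins; on a tie the later
    line wins, since later mounts shadow earlier ones.  Mount points with
    escaped characters (e.g. spaces) are not handled.
    """
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best


def is_shared(path, mounts_text):
    """True if `path` lives on one of the NETWORK_FS file system types."""
    found = fs_type_for(path, mounts_text)
    return found is not None and found[1] in NETWORK_FS


if __name__ == "__main__":
    sge_root = os.environ.get("SGE_ROOT", "/opt/sge")  # assumed default
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as f:
            mounts = f.read()
        print(sge_root, fs_type_for(sge_root, mounts),
              "shared" if is_shared(sge_root, mounts) else "local")
```

Run on an execute host, this prints the mount point and file system type
backing SGE_ROOT, which is a quick way to tell which of the two setups
discussed above a given node is actually in.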

For reference, we have a stateless, roughly 100-node cluster with fairly low
throughput and parallel jobs of mainly about 10 nodes, with occasional larger
ones.  The node image is currently on the head via 1GbE, though it
should be on a Sun Thumper via 10GbE, like the SGE and OpenMPI
installations.  We haven't seen any problems with this setup that would
worry us.  If we do see them, I'll look at caching files on the nodes'
local disk.  There are some gotchas with the stateless nodes, but I'm
glad I moved from the previous stateful installation with just a shared
/opt/sge.  For what it's worth, it was originally set up with the Sun
HPC tools, but I dispensed with Cobbler and the clunky cluster database
stuff which glues it together.

By the way, I can supply an SGE SRPM built for a shared /opt/sge (unlike
the Fedora one), though I intended to tidy it up before advertising it.
The RPM in the Sun HPC tools isn't current and doesn't have a free
licence as far as I can tell.

-- 
Dave Love
E-Science, Computing Services Department, University of Liverpool
AKA fx at gnu.org

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=253409
