[GE users] Grid Installation for first time
Dan.Gruhn at Group-W-Inc.com
Thu Mar 10 15:31:52 GMT 2005
My only experience is with using NFS for $SGE_ROOT. If you cannot do that,
try reading this HowTo:
On Thu, 2005-03-10 at 10:25, Walter Faleiro wrote:
> Hello Dan,
> I have updated the services file on the host partition as well. Do I
> need to have a common $SGE_ROOT? I tried using an NFS partition
> mounted on the execution host as $SGE_ROOT, but it gave me permission
> errors while installing.
> My OS is Red Hat Linux 9, and the processor is an Intel Pentium 4 at 2.4 GHz.
> -----Original Message-----
> From: Dan Gruhn [mailto:Dan.Gruhn at Group-W-Inc.com]
> Sent: Thursday, March 10, 2005 7:18 AM
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Grid Installation for first time
> Did you update the services file on the "other host"? That
> is, the machine that is not the queue master?
> Also, what are your O/S, processor, etc.?
> On Thu, 2005-03-10 at 10:12, Walter Faleiro wrote:
> > Hello Grid Users,
> > I am trying to install Grid Engine 6.0u3 on two Linux machines. I
> > followed this procedure:
> > 1. Untar the grid-common and grid-bin files.
> > 2. Add
> >      sge_qmaster 536/tcp
> >      sge_execd 537/tcp
> >    to the services file.
> > 3. Run the install_qmaster and install_execd scripts.
> > My machine is configured as queue master, administrative
> > host, execution host and submit host.
> > I need some help running install_execd on the
> > other machine. When I run the script, it exits saying that the
> > qmaster installation is not done. Do I need to install the
> > qmaster on the execution hosts as well? I followed the
> > documentation in the Sun docs, and nowhere does it mention
> > installing the qmaster on the execution hosts.
> > Thanks, Walter.
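A quick way to confirm that the services-file step in the message above took effect on both machines is to look for the two entries programmatically. The following is only a minimal sketch: the function name `find_sge_services` is hypothetical, the sample text simply reuses the 536/tcp and 537/tcp entries quoted above, and on a real host you would pass in the contents of your own services file instead.

```python
def find_sge_services(services_text, wanted=("sge_qmaster", "sge_execd")):
    """Return {name: (port, protocol)} for the wanted entries in services-file text.

    Hypothetical helper for illustration; it is not part of Grid Engine.
    """
    found = {}
    for line in services_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue  # not a "name port/protocol" line
        name = fields[0]
        port, proto = fields[1].split("/", 1)
        if name in wanted and port.isdigit():
            found[name] = (int(port), proto)
    return found


# Sample text using the entries from the message above.
sample = """\
# local additions
sge_qmaster 536/tcp
sge_execd   537/tcp   # execution daemon
"""

entries = find_sge_services(sample)
missing = [name for name in ("sge_qmaster", "sge_execd") if name not in entries]
print("found:", entries)
print("missing:", missing)
```

Running the same check on the master and on the execution host, and comparing the results, would show whether the two machines agree on the port numbers.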
> > -----Original Message-----
> > From: McCalla, Mac [mailto:macmccalla at hess.com]
> > Sent: Thursday, March 10, 2005 6:43 AM
> > To: users at gridengine.sunsource.net
> > Subject: [GE users] sge v6.0u3 new installation
> > issue with more than 1021 hosts.
> > Hi folks,
> > First, my environment is all Red Hat EL WS or ES 3, on
> > dual Xeons.
> > I am moving my production grid from SGE 5.3p6 to SGE
> > 6.0u3. The 5.3 installation is supporting about
> > 900 hosts at this time.
> > The 6.0u3 system has been installed and running for
> > a couple of weeks now in test mode, supporting the
> > same 890 hosts, and seemed to be OK. I have been adding
> > some new hosts, as they become available, to only
> > the 6.0u3 system. Yesterday, when the number of
> > hosts actually connected by execd passed from 1021
> > to 1022, I noticed that qmaster stopped responding on port
> > 538 to any further requests from additional execds
> > or commands (qstat, qhost, etc.). The ulimit for fds is
> > set at 4096 at qmaster startup (the info message at qmaster
> > startup says qmaster will use 4076 file descriptors for
> > communication). Has anyone else seen this problem, or does
> > anyone have a 6.0u3 installation with more hosts?
> > Thanks in advance,
> > Mac McCalla
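One thing that may be worth checking for the 1021-to-1022 stall described above: 1021 connections plus three standard descriptors is 1024, which is the usual FD_SETSIZE on Linux, the most descriptors a select()-based loop can watch no matter how high ulimit is set. Whether qmaster is actually limited this way is a guess, not something the thread confirms. The sketch below only does the arithmetic; `connection_headroom`, the reserved count of 3, and the 1024 value are all assumptions for illustration.

```python
import resource

# Assumption: typical glibc FD_SETSIZE on Linux; not read from the system.
FD_SETSIZE_ASSUMED = 1024


def connection_headroom(open_connections, soft_limit,
                        reserved=3, fd_setsize=FD_SETSIZE_ASSUMED):
    """Estimate how many more connections fit under the tighter of two limits.

    reserved is an assumed count of descriptors not used for host
    connections (here just stdin, stdout, stderr). Hypothetical helper.
    """
    usable = min(soft_limit, fd_setsize) - reserved
    return usable - open_connections


# Per-process descriptor limit of the current shell/process (what ulimit -n reports).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft, "hard limit:", hard)

# Figures from the message above: ulimit 4096, stall between 1021 and 1022 hosts.
print("headroom at 1021 hosts:", connection_headroom(1021, 4096))
print("headroom at 1022 hosts:", connection_headroom(1022, 4096))
```

If the headroom under the assumed 1024 cap reaches zero at the host count where the stall appears, that would be consistent with a select()-style limit rather than the ulimit setting, but it does not prove it.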