[GE users] Grid Installation for first time

Dan Gruhn Dan.Gruhn at Group-W-Inc.com
Thu Mar 10 15:31:52 GMT 2005


My only experience is with using NFS for $SGE_ROOT.  If you cannot do
that, try reading this HowTo:

http://gridengine.sunsource.net/project/gridengine/howto/nfsreduce.html

Dan
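
One way to narrow down permission errors like the ones described in the quoted message below is to check, on the execution host, whether the installing user can create files under the NFS-mounted $SGE_ROOT. What follows is a minimal Python sketch, not part of Grid Engine. The function name `can_write_under` is hypothetical, and the sketch assumes the errors come from missing write access (for example, an NFS export that maps root to an unprivileged user), which this thread does not confirm.

```python
import os
import tempfile


def can_write_under(directory):
    """Return True if the current user can create a file in `directory`."""
    try:
        # Creating and removing a real file exercises the same permission
        # checks (including server-side NFS ones) that an installer would hit.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
        return True
    except OSError:
        # Covers a missing directory as well as a permission failure.
        return False


if __name__ == "__main__":
    # SGE_ROOT is read from the environment; no particular path is assumed.
    sge_root = os.environ.get("SGE_ROOT")
    if sge_root is None:
        print("SGE_ROOT is not set")
    else:
        print(sge_root, "writable:", can_write_under(sge_root))
```

Running it as the same user that runs the install script, on the host where the install fails, gives the most relevant answer.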

On Thu, 2005-03-10 at 10:25, Walter Faleiro wrote:

> Hello Dan,
> I have updated the services file on the host partition as well. Do I
> need to have a common $SGE_ROOT? I tried to have $SGE_ROOT as an NFS
> partition mounted on the execution host, but it gave me permission
> errors while installing.
>  
> My OS is Red Hat Linux 9, and the processor is an Intel Pentium 4 at 2.4 GHz.
>  
>  
> Thanks
> Walter
> 
>         -----Original Message-----
>         From: Dan Gruhn [mailto:Dan.Gruhn at Group-W-Inc.com]
>         Sent: Thursday, March 10, 2005 7:18 AM
>         To: users at gridengine.sunsource.net
>         Subject: Re: [GE users] Grid Installation for firs time
>         
>         
>         Walter,
>         
>         Did you update the services file on the "other host"?  That
>         is, the machine that is not the queue master?
>         
>         Also, what are your O/S, processor, etc.?
>         
>         Dan
>         
>         On Thu, 2005-03-10 at 10:12, Walter Faleiro wrote: 
>         
>         > Hello Grid Users,
>         > I am trying to install Grid 6.0u3 on two Linux machines. I
>         > followed this procedure:
>         >  
>         > Untar the grid-common and grid-bin files.
>         >  
>         > Add
>         >     sge_qmaster 536/tcp
>         >     sge_execd 537/tcp
>         >  
>         > to the services file.
>         >  
>         > Run the install_qmaster and install_execd scripts.
>         >  
>         > My machine is configured as queue master, administrative
>         > host, execution host and submit host.
>         >  
>         > I need some help running install_execd on the
>         > other machine. When I run the script, it exits saying the
>         > qmaster installation is not done. Do I need to install the
>         > qmaster on the execution hosts as well? I followed the
>         > documentation on the Sun docs site, and nowhere does it
>         > mention installing the qmaster on the execution hosts.
>         >  
>         >  
>         > Thanks Walter.
>         > 
>         >         -----Original Message-----
>         >         From: McCalla, Mac [mailto:macmccalla at hess.com]
>         >         Sent: Thursday, March 10, 2005 6:43 AM
>         >         To: users at gridengine.sunsource.net
>         >         Subject: [GE users] sge v6.0u3 new installation
>         >         issue with more than 1021 hosts.
>         >         
>         >         
>         >         
>         >         
>         >         Hi folks,
>         >         
>         >         First, my environment is all Red Hat EL WS or ES 3 on
>         >         dual Xeons. I am moving my production grid from SGE
>         >         5.3p6 to SGE 6.0u3. The 5.3 installation is supporting
>         >         about 900 hosts at this time.
>         >         
>         >         The 6.0u3 system has been installed and running for a
>         >         couple of weeks now in test mode, supporting the same
>         >         890 hosts, and seemed to be OK. I have been adding new
>         >         hosts, as they become available, to the 6.0u3 system
>         >         only. Yesterday, when the number of hosts actually
>         >         connected by execd passed from 1021 to 1022, I noticed
>         >         that qmaster stopped responding on port 538 to any
>         >         further requests from additional execds or commands
>         >         (qstat, qhost, etc.). The ulimit for file descriptors
>         >         is set to 4096 at qmaster startup (the info message at
>         >         qmaster startup says qmaster will use 4076 file
>         >         descriptors for communication). Has anyone else seen
>         >         this problem, or does anyone have a 6.0u3 installation
>         >         with more hosts?
>         >         
>         >         thanks in advance,
>         >         Mac McCalla
>         >         
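
The services-file step discussed in the quoted messages above can be checked mechanically on each host. The following is a minimal Python sketch, not a Grid Engine tool. The function name `find_service_ports` is hypothetical, and the sample text reuses the 536/537 entries from the quoted procedure. (A later quoted message mentions port 538, so the numbers evidently differ between sites; the two hosts of one installation would presumably need matching entries.)

```python
def find_service_ports(services_text, names):
    """Map each requested service name to (port, protocol), or None if absent."""
    found = {name: None for name in names}
    for line in services_text.splitlines():
        # Drop trailing comments and skip blank lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue
        port, proto = fields[1].split("/", 1)
        if not port.isdigit():
            continue
        # A services line may list aliases after the port/protocol field.
        for name in [fields[0]] + fields[2:]:
            if name in found and found[name] is None:
                found[name] = (int(port), proto)
    return found


if __name__ == "__main__":
    # Sample text only; on a real host, pass in the contents of the
    # services file instead.
    sample = "sge_qmaster 536/tcp\nsge_execd 537/tcp\n"
    for name, entry in find_service_ports(sample, ["sge_qmaster", "sge_execd"]).items():
        print(name, entry if entry else "missing")
```

Running the same check on the qmaster host and on the execution host, and comparing the results, shows whether either entry is missing or differs between the two machines.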

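
The file-descriptor limit mentioned in the last quoted message can be read from inside a process with Python's standard `resource` module (Unix only). The sketch below only reports the limits; it does not diagnose the stall at 1021 hosts. One possible explanation, which is an assumption not confirmed anywhere in this thread, is a fixed-size limit such as the `FD_SETSIZE` of 1024 that applies to `select()` on typical Linux systems, which raising the ulimit does not lift. The names `current_fd_limits`, `exceeds_fd_setsize`, and `ASSUMED_FD_SETSIZE` are hypothetical.

```python
import resource

# Assumption: 1024 is the usual FD_SETSIZE on Linux with glibc.
# Verify it for your own platform before relying on it.
ASSUMED_FD_SETSIZE = 1024


def current_fd_limits():
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def exceeds_fd_setsize(soft_limit, fd_setsize=ASSUMED_FD_SETSIZE):
    """Return True if the soft limit allows more descriptors than fd_setsize."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit > fd_setsize


if __name__ == "__main__":
    soft, hard = current_fd_limits()
    print("soft limit:", soft, "hard limit:", hard)
    if exceeds_fd_setsize(soft):
        print("soft limit is above the assumed FD_SETSIZE of", ASSUMED_FD_SETSIZE)
```

If the soft limit is well above 1024 but a daemon still stalls near that number of connections, that would be consistent with, though not proof of, a `select()`-style limit.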

