[GE users] Sun GridEngine execution hosts not finding config

Richard Hobbs richard.hobbs at crl.toshiba.co.uk
Thu Mar 30 12:31:08 BST 2006


Hello,

> -----Original Message-----
> From: Reuti [mailto:reuti at staff.uni-marburg.de] 
> Sent: 29 March 2006 22:19
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Sun GridEngine execution hosts not 
> finding config
> 
> Hi,
> 
> Am 29.03.2006 um 13:52 schrieb Richard Hobbs:
> 
> > Hello,
> >
> > We have a number of Sun GridEngine execution hosts with qmaster on a
> > separate machine. Most of the 44 hosts work perfectly, but a few of
> > them are reporting "local configuration host.domain not defined -
> > using global
> 
> do you need any local configuration at all? I remember this only for  
> 5.3, and as all nodes used the same configuration I simply ignored  
> this message, as it was intended to work this way.

No, I do not need any local configuration, and I have never set any up. The
reason I am concerned is that, after this message is printed to the screen,
"qmon" shows all queues on that host as red: not disabled and not in alarm,
just red, with no disabled/suspended/alarm status bar at the bottom of each
queue at all.

> Which SGE version are you using?

5.3p6

> What is qconf -sconfl saying?

'qconf -sconfl' reports a list of hosts, but the hosts that are reporting
"no local config" are not in that list. The "broken" hosts are on the
network, though: they can ping the qmaster and can mount the NFS export on
the qmaster that contains the SGE binaries, so the network doesn't seem to
be the issue.
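
One way to pin down exactly which hosts are affected would be to compare the
execution host list against the local configuration list. This is only a
minimal Python sketch: it assumes 'qconf -sel' lists the execution hosts on
this version and that both commands print one hostname per line. The function
names are my own and are not part of SGE.

```python
import subprocess


def parse_hosts(text):
    """Return the set of lower-cased hostnames found one per line in text."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def missing_local_configs(exec_hosts_text, conf_list_text):
    """List execution hosts that have no entry in the local config list."""
    return sorted(parse_hosts(exec_hosts_text) - parse_hosts(conf_list_text))


def run_qconf(flag):
    """Run qconf with a single flag and return its stdout (needs qconf on PATH)."""
    result = subprocess.run(["qconf", flag], stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return result.stdout
```

Run on a host with SGE set up, missing_local_configs(run_qconf("-sel"),
run_qconf("-sconfl")) should return the hosts that have no local
configuration entry, which could then be checked against the hosts that
show up red in qmon.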

Any ideas?

Thanks again,
Richard.


> -- Reuti
> 
> 
> > configuration" when starting up. Obviously, I've replaced the hostname
> > and domain with "host.domain" here, to protect our hostnames.
> >
> > Earlier today, nearly 20 of the hosts were reporting this, and the only
> > way to solve it was to reboot the qmaster machine.
> >
> > Restarting the qmaster daemon, or rebooting the execution hosts or
> > daemons, did nothing.
> >
> > Now that I have rebooted the qmaster, all hosts are fixed apart from one
> > or two.
> >
> > There is absolutely nothing in
> > $SGE_ROOT/default/spool/qmaster/messages at all, apart from the usual
> > starting up messages.
> >
> > Does anyone know what could be causing this?
> >
> > Thanks in advance :-)
> >
> > Richard.

-- 
Richard Hobbs (Systems Administrator)
Toshiba Research Europe Ltd. - Speech Technology Group
Web: http://www.toshiba-europe.com/research/
Normal Email: richard.hobbs at crl.toshiba.co.uk
Mobile Email: mobile at mongeese.co.uk
Tel: +44 1223 376964        Mobile: +44 7811 803377 



_____________________________________________________________________
This e-mail has been scanned for viruses by Verizon Business Internet Managed Scanning Services - powered by MessageLabs. For further information visit http://www.mci.com

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net
