[GE users] sge_qmaster - Error selecting configuration

Andy Schwierskott andy.schwierskott at sun.com
Tue Jul 20 15:24:23 BST 2004


Sam,

> I do not have a "CONFIG:{master hostname}" entry in the Berkeley DB running on
> my master host, and this causes the sge_qmaster daemon not to start. I noticed
> the 'spooldefaults' command in the $SGE_ROOT/utilbin/<arch>/ directory. It
> looks like it can reproduce the default CONFIG entry for my master host with
> the following command:
>
> $SGE_ROOT/utilbin/<arch>/spooldefaults local_conf <template> <name>
>
> Assuming "name" is the master's hostname, what "template" do I use? Does this
> command do what I think it does?
>
> Am I correct in assuming that the master host will have a different CONFIG entry
> in the Berkeley DB, than an execution host's CONFIG?

You need to dump an existing config, which you can then use as a template. My
example below was missing a ">" and must read as follows:

   # $SGE_ROOT/utilbin/<arch>/spooledit dump CONFIG:<hostname> > dump.config

This dumps the local config for host <hostname> to the file "dump.config".
This file has to be edited as I described and should be loaded with

   # $SGE_ROOT/utilbin/<arch>/spooledit load CONFIG:<master_host> dump.config

Where <master_host> is your qmaster hostname.

Qmaster and execd share the same local config!
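
The dump, edit, load sequence above can be wrapped in a small shell function.
This is only a sketch: the function name "copy_local_config" and the SPOOLEDIT
variable are made up for illustration, and SPOOLEDIT is assumed to hold the
full path to spooledit ($SGE_ROOT/utilbin/<arch>/spooledit on your system).
The edit itself is left to your editor, since the host name after
"CONF_hname" has to be changed by hand.

```shell
# Sketch only: copy_local_config and SPOOLEDIT are illustrative names,
# not part of Grid Engine. SPOOLEDIT must be set to the spooledit path.
copy_local_config() {
    src_host=$1                  # host whose CONFIG entry already exists
    dst_host=$2                  # name the new CONFIG entry is stored under
    dump_file=${3:-dump.config}  # file used for the dump and the edit

    if [ -z "$src_host" ] || [ -z "$dst_host" ]; then
        echo "usage: copy_local_config <src_host> <dst_host> [dump_file]" >&2
        return 2
    fi
    if [ -z "$SPOOLEDIT" ]; then
        echo "SPOOLEDIT is not set" >&2
        return 2
    fi

    # 1. Write the existing local config to a file.
    "$SPOOLEDIT" dump "CONFIG:$src_host" > "$dump_file" || return 1

    # 2. Edit the file by hand (change the host name after "CONF_hname").
    ${EDITOR:-vi} "$dump_file" || return 1

    # 3. Load the edited file under the new key.
    "$SPOOLEDIT" load "CONFIG:$dst_host" "$dump_file"
}
```

With SPOOLEDIT set, "copy_local_config <hostname> <master_host>" runs the two
spooledit commands shown above with the editor step in between. The function
stops if the dump or the editor fails, so nothing is loaded in that case.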

Andy

>
> Thanks for the help,
> Sam
>
>
> Quoting Andy Schwierskott <andy.schwierskott at sun.com>:
>
>> Sam,
>>
>> I assume you ran into a known bug with hostname resolving. However, you need
>> to find a short-term solution to be able to restart your master.
>>
>> Since you don't have a flat file anymore which you can rename/edit, you need
>> to use the new "spooledit" command:
>>
>> Here's the outline:
>>
>>     # $SGE_ROOT/utilbin/<arch>/spooledit list CONFIG
>>
>>         -> see all configs, there should be the short name config
>>
>>     # $SGE_ROOT/utilbin/<arch>/spooledit dump CONFIG:<master_short_name> > dump.config
>>
>>     --> now edit the file and change the short name after the second line
>>         after "CONF_hname" to the long name and save the file
>>
>>     With
>>
>>     # $SGE_ROOT/utilbin/<arch>/spooledit load CONFIG:<long_name> dump.config
>>
>>     you are loading the new config into BDB.
>>
>> Anyhow: Please answer my questions so that we can find out how this problem
>>           occurred.
>>
>> Andy
>>
>>> Sam,
>>>
>>>> I restarted my master host for the first time since installing SGE6.0.
>>>> When the box came back up, sge_qmaster was unable to start. I tried
>>>> starting it by executing the './sgemaster start' command in the
>>>> $SGE_ROOT/default/common/ directory. sgemaster again failed to start and
>>>> then sge_schedd started with an error about not being able to contact
>>>> sge_qmaster. sge_qmaster produced the following messages in the
>>>> $SGE_ROOT/default/spool/qmaster/messages file:
>>>>
>>>> 07/19/2004 09:30:37|qmaster|{the master's hostname}|E|Error selecting
>>>> configuration "{the master's full domain name}"
>>>> 07/19/2004 09:30:37|qmaster|{the master's hostname}|C|setup failed
>>>>
>>>> I'm using Berkeley spooling, and NIS+.
>>>>
>>>> Any idea what might be the cause of this error?
>>>
>>>    - What do you find in the "act_qmaster" file?
>>>    - What's the content of your "bootstrap" file?
>>>    - What's the output of
>>>
>>>            $SGE_ROOT/utilbin/<arch>/gethostname
>>>
>>> Andy
>>>
>>>> As always, thanks for your help.
>>>>
>>>> -Sam
>>>>
>>>> ----------------------------------------------------------------
>>>> This message was sent using IMP, the Internet Messaging Program.
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
