[GE users] SGE 6.2u2 Install Fails "Admin User Missing" on all hosts

mhanby mhanby at uab.edu
Thu Mar 26 17:11:40 GMT 2009


Yes, I ran the installer as the user root and I verified that with X
forwarding enabled, I was able to ssh to each of the compute nodes
without any errors or garbage echoed to the terminal.

It's strange that the Admin user lookup even failed on the qmaster.

Here is the output for the first command:
$ export ARCH=`$SGE_ROOT/util/arch`; echo $ARCH
$SGE_ROOT/utilbin/$ARCH/gethostbyname -all rocks5-test.eng.uab.edu

lx24-amd64 /share/sge/utilbin/lx24-amd64/gethostbyname -all
rocks5-test.eng.uab.edu

The RunAsAdmin() routine starts on line 26 in my script. Maybe I have an
older copy?

Here is the output on the qmaster
/share/sge/utilbin/lx24-amd64/adminrun sge echo
exit_code=0

And on the compute node
$ ssh compute-0-0 cat /tmp/check_test 
/share/sge/utilbin/lx24-amd64/adminrun sge test -w /share/sge
exit_code=0

Odd thing, this time the user lookup succeeded and both the qmaster and
exec host installed without error.

I realized just now that I hadn't changed the ownership of /share/sge
from root to sge prior to running the install the first time. I did
chown -R sge:sge /share/sge after the first install, so maybe that had
something to do with it?
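
For anyone who wants to rule this out before running the installer, here is
a minimal sketch of an ownership check. The recorded check above runs
"test -w /share/sge" as the admin user, so a root-owned tree could plausibly
make it fail, though that is not confirmed here. owned_by is a hypothetical
helper name, not part of SGE, and the path and user in the usage line are
simply the ones from this thread.

```shell
# Sketch: succeed if directory $1 is owned by user $2 (name or numeric uid).
# owned_by is a hypothetical helper, not an SGE utility.
owned_by() {
    dir=$1
    user=$2
    # find prints the directory itself only when its owner matches;
    # -prune stops find from descending into the tree.
    [ -n "$(find "$dir" -prune -user "$user" 2>/dev/null)" ]
}
```

Usage: owned_by /share/sge sge || echo "/share/sge is not owned by sge"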


-----Original Message-----
From: Lubomir.Petrik at sun.com [mailto:Lubomir.Petrik at sun.com] 
Sent: Thursday, March 26, 2009 11:46 AM
To: users at gridengine.sunsource.net
Subject: Re: [GE users] SGE 6.2u2 Install Fails "Admin User Missing" on
all hosts

Could you send me the output of the following (I don't need the IP):
export ARCH=`$SGE_ROOT/util/arch`; echo $ARCH
$SGE_ROOT/utilbin/$ARCH/gethostbyname -all rocks5-test.eng.uab.edu

I assume you started the installation as root and not as sge, right? 
Still don't understand why it reported missing admin users. If the 
adminrun as sge succeeded, it would not report it.

Could you maybe change the RunAsAdmin() at line 56 in 
$SGE_ROOT/util/gui-installer/templates/check_host to
RunAsAdmin()
{
   echo "${cfg.sge.root}"/utilbin/"${host.arch}"/adminrun $ADMINUSER $@ > /tmp/check_test
   "${cfg.sge.root}"/utilbin/"${host.arch}"/adminrun $ADMINUSER $@
   EXIT_CODE=$?
   echo "exit_code=$EXIT_CODE" >> /tmp/check_test
}

and try the installer again? No need to finish the installation. Just 
select execd only and add compute-0-0 node and click install. It should 
tell you again that the admin user is missing. Does /tmp/check_test 
on compute-0-0 contain the expected command, and did it succeed?
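
As a rough sketch of how the result could be read back, the helper below
prints the exit code recorded in a check_test file. last_exit_code is a
hypothetical name used only for illustration. It assumes the file has the
"exit_code=N" line written by the modified RunAsAdmin() above.

```shell
# Sketch: print the last exit code recorded in a check_test file.
# Reads the file named in $1, or standard input when no file is given.
# last_exit_code is a hypothetical helper, not part of the installer.
last_exit_code() {
    # Keep only the value after "exit_code=", then take the final one.
    sed -n 's/^exit_code=//p' "$@" | tail -n 1
}
```

Usage: ssh compute-0-0 cat /tmp/check_test | last_exit_code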

Thanks,
   Lubos.

mhanby wrote:
> Yes, the "ignore domain names" was selected (the default).
>
> $ qconf -sconfl
> compute-0-0.local
> compute-0-1.local
> compute-0-2.local
> compute-0-3.local
> rocks5-test.eng.uab.edu
>
> In the GUI the qmaster was listed as rocks5-test.eng.uab.edu and looking
> through the install summary that I saved, the qmaster name is always the
> fully qualified host name.
>   
That is correct. SGE works with it but then ignores the domain when 
comparing hosts.
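
To illustrate what "ignores the domain" means for the two names in this
thread, here is a small sketch that compares only the part before the first
dot. same_host is a hypothetical helper; SGE's actual comparison may differ
in details such as case handling.

```shell
# Sketch only: compare two host names while ignoring the domain part.
# same_host is a hypothetical helper, not how SGE itself is implemented.
same_host() {
    a=${1%%.*}   # strip everything from the first dot onward
    b=${2%%.*}
    [ "$a" = "$b" ]
}
```

Usage: same_host rocks5-test.local rocks5-test.eng.uab.edu && echo "same host"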
> -----Original Message-----
> From: Lubomir.Petrik at sun.com [mailto:Lubomir.Petrik at sun.com] 
> Sent: Thursday, March 26, 2009 11:12 AM
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] SGE 6.2u2 Install Fails "Admin User Missing" on
> all hosts
>
> mhanby wrote:
>   
>> Interesting, after rearranging the code as shown below, it reveals that
>> my qmaster is not in the act_qmaster file.
>>
>> Looking in the act_qmaster file, it has one entry:
>>
>> rocks5-test.local
>>
>> Now if I run the command that the init.d script uses to get the
>> hostname, it returns the fully qualified name:
>>
>> $utilbin_dir/gethostname -aname
>> rocks5-test.eng.uab.edu
>>   
>>     
> Did you uncheck the ignore domain names during the installation? What 
> was the displayed name in the host table (host selection panel)? The GUI
> installer uses Java to resolve the hosts in order to be fast. I expect
> that rocks5-test.local was displayed in your case and we'll have to find
> out why it differs. Thanks for reporting.
> BTW: Check that qconf -sconfl also lists rocks5-test.local instead of 
> rocks5-test.eng.uab.edu. If you left ignore domain names checked, you 
> have nothing to worry about, since the hosts are treated as the same
> host.
>   
>> Possibly the installer is using a different method to determine the name
>> of the qmaster?
>>
>> Changing the act_qmaster entry to the fully qualified name got sgemaster
>> to start on the qmaster host.
>>
>> Mike
>>
>> -----Original Message-----
>> From: mhanby [mailto:mhanby at uab.edu] 
>> Sent: Thursday, March 26, 2009 10:45 AM
>> To: users at gridengine.sunsource.net
>> Subject: RE: [GE users] SGE 6.2u2 Install Fails "Admin User Missing" on
>> all hosts
>>
>> After install I'm not able to start the qmaster, the init.d script
>> doesn't produce any output.
>> The command you asked me to run returns 0 (after loading the settings.sh
>> to my profile).
>>
>> Looking through the startup routine, there's a block of code that
>> appears to be backwards:
>>   
>>     
> You're right :-)
>
> Lubos.
>
> ------------------------------------------------------
>
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=144101
>
> To unsubscribe from this discussion, e-mail:
> [users-unsubscribe at gridengine.sunsource.net].
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=144153

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


