[GE users] 6.2 beta 2 on 10.5 Leopard: admin_user does not exist

Chris Dagdigian dag at sonsorol.org
Tue Jul 1 22:15:53 BST 2008


Now I'm in the land of the really bizarre:

I disabled Open Directory and manually recreated the sgeadmin user via  
Workgroup Manager with the same UID and then rebooted the compute nodes.

Jobs submit and run fine on the head node but fail on any compute node:

error reason    1:          07/01/2008 16:10:47 [1026:212]: admin_user "sgeadmin" does not exist

So now I'm having non-LDAP issues with SGE on recent versions of OS X
Server. This was an SGE 6.2beta2 test. Time to go back and see how SGE
6.0 and 6.1 behave ...

> xxx-host01:~ sgeadmin$ id
> uid=1026(sgeadmin) gid=20(staff) groups=20(staff),80(admin)
> xxx-host01:~ sgeadmin$
> xxx-host01:~ sgeadmin$ finger sgeadmin
> Login: sgeadmin       			Name: sgeadmin
> Directory: /common/home/sgeadmin    	Shell: /bin/bash
> On since Tue Jul  1 16:11 (CDT) on ttys000
>     from xxx-gateway01.managed.xxx.com
> No Mail.
> No Plan.
> xxx-host01:~ sgeadmin$
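
The mismatch above (the shell resolves "sgeadmin" while the job reports that it does not exist) can be checked from a script on each compute node. What follows is a minimal sketch, not anything shipped with SGE: it assumes Python is available on the nodes, that the bracketed pair in the error reason is uid:pid (the 1026 matching the `id` output suggests this, but it is an assumption), and that the failing lookup is an ordinary password-database query of the kind Python's `pwd` module performs. The helper names are hypothetical.

```python
import pwd
import re

# Assumed layout of the error text: [<uid>:<pid>]: admin_user "<name>" does not exist
ERROR_RE = re.compile(r'\[(\d+):(\d+)\]:\s+admin_user\s+"([^"]+)"\s+does not exist')


def parse_admin_user_error(text):
    """Return (uid, pid, name) from an error reason string, or None if it does not match."""
    match = ERROR_RE.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3)


def describe(entry):
    """Reduce a pwd entry to the fields that matter for this comparison."""
    return {
        "name": entry.pw_name,
        "uid": entry.pw_uid,
        "gid": entry.pw_gid,
        "home": entry.pw_dir,
        "shell": entry.pw_shell,
    }


def lookup_by_name(name):
    """Resolve a user name through the system password database; None if unknown."""
    try:
        return describe(pwd.getpwnam(name))
    except KeyError:
        return None


def lookup_by_uid(uid):
    """Resolve a numeric uid through the system password database; None if unknown."""
    try:
        return describe(pwd.getpwuid(uid))
    except KeyError:
        return None


if __name__ == "__main__":
    reason = '07/01/2008 16:10:47 [1026:212]: admin_user "sgeadmin" does not exist'
    parsed = parse_admin_user_error(reason)
    if parsed is not None:
        uid, pid, name = parsed
        print("by name:", lookup_by_name(name))
        print("by uid: ", lookup_by_uid(uid))
```

Running this on the head node and on a failing compute node, both from a login shell and from a context without a login session (for example a cron job), would show whether the name resolves everywhere or only interactively.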


-Chris





On Jul 1, 2008, at 3:06 PM, Sean Davis wrote:

> On Tue, Jul 1, 2008 at 2:48 PM, Chris Dagdigian <dag at sonsorol.org> wrote:
>>
>> Still testing but it does not look promising:
>>
>> - SGE 6.1u4 and SGE 6.2beta2 can't seem to handle LDAP users on OS X
>
> I'm using 6.2beta2 on three Macs with an LDAP from our debian-based
> server and not having any authentication problems.  It does not appear
> to be a general LDAP/OSX issue, as it works for us.
>
> Sean
>
>> - Building from 6.1 source and patching as described in the
>> gridengine.info article no longer works
>>
>>
>> Right now I'm experimenting with appending LDAP user info into
>> /etc/passwd (failed with 'can't get password entry' so far) and maybe
>> even creating a fake /etc/shadow -- if that works I can easily make a
>> script that will scrape the LDAP users and generate the necessary
>> files. All because I don't want to back out of using Open Directory.
>>
>> I'm really missing the 'getent' and nscd tools one usually finds on
>> Linux systems here.
>>
>> -Chris
>>
>>
>>
>>
>> On Jul 1, 2008, at 2:33 PM, Ian Levesque wrote:
>>
>>> Hi Chris,
>>>
>>> I assume you're getting this error with the released (unpatched)
>>> 6.1u4? Do you know if your patches were merged upstream? Is the
>>> patched 6.1u3 working reliably? I'm just trying to find a solution
>>> for this cluster; at this time I'd be willing to use an older
>>> version, so long as it works.
>>>
>>> Cheers,
>>> Ian
>>>
>>>
>>> On Jul 1, 2008, at 1:30 PM, Chris Dagdigian wrote:
>>>
>>>> I just deployed SGE 6.1u4 today on a new Apple cluster and jobs are
>>>> failing with the:
>>>>
>>>>> "can't get password entry for user "cdagdigian". Either the user does not exist or NIS error!"
>>>>
>>>> .. the interesting thing is that jobs fail for the "cdagdigian"
>>>> account which is local to the systems and they also fail for the
>>>> "sgeadmin" account which is defined in LDAP/OpenDirectory
>>>>
>>>> I'm going to do some more debugging but the fact that jobs fail for
>>>> both local and LDAP accounts is concerning. I may build 6.1u4 from
>>>> source to see if any behavior changes.
>>>>
>>>> -Chris
>>>>
>>>>
>>>>
>>>>
>>>> On Jul 1, 2008, at 1:16 PM, Ian Levesque wrote:
>>>>
>>>>> Hi all -
>>>>>
>>>>> After reading through the list and the helpful notes on
>>>>> gridengine.info, I saw that there were some caveats regarding the
>>>>> install of GE 6.1 on an OS X 10.5 cluster using OD for auth [1]. But
>>>>> on wikis.sun.com I saw that 10.5 is supposedly newly supported in
>>>>> 6.2 [2]. So I downloaded and installed the 6.2b2 build on
>>>>> sunsource.net. The problem that I'm having is very similar to the
>>>>> one mentioned on this list back in February for version 6.1u3. These
>>>>> Intel 10.5 execution nodes are entering an error state (often after
>>>>> running jobs successfully several times) with qmaster reporting:
>>>>>
>>>>> 07/01/2008 12:45:22|worker|starbuck|W|job 37.1 failed on host gaeta.mcb.harvard.edu general before prolog because: 07/01/2008 12:45:21 [501:33477]: admin_user "admin" does not exist
>>>>>
>>>>> Notice that I even tried configuring GE to use my local "admin" user
>>>>> when the network account "sgeadmin" exhibited this problem.
>>>>>
>>>>> The question I have is whether the 6.2 betas include the fixes that
>>>>> Chris posted on his blog, and if not why Sun is claiming support in
>>>>> 6.2 of OS X 10.5 when it's clearly not ready.
>>>>>
>>>>> Thanks,
>>>>> Ian
>>>>>
>>>>>
>>>>> [1]
>>>>> http://gridengine.info/articles/2008/03/03/building-6-1u3-on-mac-osx-10-5-2-leopard-server
>>>>> [2]
>>>>> http://wikis.sun.com/display/GridEngine/Important+Information+for+Sun+Grid+Engine+6.2
>>>>>
>>>>>
>>>>> * * * *
>>>>> Ian Levesque
>>>>> Research Systems Architect
>>>>> Harvard Medical School
>>>>> Structural Biology Grid
>>>>> http://www.sbgrid.org
>>>>> 617.432.5608
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net



