[GE users] SGE on latest Mac OS X Server 10.5.4 - help with non-root users

Ian Levesque ian at crystal.harvard.edu
Tue Jul 8 14:18:55 BST 2008


Hi Chris,

I posted to the list about this problem recently; you should see the
thread in the archives. I created a bug report on sunsource if you'd
like to add your observations: http://gridengine.sunsource.net/issues/show_bug.cgi?id=2636

Cheers,
Ian


On Jul 3, 2008, at 5:45 PM, Chris Dagdigian wrote:

> Hi folks,
>
> Skip this message if you don't want to be overwhelmed with SGE debug  
> output ...
>
> I've got a brand new OS X Apple cluster running the 10.5.4 server  
> release that only came out a few days ago.
>
> Right from the beginning I had "can't get password entry for  
> user..." errors so I stripped the system down to the bare essentials:
>
> - No open directory / LDAP
> - No NFS
> - All user accounts local
> - All user accounts using UIDs less than 1024
>
> My test account 'dag' is local and all system commands like 'id',  
> 'finger' and even the OS X command line commands like 'dscl' all  
> resolve the account info perfectly fine. The system search path is  
> correct as well - pointing at /Local/Default and no LDAP servers.
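
One more check that can be run alongside 'id' and 'dscl' is a direct password-database lookup from a script. This is only a minimal sketch, assuming Python 3 is available on the node and that the SGE error comes from a getpwnam-style query (the thread does not confirm that). The helper name lookup_user is illustrative and not part of SGE:

```python
import pwd
import sys


def lookup_user(name):
    """Return (uid, gid, home) for name, or None if no password entry exists."""
    try:
        entry = pwd.getpwnam(name)  # raises KeyError when the user is unknown
    except KeyError:
        return None
    return entry.pw_uid, entry.pw_gid, entry.pw_dir


if __name__ == "__main__":
    # Check every user named on the command line; default to root.
    for user in sys.argv[1:] or ["root"]:
        print(user, lookup_user(user))
```

If this returns an entry for 'dag' while SGE still reports the error, the account data itself is probably fine and the difference lies in how or where SGE performs its lookup.
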
>
> Even in a single-node, no-NFS, no-LDAP environment I still can't get  
> SGE 6.0, 6.1 or 6.2beta2 to function for non-root users.
>
> With courtesy binaries, "qrsh hostname" will hang forever and the  
> qmaster logs will simply show the same old "can't get password entry  
> for user "dag". Either the user does not exist or NIS error!" error.
>
> If I take the SGE 6.1 source code and patch it according to the blog  
> article here:
> http://gridengine.info/articles/2008/03/03/building-6-1u3-on-mac-osx-10-5-2-leopard-server
>
> ... then it still does not work, but at least I get the "can't get  
> password entry" error coming to STDOUT instead of hanging the qrsh  
> process.
>
> What is pretty interesting, though, is what happens if I run "qrsh  
> hostname" with debug mode turned on, using the patched binaries.
>
> It seems that some parts of SGE are able to resolve my username and  
> UID just fine and other parts (qrsh starter perhaps) are not able to.
>
> Cutting from the verbose output, this is the interesting bit:
>
>>   163   8332 -1602449504     qlogin_starter sent: 1:can't get password entry for user "dag". Either the user does not exist or NIS error!
>>   164   8332 -1602449504     ../clients/qsh/qsh.c 890 1: can't get password entry for user "dag". Either the user does not exist or NIS error!
>>
>>   165   8332 -1602449504     sge_set_auth_info: username(uid) = dag(511), groupname = staff(20)
>
>
> So sge_set_auth_info correctly resolves my non-root user and treats  
> it as if it exists, yet right above that line is the "you don't  
> exist" error message ...
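
To find this kind of contradiction in a long debug trace without reading it by eye, the output can be scanned for users that appear both in a "can't get password entry" message and in a sge_set_auth_info line. A rough sketch, assuming each debug record sits on a single line in the saved file (the mail client wrapped them above); the function name and regular expressions are my own and are only matched against the lines quoted in this message:

```python
import re

# Patterns modelled on the two message shapes quoted in this thread.
AUTH_RE = re.compile(
    r'username\(uid\) = (\S+?)\((\d+)\), groupname = (\S+?)\((\d+)\)'
)
FAIL_RE = re.compile(r'can\'t get password entry for user "([^"]+)"')


def contradictory_users(lines):
    """Map each user that both failed lookup and was resolved to (uid, group, gid)."""
    resolved = {}
    failed = set()
    for line in lines:
        auth = AUTH_RE.search(line)
        if auth:
            resolved[auth.group(1)] = (
                int(auth.group(2)), auth.group(3), int(auth.group(4))
            )
        fail = FAIL_RE.search(line)
        if fail:
            failed.add(fail.group(1))
    return {user: resolved[user] for user in failed if user in resolved}


if __name__ == "__main__":
    import sys
    # Usage: python script.py sge-error.txt
    with open(sys.argv[1]) as handle:
        print(contradictory_users(handle))
```

Any user it reports was resolved by one part of the trace and rejected by another, which narrows down where to look next.
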
>
>
> I'm going to attach a text file with the full debug output from a  
> "qrsh hostname" command below. I'm hoping someone will have some  
> pointers or insights as to how to keep on troubleshooting this ...
>
> Regards,
> Chris
>
> <sge-error.txt>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net





