[GE users] Troubleshooting NIS errors (SGE 6.1u3 / Linux)

Mulley, Nikhil Nikhil.Mulley at deshaw.com
Fri Jan 11 07:27:13 GMT 2008


What do the hosts and services entries look like in nsswitch.conf?

Something like this?
hosts:      files nis dns
services:   files nis
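
If it helps to compare the lookup order across many nodes, here is a minimal Python sketch that reads nsswitch.conf-style text and reports the source order per database. The function name parse_nsswitch is made up for illustration, and the sketch assumes simple entries like the ones above (bracketed action items such as [NOTFOUND=return] are kept as plain tokens).

```python
def parse_nsswitch(text):
    """Map each database name to its ordered list of lookup sources."""
    order = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line or ":" not in line:
            continue
        database, sources = line.split(":", 1)
        order[database.strip()] = sources.split()
    return order


# Sample text mirroring the entries quoted above.
sample = """\
hosts:      files nis dns
services:   files nis
"""
print(parse_nsswitch(sample))
```

On a real node the text would come from /etc/nsswitch.conf instead of the sample string.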

-----Original Message-----
From: Chris Dagdigian [mailto:dag at sonsorol.org] 
Sent: Friday, January 11, 2008 5:33 AM
To: users at gridengine.sunsource.net
Subject: Re: [GE users] Troubleshooting NIS errors (SGE 6.1u3 / Linux)


Ken, Chansup, Reuti -- thanks for all your help

We are still confused but found a workaround. All of your help, tips,
and suggestions greatly helped us with our local sanity checks.

This is a short list of all the tests/tweaks that we tried without success:

  - Altering the order of "files" and "nis" in /etc/nsswitch.conf has no
effect (cluster-wide)
  - Firewall on/off has no effect (cluster-wide)
  - NIS users are only in a few groups (fewer than 4)
  - All ypcat programs work as one would expect (cluster-wide)
  - Adding and removing "+::::::" to local auth files has no effect  
(cluster-wide)
  - All SGE utilbin test programs like "checkuser" and "uidgid" work  
as expected for NIS users (cluster-wide)
  - Linux 'id' program works properly for NIS users (cluster-wide)
  - Linux 'getent' program can query authinfo for NIS users
(cluster-wide)
  - Uncommenting a few lines in ypserv.conf concerning shadow-like  
passwords will segfault qsub and qrsh (2nd hand info)
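
For anyone repeating these checks, the same kind of lookup that 'id' and 'getent' perform can be sketched in Python: the standard pwd module calls the C library's getpwnam(), which goes through the name service switch. This is only an illustrative sanity check, not one of the SGE utilbin programs, and lookup_user is a made-up helper name.

```python
import pwd
import sys


def lookup_user(name):
    """Return the passwd entry for name via NSS, or None if it cannot be resolved."""
    try:
        return pwd.getpwnam(name)
    except KeyError:
        return None


if __name__ == "__main__":
    # Usage: pass one or more user names on the command line.
    for name in sys.argv[1:]:
        entry = lookup_user(name)
        if entry is None:
            print(f"{name}: not found")
        else:
            print(f"{name}: uid={entry.pw_uid} gid={entry.pw_gid} home={entry.pw_dir}")
```

A NIS user that resolves here but still fails inside SGE would match the behaviour described in this thread.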

Our workaround came from this list message:

https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2007-August/026684.html

We are basically using "getent" now to query the passwd info from NIS  
and build local /etc/passwd|shadow|group files that represent local  
and NIS users. Very similar to things I've done in the past involving  
replicating these files out to cluster nodes after creating a user on  
the master system.
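
A minimal Python sketch of the merge idea, for illustration only: it is not the script from the linked message or the one we actually run. The helper names getent_lines and merge_entries and the sample user "alice" are hypothetical. It keeps every local entry, appends entries that only exist in NIS, and drops the "+"/"-" compat markers, which have no place in a flat file.

```python
import subprocess


def getent_lines(database):
    """Return the lines printed by `getent <database>` (e.g. "passwd" or "group")."""
    result = subprocess.run(
        ["getent", database], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


def merge_entries(local_lines, nss_lines):
    """Keep local entries first, then add NSS entries whose name is not yet present."""
    merged, seen = [], set()
    for line in list(local_lines) + list(nss_lines):
        line = line.rstrip("\n")
        # Skip blanks, comments, and NIS compat markers such as "+::::::".
        if not line or line.startswith(("#", "+", "-")):
            continue
        name = line.split(":", 1)[0]
        if name not in seen:
            seen.add(name)
            merged.append(line)
    return merged


# In-memory demonstration with made-up data; nothing under /etc is touched.
local = ["root:x:0:0:root:/root:/bin/bash", "+::::::"]
nss = [
    "root:x:0:0:root:/root:/bin/bash",
    "alice:x:5001:100:Alice:/home/alice:/bin/bash",
]
for line in merge_entries(local, nss):
    print(line)
```

On a real system the second argument would be getent_lines("passwd"), and the result would be written to a staging file and reviewed before replacing anything in /etc.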

The basic situation remains:

  - Every test we run on Linux querying the state/status of NIS works  
perfectly fine - we have run out of things to test

  - Any SGE action involving a NIS user fails with a "can't get  
password" type error

With our getent hack we are past the problem. This is still
interesting (and confusing!) so I may try to replicate on one of our  
spare clusters on our own time.

Thanks again for the quick replies!

-Chris






On Jan 10, 2008, at 6:40 PM, Reuti wrote:

> Am 10.01.2008 um 21:34 schrieb Chris Dagdigian:
>
>>
>> On Jan 10, 2008, at 3:22 PM, Ken Tang wrote:
>>
>>> What does /etc/nsswitch show?  Is NIS first in the list for  
>>> passwd, shadow, and groups? Any firewalls up and running?  Is the  
>>> NIS service running?  I believe it is ypbind
>>
>> Thanks Ken,
>>
>> NIS is 2nd in /etc/nsswitch.conf:
>>
>>> passwd: files nis
>>> shadow: files nis
>>> group: files nis
>
> The last line in passwd on the nodes is:
>
> +::::::
>
> ?
>
> -- Reuti
>

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net
