Opened 9 years ago

Last modified 7 years ago

#772 new defect

IZ3230: "qsub" crashes when user information is unknown

Reported by: klaprade Owned by:
Priority: normal Milestone:
Component: sge Version: 6.1u6
Severity: minor Keywords: Sun Solaris clients patch


[Imported from gridengine issuezilla]

        Issue #:      3230             Platform:     Sun       Reporter: klaprade (klaprade)
       Component:     gridengine          OS:        Solaris
     Subcomponent:    clients          Version:      6.1u6        CC:    None defined
        Status:       NEW              Priority:     P3
      Resolution:                     Issue type:    DEFECT
                                   Target milestone: ---
      Assigned to:    roland (roland)
      QA Contact:     roland
       * Summary:     "qsub" crashes when user information is unknown
   Status whiteboard:

     Issue 3230 blocks:
   Votes for issue 3230:

   Opened: Mon Jan 25 15:24:00 -0700 2010 

We are using SGE 6.1u7.

Due to a problem with an LDAP server, "qsub" would occasionally core dump in
"sge_gdi_ctx_setup()" for one of our customers.  A traceback of the core dump
shows a zero pointer is being dereferenced after a call to "getpwuid_r()".
This problem can be reproduced by removing the NIS or LDAP entry for the
current user and issuing a "qsub".  A similar result used to happen with SGE
6.1u2 in "sge_uid2user()".  That problem was caused by the bug described in:

The fix suggested there is in SGE 6.1u7, and, after installing that, "qsub" no
longer core dumps at that location.  However, there are a few more places in
the SGE code that call "getpwuid_r()" without checking the result pointer
before dereferencing it.  So we now get a core dump from "sge_gdi_ctx_setup()".

Perhaps the best fix would be to add a "sge_getpwuid_r()" function that takes
care of retrying NIS lookups similar to "sge_getpwnam_r()".  Then all calls to
"getpwuid_r()" could use that.  Alternatively, additional error checks could
be added like below.  But that would not be as nice as including the retry
logic as in "sge_uid2user()".

--- gridengine/source/libs/gdi/sge_gdi_ctx.c-orig       2009-12-16 11:56:55.600927000 -0500
+++ gridengine/source/libs/gdi/sge_gdi_ctx.c    2010-01-25 17:18:22.418760000 -0500
@@ -695,7 +695,7 @@

       size = get_pw_buffer_size();
       buffer = sge_malloc(size);
-      if (getpwuid_r((uid_t)getuid(), &pwentry, buffer, size, &pwd) == 0) {
+      if (getpwuid_r((uid_t)getuid(), &pwentry, buffer, size, &pwd) == 0 && pwd) {
          es->component_username = sge_strdup(es->component_username, pwd->pw_name);
       } else {

--- gridengine/source/libs/uti/sge_prog.c-orig  2009-12-16 11:58:37.539885000 -0500
+++ gridengine/source/libs/uti/sge_prog.c       2010-01-25 17:17:35.430784000 -0500
@@ -552,7 +552,7 @@

       size = get_pw_buffer_size();
       buffer = sge_malloc(size);
-      SGE_ASSERT(getpwuid_r((uid_t)uti_state_get_uid(), &pwentry, buffer, size, &paswd) == 0)
+      SGE_ASSERT(getpwuid_r((uid_t)uti_state_get_uid(), &pwentry, buffer, size, &paswd) == 0 && paswd)
@@ -879,7 +879,7 @@
       struct passwd pwentry;
       thiz->set_uid(thiz, getuid());
       thiz->set_gid(thiz, getgid());
-      if ((getpwuid_r((uid_t)getuid(), &pwentry, buffer, sizeof(buffer), &paswd)) != 0) {
+      if ((getpwuid_r((uid_t)getuid(), &pwentry, buffer, sizeof(buffer), &paswd)) != 0 || !pw) {
          eh->error(eh, STATUS_EUNKNOWN, ANSWER_QUALITY_ERROR, "getpwuid failed");
          ret = false;
       } else {


Change History (1)

comment:1 Changed 7 years ago by dlove

  • Keywords patch added; removed
  • Severity set to minor
Note: See TracTickets for help on using tickets.