Opened 11 years ago
Closed 10 years ago
#748 closed defect (duplicate)
IZ3192: double free corruption when getting groupid of ldap users
Reported by: | jdprasad | Owned by: | |
---|---|---|---|
Priority: | high | Milestone: | |
Component: | sge | Version: | 6.2u4 |
Severity: | minor | Keywords: | Linux execution |
Cc: |
Description
[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=3192]
Issue #: 3192
Platform: All
Reporter: jdprasad (jdprasad)
Component: gridengine
OS: Linux
Subcomponent: execution
Version: 6.2u4
CC: None defined
Status: REOPENED
Priority: P2
Resolution:
Issue type: DEFECT
Target milestone: ---
Assigned to: pollinger (pollinger)
QA Contact: pollinger
URL: http://d
* Summary: double free corruption when getting groupid of ldap users
Status whiteboard:
Attachments:
Issue 3192 blocks:
Votes for issue 3192:

Opened: Mon Nov 23 09:32:00 -0700 2009
------------------------

OS: SuSE Linux Enterprise Server 11 (SLES 11)

Scenario:
---------
test is an LDAP user and testgr is an LDAP group.

root@slest-test:> su - test
test@sles-test:~> id
uid=1003(test) gid=22222(testgr) groups=22222(testgr)
test@sles-test:~> qstat
*** glibc detected *** qstat: double free or corruption (out): 0x00002aaaaab0e180 ***
======= Backtrace: =========
/lib64/libc.so.6[0x2aaaab3b5118]
/lib64/libc.so.6(cfree+0x76)[0x2aaaab3b6c76]
/lib64/libnss_ldap.so.2[0x2aaaabcd14fb]
/lib64/libnss_ldap.so.2[0x2aaaabcd18f7]
/lib64/libnss_ldap.so.2[0x2aaaabccfc22]
/lib64/libnss_ldap.so.2(_nss_ldap_getgrgid_r+0x53)[0x2aaaabcd0313]
/lib64/libnss_compat.so.2[0x2aaaab8a9b6b]
/lib64/libnss_compat.so.2(_nss_compat_getgrgid_r+0xf8)[0x2aaaab8a9d28]
/lib64/libc.so.6(getgrgid_r+0xec)[0x2aaaab3e047c]
qstat[0x531048]
qstat[0x531e8b]
qstat[0x469ab6]
qstat[0x469eb5]
qstat[0x40ae27]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x2aaaab35f586]
qstat[0x407089]
.... ..... ......

test@sles-test:~> gdb `which qstat` core.26708
(gdb) bt
#0 0x00002aaaab373645 in raise () from /lib64/libc.so.6
#1 0x00002aaaab374c33 in abort () from /lib64/libc.so.6
#2 0x00002aaaab3af8e8 in ?? () from /lib64/libc.so.6
#3 0x00002aaaab3b5118 in ?? () from /lib64/libc.so.6
#4 0x00002aaaab3b6c76 in free () from /lib64/libc.so.6
#5 0x00002aaaabcd14fb in ?? () from /lib64/libnss_ldap.so.2
#6 0x00002aaaabcd18f7 in ?? () from /lib64/libnss_ldap.so.2
#7 0x00002aaaabccfc22 in ?? () from /lib64/libnss_ldap.so.2
#8 0x00002aaaabcd0313 in _nss_ldap_getgrgid_r () from /lib64/libnss_ldap.so.2
#9 0x00002aaaab8a9b6b in ?? () from /lib64/libnss_compat.so.2
#10 0x00002aaaab8a9d28 in _nss_compat_getgrgid_r () from /lib64/libnss_compat.so.2
#11 0x00002aaaab3e047c in getgrgid_r () from /lib64/libc.so.6
#12 0x0000000000531048 in sge_getgrgid_r ()
#13 0x0000000000531e8b in sge_gid2group ()
#14 0x0000000000469ab6 in sge_setup2 ()
#15 0x0000000000469eb5 in sge_gdi2_setup ()
#16 0x000000000040ae27 in main ()

The malloc check can be disabled by setting the environment variable

export MALLOC_CHECK_=0

after which the command succeeds.

The gdb backtrace shows that the crash occurs inside the NSS library calls (nss_compat calling into nss_ldap), below getgrgid_r().

The following patches to SGE temporarily work around the issue, but I hope there will be a permanent fix.

--- sge-6.2u4/gridengine/source/daemons/shepherd/sge_shepherd_ijs.c 2009-07-10 17:59:17.000000000 +0200
+++ sge-6.2u4-new/gridengine/source/daemons/shepherd/sge_shepherd_ijs.c 2009-11-23 13:11:50.000000000 +0100
@@ -747,7 +747,7 @@
    THREAD_HANDLE *thread_pty_to_commlib = NULL;
    THREAD_HANDLE *thread_commlib_to_pty = NULL;
    cl_raw_list_t *cl_com_log_list = NULL;
-
+   setenv("MALLOC_CHECK_", "0", 1);
    shepherd_trace("parent: starting parent loop with remote_host = %s, "
                   "remote_port = %d, job_owner = %s, fd_pty_master = %d, "
                   "fd_pipe_in = %d, fd_pipe_out = %d, "
----------------------------------------------------------------------------------------------------
--- sge-6.2u3/gridengine/source/libs/gdi/sge_gdi_ctx.c 2009-01-22 17:03:50.000000000 +0100
+++ sge-6.2u3_new/gridengine/source/libs/gdi/sge_gdi_ctx.c 2009-11-23 11:58:13.000000000 +0100
@@ -1875,6 +1875,7 @@
    u_long32 sge_execd_port = 0;
    bool from_services = false;
+   setenv("MALLOC_CHECK_", "0", 1);
    DENTER(TOP_LAYER, "sge_setup2");
    if (context == NULL) {
@@ -1948,7 +1949,7 @@
 {
    int ret = AE_OK;
    bool alpp_was_null = true;
-
+   setenv("MALLOC_CHECK_", "0", 1);
    DENTER(TOP_LAYER, "sge_gdi2_setup");
    if (context_ref && sge_gdi_ctx_is_setup(*context_ref)) {
@@ -1980,6 +1981,7 @@
    sge_prog_state_class_t* prog_state = thiz->get_sge_prog_state(thiz);
    int ret = CL_RETVAL_OK;
+   setenv("MALLOC_CHECK_", "0", 1);
    DENTER(TOP_LAYER, "gdi2_reresolve_qualified_hostname");
    ret=getuniquehostname(prog_state->get_qualified_hostname(prog_state), unique_hostname, 0);

This is probably not the ideal solution.

------- Additional comments from shaas Tue Nov 24 05:43:53 -0700 2009 -------
This is the same problem that is mentioned in issue 3193.

*** This issue has been marked as a duplicate of 3193 ***

------- Additional comments from jdprasad Mon Aug 16 04:34:32 -0700 2010 -------
This problem apparently still exists in sge 6.2u5.

OS: SuSE Linux Enterprise Server (SLES11)
Kernel: 2.6.27.19-5-default
openldap:
openldap2-devel-2.4.12-7.16
openldap-db4-4.6.21-47_cm5.1
openldap-servers-2.4.22-47_cm5.1
openldap2-client-2.4.12-7.16

But this time it is not a double free or corruption:

*** glibc detected *** qstat: free(): invalid pointer: 0x00002aaaaab19c00 ***
======= Backtrace: =========
/lib64/libc.so.6[0x2aaaab3b5118]
/lib64/libc.so.6(cfree+0x76)[0x2aaaab3b6c76]
/usr/lib64/libldap-2.4.so.2(ldap_pvt_get_fqdn+0x6a)[0x2aaaabf1221a]
/usr/lib64/libldap-2.4.so.2(ldap_int_initialize+0x57)[0x2aaaabf108b7]
/usr/lib64/libldap-2.4.so.2(ldap_create+0x26)[0x2aaaabef6816]
/usr/lib64/libldap-2.4.so.2(ldap_initialize+0x2f)[0x2aaaabef6dff]
/lib64/libnss_ldap.so.2[0x2aaaabccc327]
/lib64/libnss_ldap.so.2[0x2aaaabccf411]
/lib64/libnss_ldap.so.2[0x2aaaabccfbcf]
/lib64/libnss_ldap.so.2(_nss_ldap_getpwuid_r+0x49)[0x2aaaabcd01b9]
/lib64/libnss_compat.so.2[0x2aaaab8aaab8]
/lib64/libnss_compat.so.2[0x2aaaab8aacad]
/lib64/libnss_compat.so.2(_nss_compat_getpwuid_r+0x100)[0x2aaaab8ab040]
/lib64/libc.so.6(getpwuid_r+0xec)[0x2aaaab3e1cfc]
qstat[0x54d032]
qstat[0x4704b1]
qstat[0x47090c]
qstat[0x40b9d6]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x2aaaab35f586]
... ...

gdb `which qstat` core.29665
(gdb) bt
#0 0x00002aaaab373645 in raise () from /lib64/libc.so.6
#1 0x00002aaaab374c33 in abort () from /lib64/libc.so.6
#2 0x00002aaaab3af8e8 in ?? () from /lib64/libc.so.6
#3 0x00002aaaab3b5118 in ?? () from /lib64/libc.so.6
#4 0x00002aaaab3b6c76 in free () from /lib64/libc.so.6
#5 0x00002aaaabf1221a in ldap_pvt_get_fqdn () from /usr/lib64/libldap-2.4.so.2
#6 0x00002aaaabf108b7 in ldap_int_initialize () from /usr/lib64/libldap-2.4.so.2
#7 0x00002aaaabef6816 in ldap_create () from /usr/lib64/libldap-2.4.so.2
#8 0x00002aaaabef6dff in ldap_initialize () from /usr/lib64/libldap-2.4.so.2
#9 0x00002aaaabccc327 in ?? () from /lib64/libnss_ldap.so.2
#10 0x00002aaaabccf411 in ?? () from /lib64/libnss_ldap.so.2
#11 0x00002aaaabccfbcf in ?? () from /lib64/libnss_ldap.so.2
#12 0x00002aaaabcd01b9 in _nss_ldap_getpwuid_r () from /lib64/libnss_ldap.so.2
#13 0x00002aaaab8aaab8 in ?? () from /lib64/libnss_compat.so.2
#14 0x00002aaaab8aacad in ?? () from /lib64/libnss_compat.so.2
#15 0x00002aaaab8ab040 in _nss_compat_getpwuid_r () from /lib64/libnss_compat.so.2
#16 0x00002aaaab3e1cfc in getpwuid_r () from /lib64/libc.so.6
#17 0x000000000054d032 in sge_uid2user ()
#18 0x00000000004704b1 in sge_setup2 ()
#19 0x000000000047090c in sge_gdi2_setup ()
#20 0x000000000040b9d6 in main ()
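Both backtraces enter the NSS stack through a plain libc call (getgrgid_r() in the first report, getpwuid_r() in the second), so it may help to make the same two calls from a small program that does not link against SGE at all. The following is a minimal sketch for that purpose, not code from SGE: the helper names lookup_group, lookup_user and report_ids, the starting buffer size and the 1 MiB cap are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_BUF (1u << 20)   /* illustrative upper bound for the scratch buffer */

/* Grow *buf to new_len bytes; on failure free the old buffer and return -1. */
static int grow(char **buf, size_t new_len)
{
    char *tmp = realloc(*buf, new_len);
    if (tmp == NULL) {
        free(*buf);
        *buf = NULL;
        return -1;
    }
    *buf = tmp;
    return 0;
}

/* Resolve gid to a group name with getgrgid_r(); 0 on success, -1 otherwise. */
int lookup_group(gid_t gid, char *name, size_t name_len)
{
    struct group grp, *res = NULL;
    char *buf = NULL;
    size_t len = 1024;
    int rc, ok;

    assert(name != NULL && name_len > 0);
    do {
        if (grow(&buf, len) != 0)
            return -1;
        rc = getgrgid_r(gid, &grp, buf, len, &res);
        len *= 2;                       /* only used if the call reported ERANGE */
    } while (rc == ERANGE && len <= MAX_BUF);

    ok = (rc == 0 && res != NULL);
    if (ok)
        snprintf(name, name_len, "%s", res->gr_name);
    free(buf);
    return ok ? 0 : -1;
}

/* Resolve uid to a user name with getpwuid_r(); 0 on success, -1 otherwise. */
int lookup_user(uid_t uid, char *name, size_t name_len)
{
    struct passwd pwd, *res = NULL;
    char *buf = NULL;
    size_t len = 1024;
    int rc, ok;

    assert(name != NULL && name_len > 0);
    do {
        if (grow(&buf, len) != 0)
            return -1;
        rc = getpwuid_r(uid, &pwd, buf, len, &res);
        len *= 2;
    } while (rc == ERANGE && len <= MAX_BUF);

    ok = (rc == 0 && res != NULL);
    if (ok)
        snprintf(name, name_len, "%s", res->pw_name);
    free(buf);
    return ok ? 0 : -1;
}

/* Look up the calling user's own uid and gid and print the results to out. */
int report_ids(FILE *out)
{
    char user[256], group[256];
    int u = lookup_user(getuid(), user, sizeof user);
    int g = lookup_group(getgid(), group, sizeof group);

    fprintf(out, "uid=%ld user=%s\n", (long)getuid(), u == 0 ? user : "(lookup failed)");
    fprintf(out, "gid=%ld group=%s\n", (long)getgid(), g == 0 ? group : "(lookup failed)");
    return (u == 0 && g == 0) ? 0 : -1;
}
```

To try it, add a main() that returns report_ids(stdout) and run the binary as the LDAP user, with MALLOC_CHECK_ unset. If it aborts with the same glibc message, that would suggest the fault lies in the nss_ldap/libldap layer or its interaction with the allocator rather than in SGE's own code. A clean run would not rule SGE out, since qstat may link or load libraries differently.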
Change History (1)
comment:1 Changed 10 years ago by dlove
- Resolution set to duplicate
- Severity set to minor
- Status changed from new to closed
Duplicate of IZ3193.