Custom Query (431 matches)
Results (34 - 36 of 431)
Ticket | Resolution | Summary | Owner | Reporter |
---|---|---|---|---|
#158 | fixed | IZ949: When qconf fails during installation, the diagnostic is incorrect | uddeborg | |
Description |
[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=949]

Issue #: 949
Platform: All
Reporter: uddeborg (uddeborg)
Component: gridengine
OS: All
Subcomponent: install
Version: 6.0beta
CC: None defined
Status: VERIFIED
Priority: P3
Resolution: FIXED
Issue type: DEFECT
Target milestone: ---
Assigned to: andy (andy)
QA Contact: dom
URL:
* Summary: When qconf fails during installation, the diagnostic is incorrect
Status whiteboard:
Attachments:
Issue 949 blocks:
Votes for issue 949:
Opened: Fri Apr 2 07:27:00 -0700 2004
------------------------

When I tried to install the beta on a host, the procedure failed during the "Checking hostname resolving" phase. After failing, it prints "The error message was:" and then the usage message from qconf.

Trying to trace this a little, I came across this code in CheckHostNameResolving() in inst_execd.sh

    $SGE_BIN/qconf -sh > /dev/null 2>&1
    if [ $? = 1 ]; then
       errmsg=`$SGE_BIN/qconf 2>&1`
    else
       errmsg=`$SGE_BIN/qconf -sh 2>&1 | grep denied:`
    fi

Here first qconf is run with the "-sh" flag. Then when it fails, it is run again, in order to capture the error message. But if the exit code was 1, it is run without the -sh flag, which seems like the bug. Ran in this way it does give the usage message, but it is not the error message it got (and discarded) in the first attempt.

------- Additional comments from andy Tue Apr 6 00:58:46 -0700 2004 -------
Fixed. Call "qconf -sh" in case of error.

------- Additional comments from uddeborg Wed Apr 28 04:27:21 -0700 2004 -------
I've now verified the fix in beta 2. (Am I, as a reporter, supposed to do this step? Should I be the one closing too/instead?) |
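One way to avoid the mismatch reported in IZ949 is to run the command a single time and keep both its output and its exit status, so the message shown to the user is the one the failing call actually produced. The following is a minimal sketch of that idea, not the change that was committed to inst_execd.sh; `run_once`, `errmsg`, and `rc` are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of inst_execd.sh): run a command exactly once
# and keep both its combined stdout/stderr and its exit status.
run_once() {
    # The exit status of an assignment whose value is a command substitution
    # is the exit status of the substituted command.
    errmsg=`"$@" 2>&1`
    rc=$?
}

# Possible use in an installer check (left as comments so that loading
# this file does not try to run qconf):
#
#   run_once "$SGE_BIN/qconf" -sh
#   if [ "$rc" -ne 0 ]; then
#       echo "The error message was:"
#       echo "$errmsg"
#   fi
```

Because the output is captured on the first call, there is no second invocation whose arguments could drift from the first.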
|||
#160 | fixed | IZ960: Buffer sent to getgrgid_r is too small | uddeborg | |
Description |
[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=960]

Issue #: 960
Platform: Sun
Reporter: uddeborg (uddeborg)
Component: gridengine
OS: Solaris
Subcomponent: kernel
Version: 6.0beta
CC: None defined
Status: VERIFIED
Priority: P3
Resolution: FIXED
Issue type: DEFECT
Target milestone: ---
Assigned to: adoerr (adoerr)
QA Contact: andreas
URL:
* Summary: Buffer sent to getgrgid_r is too small
Status whiteboard:
Attachments:
Issue 960 blocks:
Votes for issue 960:
Opened: Wed Apr 7 05:18:00 -0700 2004
------------------------

My attempt to install on Solaris failed. "qconf -sh" returned the error message:

    error: getgrgid(13) failed: No such file or directory

I tried to track this down. It appears to be because the buffer sent to getgrgid_r is too small. In the function sge_gid2group() in source/libs/uti/sge_uidgid.c there is a call of getgrgid_r with a buffer with a size of 2048. This call fails when I run it on our 64 bit Solaris machines.

According to the Solaris manual for getgrgid_r, the maximum size which could be needed can be found with the call sysconf(_SC_GETGR_R_SIZE_MAX). I tried this on a couple of platforms I have available here, and got those figures:

    Sparc, Solaris 8, 32 bit app: 7296
    Sparc, Solaris 8, 64 bit app: 10496
    PowerPC, AIX 5.2, 32 and 64 bit app: 20023
    PARisc, HP-UX 11, 32 and 64 bit app: 2048
    AMD64, Red Hat EL 3, 32 and 64 bit app: 1024
    IA32, Red Hat EL 3, 32 bit app: 1024

It varies quite a lot, and 2048 obviously is too small in several cases. We have some groups with rather many members, a bit over 100, which probably affects this. But not so many members that an application should break.

Preferably, I'd suggest allocating a buffer with a size taken from the return value of sysconf(). Otherwise, I would suggest to at least increase the static size by an order of magnitude.

------- Additional comments from andreas Tue May 4 03:17:28 -0700 2004 -------
There is more than one function where this needs to be changed.

------- Additional comments from adoerr Tue May 11 05:57:48 -0700 2004 -------
Reassign

------- Additional comments from adoerr Sat May 22 07:33:41 -0700 2004 -------
Fixed.

------- Additional comments from uddeborg Thu May 27 02:38:42 -0700 2004 -------
I've rebuilt locally with source/libs/uti/sge_uidgid.c taken from HEAD, and it seems to work fine now. |
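The IZ960 report suspects that groups with many members are what overflow the 2048-byte buffer. As a rough diagnostic, not a fix, the sketch below lists group entries whose text form is longer than a given limit (2048 by default, the size named in the report). It assumes a POSIX awk and input in the usual `name:passwd:gid:members` layout; `long_groups` is a hypothetical helper name. The space getgrgid_r needs typically also includes a pointer array for the member list, so the line length should be read only as a lower bound.

```shell
#!/bin/sh
# Hypothetical helper: read group entries (name:passwd:gid:members) on stdin
# and print "name length" for every entry longer than the given limit.
# The default limit of 2048 is the static buffer size mentioned in the report.
long_groups() {
    limit=${1:-2048}
    awk -F: -v limit="$limit" 'length($0) > limit + 0 { print $1, length($0) }'
}

# Possible use on a system that provides getent (left as a comment so that
# loading this file has no side effects):
#
#   getent group | long_groups 2048
```

An empty result does not prove the buffer is large enough, since the member pointers are not counted, but any group it does print is a likely candidate for the failure.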
|||
#162 | fixed | IZ963: Can't run HPUX binaries | jeffbeadles | |
Description |
[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=963]

Issue #: 963
Platform: HP
Reporter: jeffbeadles (jeffbeadles)
Component: gridengine
OS: HP-UX
Subcomponent: install
Version: 6.0beta
CC: None defined
Status: VERIFIED
Priority: P1
Resolution: FIXED
Issue type: DEFECT
Target milestone: ---
Assigned to: andy (andy)
QA Contact: dom
URL:
* Summary: Can't run HPUX binaries
Status whiteboard:
Attachments:
Issue 963 blocks:
Votes for issue 963:
Opened: Thu Apr 8 09:39:00 -0700 2004
------------------------

I mailed this to the dev list, but wanted to make sure it's tracked and will be fixed in the next release.

When trying to run install_execd;

    "
    ./inst_sge[131]: 13757 Abort
    /usr/lib/dld.sl: Can't open shared library: /vol2/tools/SW/openssl-0.9.7c/hp11/lib/libcrypto.sl.0.9.7
    /usr/lib/dld.sl: No such file or directory
    ./inst_sge[57]: 13776 Abort(coredump)
    Command failed: ./bin/hp11/qconf -sh
    Probably a permission problem. Please check file access permissions.
    Check read/write permission. Check if SGE daemons are running.
    "

With a message like that, I know that it's not a permission problem, but rather a missing shared library.

So;

    $ chatr bin/hp11/qconf
    bin/hp11/qconf:
        shared executable
        shared library dynamic path search:
            SHLIB_PATH disabled second
            embedded path disabled first Not Defined
        shared library list:
            dynamic /vol2/tools/SW/openssl-0.9.7c/hp11/lib/libcrypto.sl.0.9.7
            dynamic /usr/lib/libnsl.1
            dynamic /usr/lib/libm.2
            dynamic /usr/lib/libpthread.1
            dynamic /usr/lib/libc.2
        shared library binding:
            deferred
        global hash table disabled
        plabel caching disabled
        global hash array size:1103
        global hash array nbuckets:3
        shared vtable support disabled
        static branch prediction disabled
        executable from stack: D (default)
        kernel assisted branch prediction enabled
        lazy swap allocation disabled
        text segment locking disabled
        data segment locking disabled
        third quadrant private data space disabled
        fourth quadrant private data space disabled
        third quadrant global data space disabled
        data page size: D (default)
        instruction page size: D (default)
        nulptr references disabled
        shared library private mapping disabled
        shared library text merging disabled

The problem is the line that reads;

    " SHLIB_PATH disabled second"

This says to not look in SHLIB_PATH for the shared libraries.

To fix this, change the HP makefile/build rules and add a "+s" to the link arguments for all of the executables that need things from $SHLIB_PATH. It can also be modified post-build by running

    $ chatr +s enable bin/hp11/*

(I've verified that this works, at least on HPUX 11.11)

Regards,
-Jeff

------- Additional comments from andy Mon Apr 19 05:01:01 -0700 2004 -------
Fixed in Beta2. Need feedback from user if it works

------- Additional comments from andy Mon Apr 19 05:02:17 -0700 2004 -------
.

------- Additional comments from jeffbeadles Tue Jul 13 07:31:50 -0700 2004 -------
Works great in the 6.0 "production" release. Thanks Andy! |