[GE users] 6.0u4 qmaster crashing

Richard Hierlmeier Richard.Hierlmeier at Sun.COM
Fri Dec 16 14:42:59 GMT 2005


Mike Brown wrote:
> I'm wondering the best way to debug the qmaster.  It seems that when I 
> try to set the debug level, only the qmaster (but not schedd) starts.
> Are there any other suggestions besides reading this output?  I'm 
> running 6.0u4 and noticing the qmaster dying every few days.  Nothing 
> serious appears in the schedd or qmaster output.  I've upgraded to 6.0u7 
> in case that will fix anything, but may still need to debug.

o Have you tried setting the debug level with the util/dl.csh or
   util/dl.sh script? The script sets the SGE_DEBUG_LEVEL variable.
   If it is set, the qmaster does not daemonize and, depending on the
   debug level, prints a lot of information to stdout.
Example:

# qconf -km
# source util/dl.csh 1
# dl 1
# echo $SGE_DEBUG_LEVEL
2 0 0 0 0 0 0 0
# default/common/sgemaster
    starting sge_qmaster
      0  30641 16384     ****** starting localization procedure ... **********
      1  30641 16384     could not get environment variable "GRIDPACKAGE"
      2  30641 16384     could not get environment variable "GRIDLOCALEDIR"
      3  30641 16384     environment LANGUAGE or LANG is not set; no language selected - using defaults
      4  30641 16384     setlocale() returns "C"
      5  30641 16384     locale directory: >/tools/testsuite/sge/locale<
      6  30641 16384     package file:     >lx26-x86/gridengine.mo<
      7  30641 16384     language (LANG):  >C<
...
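
Since the qmaster can run for days before it dies, it may help to keep
the debug output in a file instead of watching the terminal. A minimal
sketch, assuming a POSIX shell; run_with_log and the log path are
hypothetical names and not part of Grid Engine:

```shell
#!/bin/sh
# Hypothetical helper: run a command in the foreground and keep a copy
# of everything it prints (stdout and stderr) in a log file.
run_with_log() {
    logfile=$1
    shift
    # Merge stderr into stdout, show it on the terminal, and save it.
    "$@" 2>&1 | tee "$logfile"
}

# Assumed usage, after "dl 1" has set SGE_DEBUG_LEVEL in this shell:
#   run_with_log /tmp/qmaster-debug.log <qmaster start command from above>
```

The last lines of the log written before a crash are usually the most
interesting part.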


o Do you have a core file?

If you have a core file you can use a debugger like gdb to find out in 
what function the qmaster terminates:

# gdb $SGE_ROOT/bin/lx24-x86/sge_qmaster core
(gdb) where
... the stack trace will be printed.

You can send us the stack trace for further diagnosis.
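
If you want to get the stack trace without an interactive session, gdb
can also run in batch mode. A minimal sketch; print_backtrace is a
hypothetical helper, and the paths in the usage comment are only taken
from the example above:

```shell
#!/bin/sh
# Note: a core file is only written if the core size limit of the shell
# that starts the qmaster allows it (check with "ulimit -c").

# Hypothetical helper: print the stack trace of a core file and exit.
print_backtrace() {
    binary=$1
    core=$2
    if [ ! -r "$core" ]; then
        echo "no readable core file: $core" >&2
        return 1
    fi
    # -batch makes gdb exit after the commands given with -ex have run.
    gdb -batch -ex "where" "$binary" "$core"
}

# Assumed usage:
#   print_backtrace $SGE_ROOT/bin/lx24-x86/sge_qmaster core > stacktrace.txt
```

The resulting text file can be attached to a mail to the list.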

Richard


> Thanks!
> 
> Mike
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
> 


-- 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Richard Hierlmeier           Phone: ++49 (0)941 3075-223
Software Engineering         Fax:   ++49 (0)941 3075-222
Sun Microsystems GmbH
Dr.-Leo-Ritter-Str. 7         mailto: richard.hierlmeier at sun.com
D-93049 Regensburg           http://www.sun.com/grid
