[GE users] sge master dying

Rayson Ho rayrayson at gmail.com
Wed Jun 13 18:53:49 BST 2007



Use the gdb command "where" (an alias for "backtrace") once the SIGBUS has been reported, before quitting gdb, to show the stack trace...
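
For what it is worth, a minimal sketch of how the same steps could be scripted, so the trace is captured even if nobody is at the prompt when the crash happens. The command file name is made up for illustration, the binary path and PID are simply the ones from the session quoted below, and I am assuming the installed gdb accepts -batch and -x together with an attach by PID. "thread apply all bt" is added because qmaster is multi-threaded; it is not something the original session used.

```shell
#!/bin/sh
# Sketch only: write a gdb command file that resumes the attached process,
# waits for it to stop on the fatal signal, then prints the stack traces.
cmdfile=$(mktemp)   # hypothetical scratch file for the gdb commands

cat > "$cmdfile" <<'EOF'
continue
where
thread apply all bt
EOF

# Binary path and PID are taken from the quoted session; adjust both.
# The command is printed here rather than executed.
printf 'gdb -batch -x %s %s %s\n' \
    "$cmdfile" /common/sge/6.0u4/bin/lx24-x86/sge_qmaster 16569
```

Running the printed command as root and redirecting its output to a file should leave the backtrace of the crashing thread on disk. Without debug symbols the frames will only show function names such as hgroup_mod, but that is still enough to start with.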

Rayson



On 6/13/07, Iwona Sakrejda <isakrejda at lbl.gov> wrote:
> Here is what I see when it crashes while attached to gdb:
> [root at pc2533 debug]# gdb /common/sge/6.0u4/bin/lx24-x86/sge_qmaster 16569
> GNU gdb Red Hat Linux (6.1post-1.20040607.17rh)
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-redhat-linux-gnu"...Using host
> libthread_db library "/lib/tls/libthread_db.so.1".
>
> Attaching to program: /chos/software/sge/6.0u4/bin/lx24-x86/sge_qmaster,
> process 16569
> Reading symbols from /lib/libdl.so.2...done.
> Loaded symbols for /lib/libdl.so.2
> Reading symbols from /lib/tls/libm.so.6...done.
> Loaded symbols for /lib/tls/libm.so.6
> Reading symbols from /lib/tls/libpthread.so.0...done.
> [Thread debugging using libthread_db enabled]
> [New Thread -1220095328 (LWP 16569)]
> [New Thread -1317291088 (LWP 16704)]
> [New Thread -1306801232 (LWP 16703)]
> [New Thread -1296307280 (LWP 16702)]
> [New Thread -1285555280 (LWP 16701)]
> [New Thread -1265304656 (LWP 16575)]
> [New Thread -1254814800 (LWP 16574)]
> [New Thread -1244324944 (LWP 16573)]
> [New Thread -1233835088 (LWP 16572)]
> [New Thread -1223345232 (LWP 16570)]
> Loaded symbols for /lib/tls/libpthread.so.0
> Reading symbols from /lib/tls/libc.so.6...done.
> Loaded symbols for /lib/tls/libc.so.6
> Reading symbols from /lib/ld-linux.so.2...done.
> Loaded symbols for /lib/ld-linux.so.2
> Reading symbols from /lib/libnss_files.so.2...done.
> Loaded symbols for /lib/libnss_files.so.2
> Reading symbols from
> /chos/software/sge/6.0u4/lib/lx24-x86/libspoolc.so...done.
> Loaded symbols for /software/sge/6.0u4/lib/lx24-x86/libspoolc.so
> Reading symbols from /lib/libnss_dns.so.2...done.
> Loaded symbols for /lib/libnss_dns.so.2
> Reading symbols from /lib/libresolv.so.2...done.
> Loaded symbols for /lib/libresolv.so.2
> 0xb75acd58 in pthread_join () from /lib/tls/libpthread.so.0
> (gdb) cont
> Continuing.
>
> Program received signal SIGBUS, Bus error.
> [Switching to Thread -1317291088 (LWP 16704)]
> 0x0809a007 in hgroup_mod ()
> (gdb) quit
>
>
>
> Rayson Ho wrote:
> > Can you attach qmaster with a debugger, so that we can get the stack
> > trace when it dies??
> >
> > Rayson
> >
> >
> >
> > On 6/13/07, Iwona Sakrejda <isakrejda at lbl.gov> wrote:
> >> Hi,
> >>
> >> I am running SGE 6.0u4 on RHEL3 and it's been running OK for a year
> >> or so.
> >> Last week I tried qconf -mhgrp and this command repeatedly kills all the
> >> sge processes on the headnode. I connected with strace to the sgeadmin
> >> before it died and I only see:
> >> Process 16727 attached - interrupt to quit
> >> futex(0xb03bebf8, FUTEX_WAIT, 16822, NULL) = -1 EINTR (Interrupted
> >> system call)
> >> +++ killed by SIGBUS +++
> >>
> >> Nothing exciting in the logs, it's just going about its business...
> >>
> >> Suggestions on how to approach this problem would be appreciated...
> >>
> >> Thank You,
> >>
> >> iwona
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> >> For additional commands, e-mail: users-help at gridengine.sunsource.net
> >>
> >>
> >
>
>
>




