[GE issues] [Issue 3194] sge_shepherd segfault on OpenSuSE 11.2 (x86_64)

megware stephan.ebelt at megware.com
Fri Nov 27 10:56:47 GMT 2009


User megware changed the following:

                What    |Old value                 |New value
                  Status|RESOLVED                  |REOPENED
              Resolution|WORKSFORME                |

------- Additional comments from megware at sunsource.net Fri Nov 27 02:56:46 -0800 2009 -------
I modified /etc/init.d/sgeexecd to set MALLOC_CHECK_ right before sge_execd is invoked and tried two approaches:

      export MALLOC_CHECK_=0


      MALLOC_CHECK_=0 $bin_dir/sge_execd

Neither changed the behaviour, i.e. the job still dies, a SIGSEGV is reported in the qmaster log, and a segfault line
appears in dmesg on the node.
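
To rule out the variable simply not reaching the child process, the two forms can be checked in isolation. This is only a sketch: `sh -c` stands in for sge_execd, and the names `child`, `out_export` and `out_prefix` are made up for illustration and are not part of the init script.

```shell
#!/bin/sh
# Stand-in for sge_execd (assumption): a child shell that reports
# the MALLOC_CHECK_ value it inherited, or "unset" if it got none.
child='echo "MALLOC_CHECK_=${MALLOC_CHECK_-unset}"'

# Approach 1: export the variable, then start the child.
# The command substitution runs in a subshell, so the export does not
# leak into the rest of this script.
out_export=$(export MALLOC_CHECK_=0; sh -c "$child")

# Approach 2: prefix assignment, visible to this one command only.
out_prefix=$(MALLOC_CHECK_=0 sh -c "$child")

echo "export: $out_export"
echo "prefix: $out_prefix"
```

Both lines should report MALLOC_CHECK_=0. On a node, the environment of the running daemon itself can be inspected on Linux by reading /proc/<pid>/environ for the sge_execd process; whether sge_shepherd then inherits that environment is a separate question this sketch does not answer.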

Interestingly, I have so far failed to reproduce this on a virtual machine installation here. The main difference from the cluster is that the
cluster has a more complex setup involving NIS and automount for user homes (I can't really simulate this). Could that be related?


