[GE users] sge_shepherd segfaults

kisielk kamil at zymeworks.com
Thu Jan 14 18:09:52 GMT 2010


> I'm still trying to get SGE running on OpenSUSE 11.1. I'm using the CVS version to work around the LDAP user problems I was having earlier.
> 
> I now have job submission working for all my users; however, whenever a job starts running, sge_shepherd immediately segfaults.
> 
> From dmesg:
> sge_shepherd[21485]: segfault at 7f7878000000 ip 00007f787a630bc7 sp 00007fff833bc980 error 4 in libc-2.9.so[7f787a5bb000+14f000]
> 
> From the qmaster messages file:
> 01/13/2010 15:42:19|worker|demo|W|job 7.1 failed on host demo.lan.zymeworks.com assumedly after job because: job 7.1 died through signal SEGV (11)
> 
> 
> I tried compiling with -no-opt and -debug but was still unable to get any more information than what's above.
> 
> How can I go about debugging this problem?

After compiling SGE with -no-jemalloc, everything appears to work. The segfault appears to be related to the use of jemalloc, much like the problem I had earlier with nss_ldap.
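
For anyone hitting a similar crash, the dmesg line quoted above already carries a little more information than it first appears to. The sketch below decodes it, assuming the usual x86-64 Linux "segfault at ... ip ... sp ... error ..." format and the standard x86 page-fault error bits (bit 0: protection violation vs. page not mapped, bit 1: write vs. read, bit 2: user vs. kernel mode). The function name decode_segfault and the dictionary keys are made up for illustration; they are not part of SGE or any library.

```python
import re

# Pattern for the x86-64 Linux segfault message as it appears in dmesg.
# The trailing "in <object>[<base>+<size>]" part is optional.
SEGFAULT_RE = re.compile(
    r"(?P<proc>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>[^\s\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_segfault(line):
    """Decode one dmesg segfault line into a dict (hypothetical helper)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")

    err = int(m.group("err"), 16)  # the kernel prints the error code in hex
    info = {
        "process": m.group("proc"),
        "pid": int(m.group("pid")),
        "fault_addr": int(m.group("addr"), 16),
        # x86 page-fault error code bits
        "cause": "protection violation" if err & 1 else "page not mapped",
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
    }
    if m.group("obj"):
        # Offset of the faulting instruction inside the shared object:
        # instruction pointer minus the object's load address.
        info["object"] = m.group("obj")
        info["offset"] = int(m.group("ip"), 16) - int(m.group("base"), 16)
    return info


line = ("sge_shepherd[21485]: segfault at 7f7878000000 ip 00007f787a630bc7 "
        "sp 00007fff833bc980 error 4 in libc-2.9.so[7f787a5bb000+14f000]")
info = decode_segfault(line)
print(info["object"], hex(info["offset"]), info["mode"], info["access"], info["cause"])
```

For the line above this gives offset 0x75bc7 inside libc-2.9.so, and error 4 decodes to a user-mode read of an unmapped page. With debug symbols for libc installed, that offset can be passed to a tool such as addr2line (for example, addr2line -e followed by the path to libc-2.9.so and the offset) to find the function involved; the location of the library and the availability of symbols depend on the system.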

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=238803
