[GE users] DRMAA Segmentation Fault

crei crei at sun.com
Tue Jul 14 15:04:14 BST 2009


Hi,

OK. In my modified source tree I can see a "TODO" that mentions a wrong
locking order in the function cl_commlib_handle_connection_read(), but
that would normally cause a deadlock. I must have added the "TODO" while
digging around in the code ;-)
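
To illustrate what a locking-order problem means in general, here is a
minimal sketch. It is not commlib code: the lock names, the helper
lock_pair() and the function run_both_workers() are all made up for this
example. Two threads need the same two mutexes but ask for them in
opposite order. If each simply locked "its" first mutex, they could
block each other forever; taking the pair in one fixed order (lower
address first) avoids that.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define ITERATIONS 100000

/* Hypothetical stand-ins for two locks; NOT the real commlib names. */
static pthread_mutex_t connection_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t message_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

/* Lock two distinct mutexes in one global order (lower address first),
 * no matter in which order the caller names them. */
static void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    assert(a != b); /* locking the same mutex twice would self-deadlock */
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

/* The unlock order does not matter for deadlock avoidance. */
static void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

/* Names the connection lock first. */
static void *worker_connection_first(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        lock_pair(&connection_lock, &message_lock);
        shared_counter++;
        unlock_pair(&connection_lock, &message_lock);
    }
    return NULL;
}

/* Names the message lock first, i.e. the opposite order. */
static void *worker_message_first(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        lock_pair(&message_lock, &connection_lock);
        shared_counter++;
        unlock_pair(&message_lock, &connection_lock);
    }
    return NULL;
}

/* Runs both workers to completion and returns the final counter,
 * or -1 if a thread could not be created. */
long run_both_workers(void)
{
    pthread_t t1, t2;

    shared_counter = 0;
    if (pthread_create(&t1, NULL, worker_connection_first, NULL) != 0) {
        return -1;
    }
    if (pthread_create(&t2, NULL, worker_message_first, NULL) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return shared_counter;
}
```

Call run_both_workers() from a small main() and build with -pthread. If
lock_pair() did not reorder its arguments, the two workers could hang,
which is the kind of symptom I would expect from a locking-order bug.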

If the crash happens during the startup phase, it might also be caused by
code that runs without locking while commlib is starting up its threads.

Q1: Is it possible to also get the debug info of the other commlib threads?

I will work on this commlib "TODO", but it is currently not at the top of
my list, since no problem has been reported until now ...

Q2: Does this DRMAA application crash every time, or only sporadically?

Regards,

Christian


On 07/14/09 15:23, templedf wrote:
> IZ2882 is a fix for a different issue.  Someone suggested that perhaps 
> the problem was that the init and exit were out of order, which is what 
> IZ2882 fixed, but Zhizhong said that was not the problem in this case.
> 
> Daniel
> 
> crei wrote:
>> Hi,
>>
>> Is this still a problem or not? Daniel wrote that the issue is fixed with 62u2:
>>
>> snip
>> ...
>> Which version do you use? The bug was already fixed for 62u2.
>> It is known as IZ 2882:
>> http://gridengine.sunsource.net/issues/show_bug.cgi?id=2882
>> I will have a look if your other examples are still valid.
>> Thanks for reporting it!
>> ...
>> snap
>>
>> What's the state here?
>>
>> Reisi
>>
>>
>> On 07/13/09 18:03, templedf wrote:
>>   
>>> Looks like a commlib or event client issue.  Reisi, does this look 
>>> familiar to you?  It's happening when the DRMAA client tries to register 
>>> with the master as an event client.
>>>
>>> Daniel
>>>
>>> zliu wrote:
>>>     
>>>> Hi,
>>>>
>>>> We have a Java application that uses DRMAA-Java. Our SGE version is 6.2u1. Recently our application crashed several times without any obvious cause. Looking into the core dump, we found:
>>>>
>>>> Program terminated with signal 11, Segmentation fault.
>>>>
>>>> #0  0x0000002ba3c4fd80 in cl_message_list_get_first_elem () from /ifshome/sge6.2u1/lib/lx24-amd64/libdrmaa.so.1.0
>>>> #1  0x0000002ba3c60120 in cl_commlib_app_message_queue_cleanup () from /ifshome/sge6.2u1/lib/lx24-amd64/libdrmaa.so.1.0
>>>> #2  0x0000002ba3c5e724 in cl_com_handle_service_thread () from /ifshome/sge6.2u1/lib/lx24-amd64/libdrmaa.so.1.0
>>>> #3  0x0000003321a06137 in start_thread () from /lib64/tls/libpthread.so.0
>>>> #4  0x00000033211c9883 in clone () from /lib64/tls/libc.so.6
>>>>
>>>> Can anyone help?
>>>>
>>>> Thanks,
>>>> Zhizhong
>>>>
>>>> ------------------------------------------------------
>>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=206042
>>>>
>>>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>>>>
>>>>       
>>
> 

-- 
Sun Microsystems GmbH             Christian Reissmann
Dr.-Leo-Ritter-Str. 7             Software Engineer
D-93049 Regensburg                Phone: +49 (0)941 3075 112
Germany                           Fax:   +49 (0)941 3075 222
http://www.sun.de                 mailto: Christian.Reissmann at sun.com
                                   http://www.sun.com/gridengine
Sitz der Gesellschaft:
Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Thomas Schroeder, Wolfgang Engels, Wolf Frenkel
Vorsitzender des Aufsichtsrates: Martin Haering

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=207134

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].