[GE issues] [Issue 2844] New - Deleting a tightly integrated job can result in an execd crash

roland roland.dittel at sun.com
Thu Dec 18 16:12:20 GMT 2008


http://gridengine.sunsource.net/issues/show_bug.cgi?id=2844
                 Issue #|2844
                 Summary|Deleting a tightly integrated job can result in an execd crash
               Component|gridengine
                 Version|6.2beta
                Platform|Sun
                     URL|
              OS/Version|All
                  Status|NEW
       Status whiteboard|
                Keywords|
              Resolution|
              Issue type|DEFECT
                Priority|P3
            Subcomponent|execution
             Assigned to|pollinger
             Reported by|roland






------- Additional comments from roland at sunsource.net Thu Dec 18 08:12:20 -0800 2008 -------
After I deleted a tightly integrated job with running slave tasks, all of my
execds were gone. Running the execution daemon under gdb gave this output:

tight integration job crashes execd
12/12/2008 10:35:39 [142302:243]: shepherd called with uid = xxx, euid = xxx
12/12/2008 10:35:39 [142302:244]: shepherd called with uid = xxx, euid = xxx
SIGNAL jid: 33 jatask: 1 signal: KILL
error: commlib error: got read error (closing "xxx/qrsh_ijs/1")
error: commlib error: got read error (closing "xxx/qrsh_ijs/1")
error: commlib error: can't connect to service (Connection refused)
error: commlib error: got select error (Unknown error: 0)
error: commlib error: can't connect to service (Connection refused)
error: can't remove directory "active_jobs/33.1": ====================
recursive_rmdir() failed

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0xc0080f13
0x000bfbf4 in cull_hash_next ()
(gdb) where
#0  0x000bfbf4 in cull_hash_next ()
#1  0x000182ea in cleanup_job_report ()
#2  0x00023554 in remove_acked_job_exit ()
#3  0x00018a01 in do_ack ()
#4  0x00006035 in sge_execd_process_messages ()
#5  0x00002f32 in main ()
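
gdb prints the innermost frame first, so the call path that led to the crash reads from the bottom of the backtrace upward. The following is only a small sketch for turning such a backtrace into an outermost-first call chain; the helper name call_chain and the regular expression are illustrative assumptions and not part of Grid Engine or gdb.

```python
import re

# Backtrace copied verbatim from the gdb session above.
BACKTRACE = """\
#0  0x000bfbf4 in cull_hash_next ()
#1  0x000182ea in cleanup_job_report ()
#2  0x00023554 in remove_acked_job_exit ()
#3  0x00018a01 in do_ack ()
#4  0x00006035 in sge_execd_process_messages ()
#5  0x00002f32 in main ()
"""

# Matches lines of the form "#N  0xADDR in function ()".
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-fA-F]+ in (\w+) \(")


def call_chain(backtrace):
    """Return function names ordered from outermost to innermost frame.

    Hypothetical helper for illustration only.
    """
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append((int(match.group(1)), match.group(2)))
    # Frame #0 is the innermost frame, so sort by frame number descending.
    return [name for _, name in sorted(frames, reverse=True)]


chain = call_chain(BACKTRACE)
print(" -> ".join(chain))
```

Judging only by the function names, the fault appears to occur while execd handles an acknowledgement (do_ack) and cleans up the job report for the exited job, with the invalid memory access inside cull_hash_next.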

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=36&dsMessageId=93223
