Opened 6 years ago

#1468 new defect

Setting -notify on a job makes Grid Engine run its default termination rather than the configured terminate_method

Reported by: wish
Owned by:
Priority: normal
Milestone:
Component: sge
Version: 6.2u3
Severity: minor
Keywords:
Cc:

Description

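A job submitted with -notify does not invoke the custom terminate_method. The shepherd sends USR2 as the notification and queues a delayed KILL, and the terminate_method is never started:

07/11/2013 12:38:30 [0:2488]: forward_signal_to_job(): mapping signal 20 TSTP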
07/11/2013 12:38:30 [0:2488]: mapped signal TSTP to signal KILL
07/11/2013 12:38:30 [0:2488]: queued signal KILL
07/11/2013 12:38:30 [0:2488]: kill(-2553, USR2) -> notification for delayed (60 s) signal KILL
07/11/2013 12:38:30 [0:2488]: now sending signal USR2 to pid -2553
07/11/2013 12:38:30 [0:2488]: pdc_kill_addgrpid: 20381 12
07/11/2013 12:38:30 [0:2488]: killing pid 2553/6
07/11/2013 12:38:30 [0:2488]: killing pid 32513/6
07/11/2013 12:38:30 [0:2488]: killing pid 32528/6
07/11/2013 12:40:02 [0:2488]: wait3 returned -1
07/11/2013 12:40:02 [0:2488]: forward_signal_to_job(): mapping signal 20 TSTP
07/11/2013 12:40:02 [0:2488]: mapped signal TSTP to signal KILL
07/11/2013 12:40:02 [0:2488]: queued signal KILL
07/11/2013 12:40:02 [0:2488]: ignoring repeated KILL notification
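
The USR2 in the trace above is the warning a job submitted with -notify receives ahead of the delayed KILL. As a rough illustration of what the job side sees, here is a minimal sketch of a job script that catches that signal. It is not part of the original report: the names on_notify and NOTIFIED are hypothetical, and the script signals itself only so the handler can be observed without a running Grid Engine.

```shell
#!/bin/sh
# Hypothetical job script; on_notify and NOTIFIED are illustrative names.
NOTIFIED=0

# Handler for the USR2 warning that precedes the delayed KILL under -notify.
on_notify() {
    NOTIFIED=1
    echo "received USR2: KILL expected after the notify interval"
}
trap on_notify USR2

# Stand-in for the real workload: the script sends USR2 to itself here,
# which is an assumption made purely to demonstrate the handler.
kill -USR2 $$
echo "NOTIFIED=$NOTIFIED"
```

Because the trace shows the signal going to a negative pid (the whole process group), child processes that do not handle USR2 themselves would presumably receive it as well.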

By contrast, an otherwise identical job without -notify converts the kill into an invocation of the terminate_method:
07/11/2013 10:33:40 [0:19045]: forward_signal_to_job(): mapping signal 20 TSTP
07/11/2013 10:33:40 [0:19045]: mapped signal TSTP to signal KILL
07/11/2013 10:33:40 [0:19045]: queued signal KILL
07/11/2013 10:33:40 [0:19045]: /cm/shared/apps/sge/assist/test/bin/terminate_method -> overriddes kill(-19198, KILL)
07/11/2013 10:33:40 [0:31606]: starting terminate_method command: /cm/shared/apps/sge/assist/test/bin/terminate_method
07/11/2013 10:33:40 [1001:31606]: start_as_command: pre_args_ptr[0] = argv0; "/cm/shared/apps/sge/assist/test/bin/terminate_method" shell_path = "/bin/ksh"
07/11/2013 10:33:40 [1001:31606]: execvp(/bin/ksh, "/cm/shared/apps/sge/assist/test/bin/terminate_method" "-c" "/cm/shared/apps/sge/assist/test/bin/terminate_method")
07/11/2013 10:33:40 [1001:31606]: not a GUI job, starting directly
07/11/2013 10:33:40 [0:19045]: wait3 returned 19198 (status: 15; WIFSIGNALED: 1, WIFEXITED: 0, WEXITSTATUS: 0)
07/11/2013 10:33:40 [0:19045]: job exited with exit status 0
07/11/2013 10:33:45 [0:19045]: wait3 returned 31606 (status: 0; WIFSIGNALED: 0, WIFEXITED: 1, WEXITSTATUS: 0)
07/11/2013 10:33:45 [0:19045]: reaped terminate command
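
To tell the two cases apart on other jobs, the traces can be searched for the messages that differ. The following is a sketch only: the function name kill_path and the trace file argument are hypothetical, and the matched strings are copied from the two traces in this ticket, so other Grid Engine versions may word them differently.

```shell
#!/bin/sh
# Classify a shepherd trace by how the queued KILL was handled.
# kill_path is a hypothetical helper; the patterns come from this ticket.
kill_path() {
    trace_file=$1
    if grep -q 'starting terminate_method command' "$trace_file"; then
        # The configured terminate_method was started.
        echo terminate_method
    elif grep -q 'notification for delayed' "$trace_file"; then
        # Only the -notify warning and the delayed default signal were seen.
        echo notify
    else
        echo unknown
    fi
}
```

Calling kill_path with the path to a saved trace prints terminate_method, notify, or unknown. The terminate_method check comes first so that a trace containing both messages would be reported as having run the terminate_method.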

Change History (0)
