Opened 8 years ago
#1468 new defect
Setting -notify on a job makes grid engine default termination run rather than configured terminate_method
Reported by: | wish | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | sge | Version: | 6.2u3 |
Severity: | minor | Keywords: | |
Cc: |
Description
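A job with -notify set doesn't invoke the configured terminate_method; Grid Engine falls back to its default termination (USR2 notification followed by a delayed KILL) instead:
07/11/2013 12:38:30 [0:2488]: forward_signal_to_job(): mapping signal 20 TSTP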
07/11/2013 12:38:30 [0:2488]: mapped signal TSTP to signal KILL
07/11/2013 12:38:30 [0:2488]: queued signal KILL
07/11/2013 12:38:30 [0:2488]: kill(-2553, USR2) -> notification for delayed (60 s) signal KILL
07/11/2013 12:38:30 [0:2488]: now sending signal USR2 to pid -2553
07/11/2013 12:38:30 [0:2488]: pdc_kill_addgrpid: 20381 12
07/11/2013 12:38:30 [0:2488]: killing pid 2553/6
07/11/2013 12:38:30 [0:2488]: killing pid 32513/6
07/11/2013 12:38:30 [0:2488]: killing pid 32528/6
07/11/2013 12:40:02 [0:2488]: wait3 returned -1
07/11/2013 12:40:02 [0:2488]: forward_signal_to_job(): mapping signal 20 TSTP
07/11/2013 12:40:02 [0:2488]: mapped signal TSTP to signal KILL
07/11/2013 12:40:02 [0:2488]: queued signal KILL
07/11/2013 12:40:02 [0:2488]: ignoring repeated KILL notification
By contrast, an otherwise identical job without -notify set converts the kill into an invocation of the terminate_method:
07/11/2013 10:33:40 [0:19045]: forward_signal_to_job(): mapping signal 20 TSTP
07/11/2013 10:33:40 [0:19045]: mapped signal TSTP to signal KILL
07/11/2013 10:33:40 [0:19045]: queued signal KILL
07/11/2013 10:33:40 [0:19045]: /cm/shared/apps/sge/assist/test/bin/terminate_method -> overriddes kill(-19198, KILL)
07/11/2013 10:33:40 [0:31606]: starting terminate_method command: /cm/shared/apps/sge/assist/test/bin/terminate_method
07/11/2013 10:33:40 [1001:31606]: start_as_command: pre_args_ptr[0] = argv0; "/cm/shared/apps/sge/assist/test/bin/terminate_method" shell_path = "/bin/ksh"
07/11/2013 10:33:40 [1001:31606]: execvp(/bin/ksh, "/cm/shared/apps/sge/assist/test/bin/terminate_method" "-c" "/cm/shared/apps/sge/assist/test/bin/terminate_method")
07/11/2013 10:33:40 [1001:31606]: not a GUI job, starting directly
07/11/2013 10:33:40 [0:19045]: wait3 returned 19198 (status: 15; WIFSIGNALED: 1, WIFEXITED: 0, WEXITSTATUS: 0)
07/11/2013 10:33:40 [0:19045]: job exited with exit status 0
07/11/2013 10:33:45 [0:19045]: wait3 returned 31606 (status: 0; WIFSIGNALED: 0, WIFEXITED: 1, WEXITSTATUS: 0)
07/11/2013 10:33:45 [0:19045]: reaped terminate command
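To tell quickly which of the two paths a given trace took, a small script can scan for the marker strings that appear in the traces above. This is only a rough sketch: the function name `classify_trace` and the return labels are made up for illustration, and the marker strings are taken from this ticket's 6.2u3 output, so they may differ in other versions.

```python
# Marker substrings copied from the two traces in this ticket (6.2u3);
# other Grid Engine versions may word these messages differently.
NOTIFY_MARKER = "notification for delayed"
TERMINATE_MARKER = "starting terminate_method command"


def classify_trace(lines):
    """Classify a trace by how the job was terminated.

    Returns one of the (hypothetical) labels:
      'terminate_method'    - the configured terminate_method was started
      'notify_default_kill' - only the -notify USR2/delayed-KILL path was seen
      'unknown'             - neither marker was found
    """
    saw_notify = False
    for line in lines:
        if TERMINATE_MARKER in line:
            return "terminate_method"
        if NOTIFY_MARKER in line:
            saw_notify = True
    return "notify_default_kill" if saw_notify else "unknown"


# Excerpts from the two traces in this ticket.
with_notify = [
    "07/11/2013 12:38:30 [0:2488]: queued signal KILL",
    "07/11/2013 12:38:30 [0:2488]: kill(-2553, USR2) -> notification for delayed (60 s) signal KILL",
    "07/11/2013 12:40:02 [0:2488]: ignoring repeated KILL notification",
]
without_notify = [
    "07/11/2013 10:33:40 [0:19045]: queued signal KILL",
    "07/11/2013 10:33:40 [0:31606]: starting terminate_method command: /cm/shared/apps/sge/assist/test/bin/terminate_method",
    "07/11/2013 10:33:45 [0:19045]: reaped terminate command",
]

print(classify_trace(with_notify))
print(classify_trace(without_notify))
```

Running this over the trace files of jobs submitted with and without -notify should make it easy to confirm whether a given installation shows the same behaviour.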