Opened 16 years ago
Last modified 10 years ago
#273 new defect
IZ1790: shepherd does not wait for terminate_method to complete
| Reported by: | charpold | Owned by: | |
|---|---|---|---|
| Priority: | normal | Milestone: | |
| Component: | sge | Version: | 6.0u3 |
| Severity: | | Keywords: | Sun execution |
| Cc: | | | |
Description
[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=1790]
Issue #: 1790
Platform: Sun
Reporter: charpold (charpold)
Component: gridengine
OS: All
Subcomponent: execution
Version: 6.0u3
CC: None defined
Status: NEW
Priority: P3
Resolution:
Issue type: DEFECT
Target milestone: ---
Assigned to: pollinger (pollinger)
QA Contact: pollinger
URL:
* Summary: shepherd does not wait for terminate_method to complete
Status whiteboard:
Attachments:
Issue 1790 blocks:
Votes for issue 1790: 10

Opened: Mon Sep 12 09:14:00 -0700 2005
------------------------

Attempted to implement a user-specified signal on job termination:

We took the following script:

    iad2mgt02% more user_sig.sh
    #!/bin/sh
    PATH=/bin:/usr/bin:/sbin
    if [ "$SG_NOTIFY_SIGNAL" != "" ] ; then
      kill -$SG_NOTIFY_SIGNAL -$1
      if [ "$SG_NOTIFY_SLEEP_TIME" != "" ] ; then
        if [ $SG_NOTIFY_SLEEP_TIME -gt 0 ] ; then
          sleep $SG_NOTIFY_SLEEP_TIME
          kill -9 -$1
        else
          sleep 10
          kill -9 -$1
        fi
      else
        sleep 10
        kill -9 -$1
      fi
    else
      kill -9 -$1
    fi

Then, we did a qconf -mq all.q and set

    terminate_method=/home/sgeadmin/n1ge60/user_sig.sh $job_pid

Unfortunately, when we qdel a job, the job immediately gets reported by Grid Engine as completed, and our clean-up process could start at any time, with the result that the job output might be packed into a tarball and sent to the user well before $SG_NOTIFY_SLEEP_TIME has passed, and therefore before any signal-triggered activity has completed.

Is there a way to get Grid Engine to leave the process in "dr" state until the terminate_method script has completed? This should be handled in the same way as migration or checkpoint methods.

------- Additional comments from charpold Mon Sep 12 09:16:51 -0700 2005 -------
Fixed misspelling.
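
For readers who want to reproduce the report, the logic of user_sig.sh can be condensed as in the sketch below. This is an editorial illustration, not the reporter's script: the function names grace_period and terminate_group are hypothetical, and the check for non-numeric values of SG_NOTIFY_SLEEP_TIME is an addition the original does not have. SG_NOTIFY_SIGNAL and SG_NOTIFY_SLEEP_TIME are the site-defined variables from the ticket, and the 10-second fallback is taken from the original script. The sketch does not work around the defect: it is the shepherd, not the script, that would have to wait.

```shell
#!/bin/sh
# Sketch only: condensed form of the ticket's user_sig.sh.
# grace_period and terminate_group are hypothetical names.

# Print the grace period in seconds: SG_NOTIFY_SLEEP_TIME when it is a
# positive integer, otherwise the 10-second fallback used in the ticket.
grace_period() {
    case "${SG_NOTIFY_SLEEP_TIME:-}" in
        ''|*[!0-9]*)
            # Unset, empty, or not a plain non-negative integer.
            echo 10
            ;;
        *)
            if [ "$SG_NOTIFY_SLEEP_TIME" -gt 0 ]; then
                echo "$SG_NOTIFY_SLEEP_TIME"
            else
                echo 10
            fi
            ;;
    esac
}

# Send the user-specified signal to process group $1, wait out the grace
# period, then force-kill the group. With no signal configured, the group
# is killed immediately, as in the original script.
terminate_group() {
    pgid=$1
    if [ -n "${SG_NOTIFY_SIGNAL:-}" ]; then
        kill -"$SG_NOTIFY_SIGNAL" -"$pgid"
        sleep "$(grace_period)"
    fi
    kill -9 -"$pgid"
}
```

If this were saved as the terminate_method script, it would end with a call such as terminate_group "$1", with $job_pid passed as the first argument as in the queue configuration above.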