Opened 4 years ago
#1607 new defect
Do not ignore SIGCHLD
Reported by: | opoplawski | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | sge | Version: | 8.1.9 |
Severity: | minor | Keywords: | |
Cc: | | | |
Description
While testing out the credential handling by sge_qmaster, I found this:
04/06/2017 11:28:37|worker|vulcan7|E|could not store credentials for job 15 - command "/usr/share/gridengine/utilbin/lx-amd64/put_cred" failed with return code 10
This is because sge_qmaster ignores SIGCHLD and sets SA_NOCLDWAIT, so waitpid() fails with errno 10 (ECHILD): the child has already exited and we said we didn't care. The "return code 10" in the log is that errno value, not an exit status from put_cred.
This appears to date back quite a ways:
    commit fd6c976608cbde90d95cfb6a04eaee793a60ce68
    Author: adoerr <adoerr>
    Date:   Wed Nov 3 10:53:39 2004 +0000

        *** empty log message ***

    diff --git a/Changelog b/Changelog
    index 482d358..57c8ee0 100644
    --- a/Changelog
    +++ b/Changelog
    @@ -1,3 +1,9 @@
    +AD-2004-11-03-0: Bugfix: '-m a' qsub option did leave a zombie process
    + Review: EB
    + Changed: qmaster
    + Issue: 1277
    + Bugtraq: 5104789
    +
but this completely breaks sge_peopen()/sge_peclose() functionality, since the exit status of the child can no longer be collected. Perhaps some mailing code will need to add the necessary waitpid() call.
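One common way to run a fire-and-forget helper without leaving a zombie and without ignoring SIGCHLD is a double fork with an explicit waitpid(). The sketch below is an assumption about what such a fix could look like, not the content of the attached patch or of the SGE mailing code; `spawn_detached()` is a hypothetical name.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not SGE code): start a command whose exit status
 * nobody needs, without leaving a zombie and without ignoring SIGCHLD.
 * Returns 0 if the command was handed off, -1 on failure.
 */
static int spawn_detached(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Intermediate child: fork the real worker, then exit at once. */
        pid_t worker = fork();
        if (worker < 0)
            _exit(127);
        if (worker == 0) {
            execv(path, argv);
            _exit(127); /* only reached if execv() failed */
        }
        _exit(0); /* the worker is now reparented and reaped elsewhere */
    }

    /* Parent: reap the short-lived intermediate child explicitly. */
    int status;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);
    if (r < 0)
        return -1;

    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Because SIGCHLD keeps its default disposition, other callers that fork and then wait for a specific child, as the ticket describes for sge_peopen()/sge_peclose(), can still collect a real exit status.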
Attachments (1)
Patch to not ignore SIGCHLD