Opened 15 years ago
Last modified 10 years ago
#322 new defect
IZ1964: qdel fails to delete processes that spawn a new process group with interactive jobs
Reported by: | andreas | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | sge | Version: | 5.3 |
Severity: | | Keywords: | execution |
Cc: | | | |
Description
[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=1964]
Issue #: 1964
Platform: All
Reporter: andreas (andreas)
Component: gridengine
OS: All
Subcomponent: execution
Version: 5.3
CC: None defined
Status: NEW
Priority: P3
Resolution:
Issue type: DEFECT
Target milestone: ---
Assigned to: pollinger (pollinger)
QA Contact: pollinger
URL:
* Summary: qdel fails to delete processes that spawn a new process group with interactive jobs
Status whiteboard:
Attachments:
Issue 1964 blocks:
Votes for issue 1964:
Opened: Thu Jan 19 03:59:00 -0700 2006

------------------------

DESCRIPTION:
With interactive jobs/tasks started via qrsh/qsh/qlogin, Grid Engine lacks a means to terminate processes that have spawned a new process group. A qdel seemingly finishes the job, but some of the processes remain running:

(1) Use qsh to start an xterm(1) under control of Grid Engine
(2) Run the Grid Engine 'work' example job binary
    # work -change_pgrp -t 3600
(3) Use qdel to get rid of the job
---> the work process remains running and continues to utilize CPU

SUGGESTED FIX:
With job process tracking based on the additional group id, fixing the problem would require the rshd/telnetd/rlogind/xterm/sshd binaries to be patched specially for N1GE to ensure the additional group id is set properly. A patched rshd is already part of the N1GE distribution, but it is not realistic to do the same with telnetd/rlogind/xterm/sshd. Thus the ideal solution appears to be changing the process tracking mechanism so that the process tree structure of a job is used rather than the additional group id.

WORKAROUND:
The workaround is one of
(1) prevent the job from spawning a new process group
(2) use a pstree-based job termination method

------- Additional comments from andreas Thu Jan 19 04:01:54 -0700 2006 -------
There was a related issue 1519. Although a fix for it was delivered, a pstree-based terminate method is still required in case a new pgrp is spawned.
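
A pstree-based termination method like the one named in the workaround could look roughly like the sketch below. It is not Grid Engine code. The function names (`descendants`, `read_parent_table`, `kill_tree`) are hypothetical, and reading parent pids from `/proc/<pid>/stat` assumes a Linux host. The idea is to walk the parent/child links from the job's top-level process, so a child that changed its process group is still found as long as its parent chain is intact.

```python
import os
import signal
from collections import defaultdict, deque


def descendants(root_pid, parent_of):
    """Return all descendants of root_pid, breadth-first.

    parent_of maps pid -> parent pid. Membership is decided purely by
    parent/child links, so process group ids play no role.
    """
    children = defaultdict(list)
    for pid, ppid in parent_of.items():
        children[ppid].append(pid)

    found, seen, queue = [], {root_pid}, deque([root_pid])
    while queue:
        for child in sorted(children.get(queue.popleft(), [])):
            if child not in seen:  # guard against malformed tables
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found


def read_parent_table(proc="/proc"):
    """Build a pid -> ppid table from /proc (assumption: Linux procfs)."""
    table = {}
    for name in os.listdir(proc):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc, name, "stat")) as fh:
                stat = fh.read()
        except OSError:
            continue  # process exited while we were scanning
        # The command name may contain spaces or ")", so split after the
        # last ")". The remaining fields start with: state, ppid, ...
        fields = stat.rsplit(")", 1)[1].split()
        table[int(name)] = int(fields[1])
    return table


def kill_tree(root_pid, sig=signal.SIGKILL):
    """Signal root_pid and every process currently below it in the tree."""
    for pid in [root_pid] + descendants(root_pid, read_parent_table()):
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass  # already gone
```

With a table such as `{100: 1, 101: 100, 102: 101, 103: 100, 200: 1}`, `descendants(100, table)` returns `[101, 103, 102]`. The snapshot is inherently racy, and a process whose parent has already exited is reparented and drops out of the tree, so a real terminate method would probably need to rescan until no descendants remain.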