Opened 15 years ago
Last modified 10 years ago
#319 new defect
IZ1960: h_cpu not working for Tight Integrated jobs
| Reported by: | reuti | Owned by: | |
|---|---|---|---|
| Priority: | normal | Milestone: | |
| Component: | sge | Version: | 6.0u7 |
| Severity: | | Keywords: | Linux qmaster |
| Cc: | | | |
Description
[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=1960]
Issue #: 1960
Platform: Other
Reporter: reuti (reuti)
Component: gridengine
OS: Linux
Subcomponent: qmaster
Version: 6.0u7
CC: None defined
Status: NEW
Priority: P3
Resolution:
Issue type: DEFECT
Target milestone: ---
Assigned to: ernst (ernst)
QA Contact: ernst
URL:
Summary: h_cpu not working for Tight Integrated jobs
Status whiteboard:
Attachments:
Issue 1960 blocks:
Votes for issue 1960:
Opened: Sat Jan 14 08:24:00 -0700 2006
------------------------

Submitting an endless loop in MPICH which generates load on the selected nodes:

$ qsub -l h_cpu=60 -pe mpich 2 test.sh
$ qstat -u reuti -g t
7358 0.54333 test.sh ... para@node39 MASTER
7358 0.54333 test.sh ... para@node41 SLAVE

On node41 hence:

20300 ?  S   0:00  \_ sge_shepherd-7358 -bg
20301 ?  Ss  0:00  |   \_ /usr/sge/utilbin/lx24-x86/rshd -l
20302 ?  S   0:00  |       \_ /usr/sge/utilbin/lx24-x86/qrsh_starter /var/spool/sge/node41/active_jobs/7358.1/1.node41
20303 ?  R   0:22  |           \_ /home/reuti/mpihello node39 35362 4amslave -p4yourname node41 -p4rmrank 1
20304 ?  S   0:00  |               \_ /home/reuti/mpihello node39 35362 4amslave -p4yourname node41 -p4rmrank 1

Fine, and after 60 seconds:

20300 ?  S   0:00  \_ sge_shepherd-7358 -bg
20301 ?  Ss  0:00  |   \_ /usr/sge/utilbin/lx24-x86/rshd -l
20302 ?  Z   0:00  |       \_ [qrsh_starter] <defunct>
20304 ?  S   0:00  /home/reuti/mpihello node39 35362 4amslave -p4yourname node41 -p4rmrank 1

Nothing more changes after that. Only process 20303 generated load, and SGE still keeps the job, because it never notices that the h_cpu limit (enforced per process via the kernel's setrlimit) was reached by that one process. On the head node of the parallel job, the job script has already exited and is no longer in the process tree. So the desired behavior could be to kill all slave tasks once the main script has finished.

In some way, this might be related to:
http://gridengine.sunsource.net/issues/show_bug.cgi?id=1681

------- Additional comments from reuti Sun Jan 15 04:37:16 -0700 2006 -------

On the head node of the parallel job, the PE stop script is executed as expected. This gives any parallel library a chance to shut down its daemons properly. After executing this stop script, SGE seems to wait forever. At this stage, all the leftover qrsh processes on the slave nodes should be shut down.

------- Additional comments from reuti Mon Jan 16 00:35:10 -0700 2006 -------

Maybe this can be extended to an enhancement: with a tightly integrated parallel job which uses daemons, like LAM/MPI, a qdel will stop the qrsh processes *before* the PE stop script is executed. If this could be delayed until *after* the PE stop script, there would be a chance to shut down the daemons properly and get rid of semaphores and shared memory segments. If SGE is going to kill any remaining qrsh processes after the PE stop script in any case, the kill before the PE stop script could become an option in the PE setup, e.g. "kill_before_stop_proc_args TRUE/FALSE", for sites that need the current behavior.
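For illustration, the per-process nature of the limit can be reproduced outside SGE. A minimal sketch in plain shell, assuming bash's "ulimit -t" sets RLIMIT_CPU in the usual way (which is roughly what h_cpu amounts to for each process started by the shepherd):

$ bash -c 'ulimit -t 5; sleep 300 & while :; do :; done'

The busy loop (the analogue of PID 20303 above) is terminated after roughly 5 CPU seconds, while the idle child (the analogue of PID 20304) has accumulated almost no CPU time, never reaches its own limit, and survives as an orphan. Only the one process hits the limit, so nothing on the SGE side triggers a cleanup of the rest.

For the enhancement side, a sketch of where the proposed switch would live in the PE definition. The other attribute names are the standard 6.0 tight-integration ones (control_slaves TRUE is what places the slave tasks under qrsh_starter as in the listings above); the paths and values are only illustrative, and kill_before_stop_proc_args is merely the option proposed in the comment above, it does not exist in any released version:

pe_name            mpich
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /usr/sge/mpi/startmpi.sh -catch_rsh $pe_hostfile
stop_proc_args     /usr/sge/mpi/stopmpi.sh
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  FALSE
kill_before_stop_proc_args  FALSE   (proposed only, not a real attribute)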