[GE users] qdel fails with -notify option

s_kreidl sabine.kreidl at uibk.ac.at
Mon Jun 8 16:44:00 BST 2009


Dear all,

qdel behaves unexpectedly when a job was submitted with the -notify option.

I can reproduce this on two SGE versions, 6.2u1 and 6.1u4, by submitting the attached example script via
  qsub -q par.q -l s_rt=30 -notify -cwd -j yes -o sleep.out sleep.sh
and afterwards deleting it with qdel as superuser.

The job receives the expected SIGUSR2, but the final SIGKILL never arrives, even after waiting for the queue's notify delay (60 seconds, see the queue configuration below).

When I do a forced delete afterwards, the job is removed from the master, but the script happily keeps running on the node and there is no accounting file entry for the job.

When the submitting user runs qdel, it works properly most of the time (SIGUSR2 followed by SIGKILL), but not always.
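
To make the signal sequence visible in the job output, a script along the following lines can be used. This is only a minimal sketch and a hypothetical stand-in, not the attached sleep.sh: the function names, the log format, and the convention of passing the run time in seconds as the first argument are all assumptions made for illustration. It logs the SIGUSR2 and keeps running, so a missing SIGKILL shows up as output that continues past the notify delay.

```sh
#!/bin/bash
# Hypothetical stand-in for the attached sleep.sh (NOT the original script).
# It logs SIGUSR2 and keeps running, so a later SIGKILL (or its absence)
# is visible as the point where the log stops (or does not stop).

got_usr2=0

on_usr2() {
    got_usr2=1
    echo "$(date '+%H:%M:%S') caught SIGUSR2, waiting for SIGKILL"
}
trap on_usr2 USR2

# Sleep in one-second steps so the trap handler runs within about a second.
wait_loop() {
    seconds=$1
    i=0
    while [ "$i" -lt "$seconds" ]; do
        echo "$(date '+%H:%M:%S') still alive, pid $$"
        sleep 1
        i=$((i + 1))
    done
}

# Assumed interface: run time in seconds as the first argument. With no
# argument the script only installs the handler and exits immediately.
wait_loop "${1:-0}"
```

With the qsub line above, the run time would be appended after the script name, for example "sleep.sh 600". If the "still alive" lines in sleep.out continue for more than 60 seconds after the "caught SIGUSR2" line, the SIGKILL was not delivered.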

Thanks in advance for any advice,
Sabine 

P.S.: For completeness, here is the queue configuration:

# qconf -sq par.q
qname                 par.q
hostlist              @par_queue
seq_no                0
load_thresholds       np_load_avg=1.10
suspend_thresholds    NONE
nsuspend              1
suspend_interval      00:02:30
priority              0
min_cpu_interval      00:02:30
processors            UNDEFINED
qtype                 BATCH INTERACTIVE
ckpt_list             NONE
pe_list               openmp openmpi-1perhost openmpi-2perhost \
                      openmpi-4perhost openmpi-8perhost openmpi-fillup \
                      openmpi-roundrobin
rerun                 TRUE
slots                 8
tmpdir                /tmp
shell                 /bin/bash
prolog                NONE
epilog                NONE
shell_start_mode      posix_compliant
starter_method        /usr/sge/bin/lx24-amd64/start.sh
suspend_method        SIGTSTP
resume_method         NONE
terminate_method      NONE
notify                00:00:60
owner_list            NONE
user_lists            standard_users power_users
xuser_lists           gr_cb01
subordinate_list      NONE
complex_values        NONE
projects              NONE
xprojects             NONE
calendar              NONE
initial_state         default
s_rt                  240:00:00
h_rt                  336:00:00
s_cpu                 INFINITY
h_cpu                 INFINITY
s_fsize               INFINITY
h_fsize               INFINITY
s_data                INFINITY
h_data                INFINITY
s_stack               INFINITY
h_stack               INFINITY
s_core                INFINITY
h_core                INFINITY
s_rss                 INFINITY
h_rss                 INFINITY
s_vmem                INFINITY
h_vmem                INFINITY
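
Since only a few of these settings concern signal delivery, a small filter can pull them out of the qconf output. This is a sketch under assumptions: the function name signal_settings and the choice of which fields count as relevant are mine, not part of SGE.

```sh
# Hypothetical helper: print only the queue settings related to job
# signalling, as name=value pairs. Reads "qconf -sq" style text on stdin.
signal_settings() {
    awk '$1 == "notify" || $1 == "terminate_method" ||
         $1 == "suspend_method" || $1 == "resume_method" ||
         $1 == "starter_method" { print $1 "=" $2 }'
}

# On a live cluster this would be used as:
#   qconf -sq par.q | signal_settings
```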


    [ Part 2, "sleep.sh": Application/X-SHELLSCRIPT (Name: "sleep.sh"), 1 KB. Unable to print this part. ]


