Ticket #1483: 0001-Prevent-cgroup-cpuset-code-from-killing-shepherd-at-.patch

File 0001-Prevent-cgroup-cpuset-code-from-killing-shepherd-at-.patch, 3.3 KB (added by markdixon, 6 years ago)

Added by email2trac

  • source/libs/uti2/sge_cgroup.c

    From fb492a63fad32fbb210e07f4bb71652bfe001895 Mon Sep 17 00:00:00 2001
    From: Mark Dixon <m.c.dixon@leeds.ac.uk>
    Date: Fri, 11 Oct 2013 19:12:57 +0100
    Subject: [PATCH] Prevent cgroup/cpuset code from killing shepherd at job end
    
    When the execd_params option USE_CGROUPS is enabled, the cgroup/cpuset
    cleanup code checks for and kills processes related to the job. This
    includes the shepherd, which triggers the job cleanup signal handler.
    Because the execd also kills the shepherd elsewhere, the job cleanup
    code can run twice as often as it should. This patch makes the
    cgroup/cpuset cleanup skip the shepherd.
    
    This has been seen to be a problem when the node running the job master
    uses qrsh to connect back to itself. In that case, the most obvious
    symptoms are:
    
    * Messages of the following form in the execd logs:
    
      10/14/2013 12:15:23|  main|h3s0b1|W|rogue process(es) found for task 1353.1
      10/14/2013 12:15:23|  main|h3s0b1|E|shepherd of job 1353.1 died through signal = 9
      10/14/2013 12:15:23|  main|h3s0b1|E|abnormal termination of shepherd for job 1353.1: "exit_status" file is empty
      10/14/2013 12:15:23|  main|h3s0b1|E|can't open usage file "active_jobs/1353.1/usage" for job 1353.1: No such file or directory
      10/14/2013 12:15:23|  main|h3s0b1|E|shepherd exited with exit status 19: before writing exit_status
    
    * A job failure email sent to adminmail
    
    * The job start_time / end_time entries in the accounting file are 0
    (interpreted as -/- in qacct)
    ---
     source/libs/uti2/sge_cgroup.c |   23 ++++++++++++++++-------
     1 files changed, 16 insertions(+), 7 deletions(-)
    
    diff --git a/source/libs/uti2/sge_cgroup.c b/source/libs/uti2/sge_cgroup.c
    index 66e0d5c..8d9a430 100644
    --- a/source/libs/uti2/sge_cgroup.c
    +++ b/source/libs/uti2/sge_cgroup.c
    @@ -456,20 +456,29 @@ remove_shepherd_cpuset(u_long32 job, u_long32 task, pid_t pid)
           char buf[MAX_STRING_SIZE], cfile[SGE_PATH_MAX], *cmd;
           size_t l = sizeof buf;
     
    -      if (!rogue)
    -         WARNING((SGE_EVENT, "rogue process(es) found for task "
    -                  sge_u32"."sge_u32, job, task));
    -      rogue = true;
    +      /* Terminate string and extract process name */
           replace_char(spid, strlen(spid), '\n', '\0');
           snprintf(cfile, sizeof cfile, "/proc/%s/cmdline", spid);
           errno = 0;
           cmd = dev_file2string(cfile, buf, &l);
    -      if (l) INFO((SGE_EVENT, "rogue: "SFN2, replace_char(cmd, l, '\0', ' ')));
    +
           /* Move the task away to avoid waiting for it to die.  */
           /* Fixme:  Keep the cpusetdir tasks open and just write to that.  */
           reparent_proc(spid, cgroup_dir(cg_cpuset));
    -      pid = atoi(spid);
    -      if (pid) kill(pid, SIGKILL);
    +      pid_t rpid = atoi(spid);
    +
    +      /* Kill rogue process (unless it's the shepherd)        */
    +      /* Shepherd needs to be killed exactly once, otherwise  */
    +      /* sge_reap_children_execd is called multiple times     */
    +      if (rpid && rpid != pid) {
    +          if (!rogue)
    +             WARNING((SGE_EVENT, "rogue process(es) found for task "
    +                  sge_u32"."sge_u32, job, task));
    +          rogue = true;
    +          if (l) INFO((SGE_EVENT, "rogue: "SFN2, replace_char(cmd, l, '\0', ' ')));
    +
    +          kill(rpid, SIGKILL);
    +      }
        }
        fclose(fp);
        errno = 0;