Opened 11 years ago

Last modified 9 years ago

#570 new enhancement

IZ2718: delete $TMPDIR of a parallel job after the job has finished, not after slave task has finished

Reported by: pollinger Owned by:
Priority: low Milestone:
Component: sge Version: 6.2
Severity: Keywords: execution
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=2718]

Issue #:              2718
Summary:              delete $TMPDIR of a parallel job after the job has finished, not after slave task has finished
Component:            gridengine
Subcomponent:         execution
Version:              6.2
Platform:             All
OS:                   All
Status:               NEW
Resolution:
Priority:             P4
Issue type:           ENHANCEMENT
Target milestone:     ---
Reporter:             pollinger
Assigned to:          pollinger
QA Contact:           pollinger
CC:                   reuti
URL:
Status whiteboard:
Attachments:
Issue 2718 blocks:
Votes for issue 2718:

   Opened: Thu Sep 4 07:36:00 -0700 2008 
------------------------


The $TMPDIR of a tightly integrated parallel job is removed on the slave nodes
as soon as the "qrsh -inherit" task has left the node. This is a problem for some
applications, as they need the scratch files they created for the next parallel step.
The workaround for now is to use a global $TMPDIR which is created in a queue prolog and
removed in an epilog. This works, but generates unnecessary network traffic.
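
A minimal sketch of that prolog/epilog workaround, under assumptions: GLOBAL_SCRATCH is a hypothetical path on a shared filesystem, the function names are made up for illustration, and the job itself has to derive the same per-job path, since a prolog cannot change the job's $TMPDIR. JOB_ID is taken from the environment SGE provides to prolog and epilog.

```shell
#!/bin/sh
# Sketch only, not part of SGE.
# GLOBAL_SCRATCH is a hypothetical shared filesystem location.
GLOBAL_SCRATCH=${GLOBAL_SCRATCH:-/global/scratch}

# One scratch directory per job, visible from every node.
job_scratch_dir() {
    printf '%s/%s\n' "$GLOBAL_SCRATCH" "$JOB_ID"
}

# To be called from the queue prolog.
prolog_create() {
    [ -n "$JOB_ID" ] || return 1
    mkdir -p "$(job_scratch_dir)"
}

# To be called from the queue epilog.
epilog_remove() {
    # Refuse to run without a JOB_ID, which would target the scratch root.
    [ -n "$JOB_ID" ] || return 1
    rm -rf "$(job_scratch_dir)"
}
```

Because every node reads and writes this directory over the network, the traffic concern above applies to this approach.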

So I was thinking about an option to get a "delay" in the deletion of
$TMPDIR on the nodes.

- "sge_make_tmpdir" is only called in "daemons/execd/exec_job.c". It could touch
a file "alive" in this created directory (and ignore if it fails because the
directory is already there).

- "sge_remove_tmpdir" is called in "daemons/execd/reaper_execd.c". This one
could fork into daemon-land and wait for around 5 minutes. If the file "alive"
is touched again within this time, just exit. If not, perform the final removal of $TMPDIR.
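
The idea in the two points above can be sketched in shell as follows. This is only an illustration of the logic, not SGE code: the real change would live in the C functions named above, and tmpdir_enter, tmpdir_leave_delayed and GRACE are hypothetical names. It assumes mktemp is available on the node.

```shell
#!/bin/sh
# Sketch of the proposed grace-period cleanup; not SGE code.
# GRACE is the wait in seconds (the proposal suggests around 5 minutes).
GRACE=${GRACE:-300}

# Task start: create the directory if needed and refresh the marker.
# An already existing directory is not treated as an error.
tmpdir_enter() {
    mkdir -p "$1" && touch "$1/alive"
}

# Task end: wait, then remove the directory only if no newer task
# refreshed the marker in the meantime.
tmpdir_leave_delayed() {
    dir=$1
    stamp=$(mktemp) || return 1      # reference timestamp for "now"
    sleep "$GRACE"
    if [ -n "$(find "$dir/alive" -newer "$stamp" 2>/dev/null)" ]; then
        rm -f "$stamp"               # marker was touched again: keep it
        return 0
    fi
    rm -f "$stamp"
    rm -rf "$dir"
}
```

To mimic the "fork into daemon-land" part, the second function would be started in the background, e.g. `tmpdir_leave_delayed "$TMPDIR" &`.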

Is this a feasible way, or complete nonsense?

The best option would of course be to delay the final deletion and initiate it from
the master when the main task ends. But it seems that there is no communication
when the main task ends, and the slaves might still be active anyway (I remember an
issue where slaves kept running although the master process had already finished
properly - the job hung there).

-- Reuti

   ------- Additional comments from pollinger Thu Sep 4 07:38:02 -0700 2008 -------
See also IZ 2358

   ------- Additional comments from reuti Mon Sep 8 03:30:09 -0700 2008 -------
As I had the problem with one application, I am circumventing it for now with the following setup
(instead of recompiling SGE):

a) in start_proc_args I create persistent directories on all nodes:

#
# Create persistent directories for MOLCAS
#

for HOST in `uniq $TMPDIR/machines`; do
    $SGE_ROOT/bin/$ARC/qrsh -inherit $HOST mkdir ${TMPDIR}_persistent
done

b) In stop_proc_args the "qrsh" might no longer be available (i.e. if a qdel was used). Therefore I submit
a bunch of cleaner jobs.

for HOST in `uniq $TMPDIR/machines`; do
    $SGE_ROOT/bin/$ARC/qsub -o /disk/global/users/$USER/err -e /disk/global/users/$USER/err \
        -l hostname=$HOST,virtual_free=0,cleaner \
        ~soft/scripts/cleaner.sh ${TMPDIR}_persistent
done

c) The cleaner.q is always allowed to run jobs (with a special defined complex) and the called script is:

#!/bin/sh
#
# Remove the persistent directory
#
rm -rf "$1"
exit 0

d) For the cleaner.q there is an epilog to remove the stdout/-err files:

#!/bin/sh
#
#
# Just remove the standard output- and error-file, if they are empty.
#
[ -r "$SGE_STDOUT_PATH" -a -f "$SGE_STDOUT_PATH" ] && [ ! -s "$SGE_STDOUT_PATH" ] && \
    rm -f "$SGE_STDOUT_PATH"
[ -r "$SGE_STDERR_PATH" -a -f "$SGE_STDERR_PATH" ] && [ ! -s "$SGE_STDERR_PATH" ] && \
    rm -f "$SGE_STDERR_PATH"
exit 0
===========================

But it would really be nice to specify it directly in SGE, whether the $TMPDIRs on the nodes should be
created/erased only before/after the complete job and not just for each qrsh.

Change History (0)
