Opened 15 years ago

Last modified 9 years ago

#226 new enhancement

IZ1459: Jobs with notify should ignore SIGUSR[12] by default

Reported by: uddeborg
Owned by:
Priority: normal
Milestone:
Component: sge
Version: 6.0u3
Severity:
Keywords: execution
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=1459]

Issue #: 1459
Platform: All
Reporter: uddeborg (uddeborg)
Component: gridengine
OS: All
Subcomponent: execution
Version: 6.0u3
CC: None defined
Status: NEW
Priority: P3
Resolution:
Issue type: ENHANCEMENT
Target milestone: ---
Assigned to: andreas (andreas)
QA Contact: pollinger
URL: http://gridengine.sunsource.net/servlets/ReadMsg?msgId=24351&listName=users
Summary: Jobs with notify should ignore SIGUSR[12] by default
Status whiteboard:
Attachments:
Issue 1459 blocks:
Votes for issue 1459:

   Opened: Thu Feb 10 02:54:00 -0700 2005 
------------------------


When a job is started with -notify, the semantics
of the SIGUSR1 signal is that the job will soon be
stopped.  The default action for this signal is to
terminate the process, but that isn't really
appropriate in the notify context.  A better
default action would be to ignore the signal, and
then let processes that know what to do handle it
appropriately.
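
For illustration, a job that does know what to do with the notification could catch SIGUSR1 roughly as follows. This is only a sketch: the names notify_pending, on_notify, install_notify_handler and notify_was_received are made up for this example and do not come from Grid Engine.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for sigaction() under strict -std= modes */
#endif
#include <signal.h>
#include <string.h>

/* Set by the handler; the job's main loop polls it (hypothetical name). */
static volatile sig_atomic_t notify_pending = 0;

/* Handler: only records that the notification arrived. */
static void on_notify(int signo)
{
    (void)signo;
    notify_pending = 1;
}

/* Install the SIGUSR1 handler. Returns 0 on success, -1 on error. */
int install_notify_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_notify;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;     /* restart interrupted system calls */
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Non-zero once SIGUSR1 has been delivered to this process. */
int notify_was_received(void)
{
    return notify_pending != 0;
}
```

A job would call install_notify_handler() early during startup and check notify_was_received() at convenient points, for example to write a checkpoint before it gets stopped.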

I suggest the shepherd should change the action of
SIGUSR1 to SIG_IGN before exec()ing the job.

The semantics of the SIGUSR2 signal is that the
job will be killed.  The default action, to die,
thus seems more reasonable in this case.  But as
discussed in the referenced thread on the
grid-users mailing list, there are some ways to
invoke jobs where you don't have access to the top
level process in the job hierarchy.  Processes
lower down in the hierarchy will then be killed
before they get a chance to do any cleanup, even
if they catch SIGUSR2.  There are also some time
windows during startup of a job, before a handler
has been installed, in which the signal would
still kill the process.

Therefore, I suggest that the shepherd change the
action of SIGUSR2 to SIG_IGN as well.

I would think a logical place to do this would be
in the son() function in builtin_starter.c, near
the calls of sge_set_def_sig_mask and
sge_unblock_all_signals.
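
A minimal sketch of what the suggested change could look like, using plain POSIX calls. The helper name ignore_notify_signals is hypothetical and not part of the existing shepherd code; where exactly it would be called in son() is left open. It relies on the fact that a disposition of SIG_IGN is inherited across exec(), whereas installed handlers are reset to the default.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for sigaction() under strict -std= modes */
#endif
#include <signal.h>
#include <string.h>

/*
 * Hypothetical helper: set the disposition of SIGUSR1 and SIGUSR2 to
 * SIG_IGN.  If called before exec(), the job starts out ignoring both
 * signals and can still install its own handlers later.
 * Returns 0 on success, -1 if sigaction() fails.
 */
int ignore_notify_signals(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGUSR2, &sa, NULL) != 0)
        return -1;
    return 0;
}
```

Presumably this would only be done for jobs submitted with -notify, so that other jobs keep the default dispositions.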

   ------- Additional comments from sgrell Tue Dec 6 08:16:53 -0700 2005 -------
Changed the Subcomponent.

Stephan
