Opened 13 years ago

Last modified 8 years ago

#346 new defect

IZ2037: Checkpointing: No checkpoint created for suspend

Reported by: reuti
Owned by:
Priority: normal
Milestone:
Component: sge
Version: 6.0u7
Severity:
Keywords: man
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=2037]

Issue #: 2037
Platform: Other
Reporter: reuti (reuti)
Component: gridengine
OS: All
Subcomponent: man
Version: 6.0u7
CC: None defined
Status: NEW
Priority: P3
Resolution:
Issue type: DEFECT
Target milestone: ---
Assigned to: andreas (andreas)
QA Contact: andreas
URL:
Summary: Checkpointing: No checkpoint created for suspend
Status whiteboard:
Attachments:

Issue 2037 blocks:
Votes for issue 2037:


   Opened: Tue Apr 25 04:03:00 -0700 2006 
------------------------


With the transparent and userdefined checkpointing interfaces, no checkpoint seems to be created
when you suspend the job or queue (with "when" set to include x, along with other events, of course).
The documentation says it should happen (man checkpoint). Also, for the application-level interface,
the defined checkpointing script isn't executed before the migration script.
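
For reference, a checkpointing environment of the kind described might look like the sketch below, in the format printed by qconf -sckpt. The field names are those documented in checkpoint(5); the environment name, script paths, and checkpoint directory are hypothetical and only illustrate the setup, not the exact configuration used when the problem was observed.

```text
# Hypothetical checkpointing environment; names and paths are assumptions.
ckpt_name          test_ckpt
interface          userdefined
ckpt_command       /path/to/ckpt_script
migr_command       /path/to/migr_script
restart_command    none
clean_command      none
ckpt_dir           /tmp
signal             none
# "x" requests a checkpoint when the job is suspended; "s" covers execd shutdown.
when               sx
```

With "when" containing x, the man page suggests that suspending the job or its queue should trigger the checkpoint step, which is the behaviour not seen here.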

   ------- Additional comments from andreas Tue Apr 25 04:47:48 -0700 2006 -------
From what I know the ckpt_command is not and shall not be started for
application checkpointing. Instead migr_command is always used.

Thus I believe this is merely an issue with the documentation.

   ------- Additional comments from reuti Tue Apr 25 05:11:51 -0700 2006 -------
For transparent and user-level checkpointing, it could still be useful as an option, so that a checkpoint
is taken as late as possible instead of losing some computing time:

ckpt_before_kill   [yes|no]
gracetime_before_kill [<int>]

As the job is just killed, this can't be put in any script. But for application-level checkpointing you are right.

