Opened 15 years ago

Last modified 9 years ago

#194 new defect

IZ1236: job start error can cause slot debitation inconsistency if qmaster runs out of spool disk space

Reported by: andreas
Owned by:
Priority: low
Milestone:
Component: sge
Version: 6.0
Severity:
Keywords: qmaster
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=1236]

   Issue #:            1236
   Reporter:           andreas (andreas)
   Component:          gridengine
   Subcomponent:       qmaster
   Platform:           All
   OS:                 All
   Version:            6.0
   CC:                 None defined
   Status:             NEW
   Priority:           P4
   Resolution:
   Issue type:         DEFECT
   Target milestone:   ---
   Assigned to:        andreas (andreas)
   QA Contact:         ernst
   URL:                http://gridengine.sunsource.net/servlets/BrowseList?listName=users&by=thread&from=2078
   Summary:            job start error can cause slot debitation inconsistency if qmaster runs out of spool disk space
   Status whiteboard:
   Attachments:

     Issue 1236 blocks:
   Votes for issue 1236:


   Opened: Wed Aug 18 02:59:00 -0700 2004 
------------------------


DESCRIPTION:
The referenced users-list mail thread indicates
that job start errors can cause inconsistencies
in slot debitation if qmaster runs out of spool
disk space.

WORKAROUND:
Once the inconsistency has occurred, only restarting
qmaster/scheduler will fix slot debitation.
To prevent the problem from occurring, one must
ensure qmaster always has sufficient space
available for spooling. The 'max_jobs' setting in
sge_conf(5) might help prevent qmaster from
running out of spool disk space as a result of
too many jobs being submitted to a cluster.
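
As a rough illustration of the "ensure sufficient spool space" part of the
workaround, the sketch below checks free space on the filesystem holding the
qmaster spool directory. It is not part of Grid Engine. The spool location
($SGE_ROOT/$SGE_CELL/spool/qmaster), the fallback values "/opt/sge" and
"default", and the 1 GiB threshold are all assumptions that need to be
adjusted to the local installation.

```python
import os
import shutil


def default_spool_dir():
    """Guess the qmaster spool directory (assumed layout, adjust as needed)."""
    sge_root = os.environ.get("SGE_ROOT", "/opt/sge")   # assumed fallback
    sge_cell = os.environ.get("SGE_CELL", "default")    # assumed fallback
    return os.path.join(sge_root, sge_cell, "spool", "qmaster")


def spool_space_ok(spool_dir, min_free_bytes):
    """Return True if the filesystem holding spool_dir has enough free space."""
    usage = shutil.disk_usage(spool_dir)
    return usage.free >= min_free_bytes


if __name__ == "__main__":
    spool_dir = default_spool_dir()
    min_free = 1024 ** 3  # hypothetical threshold: 1 GiB
    try:
        if spool_space_ok(spool_dir, min_free):
            print("spool space OK:", spool_dir)
        else:
            print("WARNING: low spool space on", spool_dir)
    except OSError as err:
        # Directory missing or unreadable; the assumed path is probably wrong.
        print("could not check", spool_dir, "-", err)
```

Run periodically (for example from cron), a check like this could give a
warning before qmaster actually runs out of spool space. It does not repair
slot debitation once the inconsistency has already occurred.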

   ------- Additional comments from andreas Thu Jun 16 08:36:23 -0700 2005 -------
Changing to qmaster.

   ------- Additional comments from ernst Mon Nov 28 00:31:41 -0700 2005 -------
changed summary

   ------- Additional comments from ernst Tue Dec 13 02:26:35 -0700 2005 -------
changed priority
