Opened 10 years ago

Last modified 9 years ago

#694 new defect

IZ3077: master task of larger parallel job might exceed h_vmem limit

Reported by: pollinger Owned by:
Priority: low Milestone:
Component: sge Version: 6.2
Severity: Keywords: execution
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=3077]

             Issue #: 3077
            Reporter: pollinger (pollinger)
           Component: gridengine
        Subcomponent: execution
            Platform: All
                  OS: All
             Version: 6.2
                  CC: None defined
              Status: NEW
            Priority: P4
          Resolution:
          Issue type: DEFECT
    Target milestone: ---
         Assigned to: pollinger (pollinger)
          QA Contact: pollinger
                 URL:
             Summary: master task of larger parallel job might exceed h_vmem limit
   Status whiteboard:
         Attachments:

     Issue 3077 blocks:
   Votes for issue 3077:


   Opened: Tue Jul 7 12:01:00 -0700 2009 
------------------------


The master task of a larger parallel job might exceed h_vmem merely by starting dozens or hundreds of qrsh clients. The job is then simply
killed, which is unexpected and annoying for the user.

One might argue that a job restricted to a certain amount of memory simply cannot start more than a specific number of slave tasks, but even
then SGE could handle this situation more gracefully.
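
To make the "specific number of slave tasks" argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a simplified model in which every qrsh client started by the master task adds a fixed amount of virtual memory to the usage counted against h_vmem. The function name and all figures are hypothetical illustrations, not measurements from this issue or anything provided by Grid Engine.

```python
MiB = 1024 * 1024


def max_qrsh_clients(h_vmem_bytes, master_base_bytes, per_client_bytes):
    """Estimate how many qrsh clients fit under h_vmem.

    Hypothetical model: the master task itself uses master_base_bytes,
    and each qrsh client adds per_client_bytes of accounted memory.
    """
    if per_client_bytes <= 0:
        raise ValueError("per_client_bytes must be positive")
    headroom = h_vmem_bytes - master_base_bytes
    if headroom < 0:
        # The master task alone already exceeds the limit.
        return 0
    return headroom // per_client_bytes


# Illustrative figures only: 2 GiB limit, 512 MiB master task,
# 16 MiB of accounted virtual memory per qrsh client.
limit = max_qrsh_clients(2048 * MiB, 512 * MiB, 16 * MiB)
print(limit)  # 96
```

Under these assumed figures, a job requesting a few hundred slots would hit the limit while still launching its slave tasks, which matches the behaviour described above.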

One could also argue that starting the slave tasks is part of Grid Engine, not part of the job, so it should not be counted against the
h_vmem limit of the job.

Change History (0)
