Opened 10 years ago

Last modified 9 years ago

#938 new defect

IZ712: MaxPendingJobsSLO reduces the quantity of a need too early

Reported by: rhierlmeier Owned by:
Priority: normal Milestone:
Component: hedeby Version: 1.0u5
Severity: Keywords: Sun gridengine_adapter
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=712]

        Issue #:      712                      Platform:     Sun         Reporter: rhierlmeier (rhierlmeier)
       Component:     hedeby                      OS:        All
     Subcomponent:    gridengine_adapter       Version:      1.0u5          CC:    None defined
        Status:       NEW                      Priority:     P3
      Resolution:                             Issue type:    DEFECT
                                           Target milestone: 1.0u5next
      Assigned to:    rhierlmeier (rhierlmeier)
      QA Contact:     rhierlmeier
          URL:
       * Summary:     MaxPendingJobsSLO reduces the quantity of a need too early
   Status whiteboard:
      Attachments:


     Issue 712 blocks:
   Votes for issue 712:


   Opened: Wed Dec 9 03:56:00 -0700 2009 
------------------------


   Description

   MaxPendingJobsSLO counts the number of free slots in the cluster and subtracts
   the number of pending slots to calculate the quantity of the needed resources.

   During the installation of the execd on a host there is a point at which the
   execd is already reported by qstat but the number of free slots is 0, because
   there are no queue instances on the host yet. In this case MaxPendingJobsSLO
   already assumes that the host has zero free slots. However, the installation is
   not yet finished, and in this case the SLO should use the averageSlotsPerHost
   attribute from the SLO configuration as the free slot count.
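
   The following is a minimal sketch of the intended calculation. HostInfo and all
   names below are hypothetical illustrations, not the actual MaxPendingJobsSLO
   implementation:

   import java.util.Collection;

   // Illustrative sketch only: HostInfo and the calculation below are
   // hypothetical stand-ins, not the actual MaxPendingJobsSLO code.
   final class NeedCalculationSketch {

       interface HostInfo {
           boolean hasQueueInstances(); // false while the execd install still runs
           int getFreeSlots();          // free slots as reported by qstat
       }

       static int calculateNeed(Collection<HostInfo> hosts, int pendingSlots,
                                int averageSlotsPerHost) {
           int freeSlots = 0;
           for (HostInfo host : hosts) {
               if (host.hasQueueInstances()) {
                   // installation finished, trust the qstat free slot count
                   freeSlots += host.getFreeSlots();
               } else {
                   // execd is visible in qstat but has no queue instances yet:
                   // the installation is still running, so count the configured
                   // averageSlotsPerHost instead of 0 free slots
                   freeSlots += averageSlotsPerHost;
               }
           }
           int missingSlots = Math.max(0, pendingSlots - freeSlots);
           // translate the missing slots into a number of resources to request
           return (missingSlots + averageSlotsPerHost - 1) / averageSlotsPerHost;
       }
   }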

   This leads to the strange situation that the need of the SLO is first reduced,
   then suddenly raised, and finally reduced again. The following history of a
   grid engine service illustrates the problem:

   The initial need is 9:

   12/09/2009 10:30:18.229 RESOURCE_REQUEST  sge  SLO: maxPendingJobs,
   n=9,urg=13,req='true'

   Three resources are moved to the ge service and the execd installation starts.
   The averageSlotsPerHost attribute is taken into account and the quantity of the
   need is reduced correctly:

   12/09/2009 10:30:19.873 RESOURCE_REQUEST  sge  SLO: maxPendingJobs,
   n=8,urg=13,req='true'
   12/09/2009 10:30:20.745 RESOURCE_REQUEST  sge  SLO: maxPendingJobs,
   n=7,urg=13,req='true'
   12/09/2009 10:30:21.496 RESOURCE_REQUEST  sge  SLO: maxPendingJobs,
   n=6,urg=13,req='true'

   All three execds are now visible in qstat. n should still be 6; however, it is
   9:

   12/09/2009 10:30:26.517 RESOURCE_REQUEST  sge  SLO: maxPendingJobs,
   n=9,urg=13,req='true'
   12/09/2009 10:30:31.537 RESOURCE_REQUEST  sge  SLO: maxPendingJobs,
   n=9,urg=13,req='true'
   12/09/2009 10:30:36.556 RESOURCE_REQUEST  sge  SLO: maxPendingJobs,
   n=9,urg=13,req='true'

   Now all queue instances on the execds are available and qstat reports enough
   free slots to satisfy the MPJSLO:

   12/09/2009 10:30:41.575 RESOURCE_REQUEST  sge  SLO: maxPendingJobs, No needs


   The correct behavior of the MPJSLO would have been:


   12/09/2009 10:30:18.229 RESOURCE_REQUEST  sge  SLO: maxPendingJobs,
   n=9,urg=13,req='true'
   12/09/2009 10:30:19.873 RESOURCE_REQUEST  sge  SLO: maxPendingJobs,
   n=8,urg=13,req='true'
   12/09/2009 10:30:20.745 RESOURCE_REQUEST  sge  SLO: maxPendingJobs,
   n=7,urg=13,req='true'
   12/09/2009 10:30:21.496 RESOURCE_REQUEST  sge  SLO: maxPendingJobs,
   n=6,urg=13,req='true'
   12/09/2009 10:30:41.575 RESOURCE_REQUEST  sge  SLO: maxPendingJobs, No needs

   The problem occurs if an SLO run is made while an execd is already visible but
   has no queue instances yet. Frequent SLO updates (sloUpdateInterval < execd
   install time) increase the chance of hitting the bug.

   Evaluation:

   The MPJSLO can request too many resources. If the resources come from a cloud
   service, this produces avoidable costs.

   Analysis:

   The MPJSLO should not consider the free slot count from qstat while a resource
   is in the ASSIGNING state. Once the ASSIGNED state is reached, it is guaranteed
   that the execd is correctly installed.
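
   A minimal sketch of this rule, assuming hypothetical names (only the resource
   state names match the ones used above, everything else is illustrative):

   // Illustrative sketch of the proposed rule, not the real hedeby code.
   final class FreeSlotRuleSketch {

       enum ResourceState { ASSIGNING, ASSIGNED }

       static int effectiveFreeSlots(ResourceState state, int qstatFreeSlots,
                                     int averageSlotsPerHost) {
           if (state == ResourceState.ASSIGNING) {
               // the execd install may still be running: the free slot count
               // from qstat is not trustworthy yet, use the configured average
               return averageSlotsPerHost;
           }
           // ASSIGNED: the execd is correctly installed, qstat can be trusted
           return qstatFreeSlots;
       }
   }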


   Workaround:

   Increase the sloUpdateInterval of the ge service.

   How to test

   Manual test:

   o Set up a ge service with a MPJSLO (max=1, averageSlotsPerHost=1) and a very
     small SLO update interval (<5s)
   o Disable all queue instances in the cluster
   o Set up a simhost cloud adapter with 10 resources
   o Submit a 10 slot job into the cluster
   o Wait until all jobs are processed
   o Check the output of the shist command

   Automatic test

   Extend the MPJSLO junit test. Construct the situation where an execd is
   reported by qstat but the free slot count is 0 while the resource is in the
   ASSIGNING state. Assert that the quantity of the produced need is calculated
   correctly.
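
   A sketch of the kind of assertions such a test could make; it exercises the
   illustrative FreeSlotRuleSketch from above rather than the real MPJSLO classes,
   which the actual junit test would use instead:

   import static org.junit.Assert.assertEquals;
   import org.junit.Test;

   public class MaxPendingJobsSloNeedTest {

       @Test
       public void assigningResourceMustNotReduceTheNeed() {
           int averageSlotsPerHost = 1;
           // execd is already reported by qstat, but there are no queue
           // instances yet, so qstat shows 0 free slots while ASSIGNING
           int slots = FreeSlotRuleSketch.effectiveFreeSlots(
                   FreeSlotRuleSketch.ResourceState.ASSIGNING,
                   /* qstatFreeSlots */ 0, averageSlotsPerHost);
           // the SLO must still credit the configured average slot count
           assertEquals(averageSlotsPerHost, slots);
       }

       @Test
       public void assignedResourceUsesTheQstatSlotCount() {
           // once ASSIGNED, the free slot count from qstat is authoritative
           int slots = FreeSlotRuleSketch.effectiveFreeSlots(
                   FreeSlotRuleSketch.ResourceState.ASSIGNED,
                   /* qstatFreeSlots */ 4, /* averageSlotsPerHost */ 1);
           assertEquals(4, slots);
       }
   }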


   ETC: 2PD
               ------- Additional comments from torsten Thu Dec 10 03:48:33 -0700 2009 -------
   The same behavior (MaxPendingJobsSLO requests more resources than necessary,
   i.e. behaves too eagerly) can also be observed in the following scenario:

   A GE adapter "sge" is configured with a MaxPendingJobsSLO (max=1) and a
   FixedUsageSLO, contains one resource and has no pending jobs. The resource does
   not move away from "sge" because of the FixedUsageSLO.

   Now set the sloUpdateInterval to a short value, e.g. 5 seconds, and wait until
   the jobSuspendPolicy timeout (the "idle" wait time of resources for the
   MaxPendingJobsSLO) has passed (normally 2 minutes is enough).

   Now submit one job that could be scheduled on the resource that is already at
   sge. Watch how the MaxPendingJobsSLO requests a new resource, even though a
   free resource is already available. This only happens if the GE adapter SLO
   update run takes place before the scheduler schedules the pending job, so it
   might take several tries to hit it (increasing the schedule_interval to a large
   value such as 5 minutes with qconf -msconf helps to hit it reliably).

   The problem in the implementation is that resources which are added to the
   timedOutResourceSet, and thus get a usage of 0, are not considered as
   candidates for job scheduling. All ASSIGNING and ASSIGNED resources should
   always be considered as resource candidates.
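
   A minimal sketch of that selection rule; the Resource type and the method below
   are hypothetical stand-ins for the hedeby internals mentioned in this comment:

   import java.util.ArrayList;
   import java.util.Collection;
   import java.util.List;
   import java.util.Set;

   // Illustrative sketch only, not the actual MaxPendingJobsSLO code.
   final class CandidateSelectionSketch {

       enum ResourceState { ASSIGNING, ASSIGNED, UNASSIGNING, ERROR }

       interface Resource {
           ResourceState getState();
       }

       static List<Resource> schedulingCandidates(Collection<Resource> resources,
                                                  Set<Resource> timedOutResourceSet) {
           List<Resource> candidates = new ArrayList<Resource>();
           for (Resource r : resources) {
               ResourceState s = r.getState();
               // membership in timedOutResourceSet (usage reset to 0) is
               // deliberately ignored: every ASSIGNING or ASSIGNED resource
               // remains a candidate for job scheduling
               if (s == ResourceState.ASSIGNING || s == ResourceState.ASSIGNED) {
                   candidates.add(r);
               }
           }
           return candidates;
       }
   }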

Change History (0)
