Opened 12 years ago

Last modified 9 years ago

#1061 new defect

IZ120: test queue settings not appropriate for running tight_integration_massive() on small clusters

Reported by: guenter_herbert Owned by:
Priority: normal Milestone:
Component: testsuite Version: current
Severity: Keywords: Sun tests


[Imported from gridengine issuezilla]

        Issue #:      120             Platform:     Sun           Reporter: guenter_herbert (guenter_herbert)
       Component:     testsuite          OS:        All
     Subcomponent:    tests           Version:      current          CC:    None defined
        Status:       NEW             Priority:     P3
      Resolution:                    Issue type:    DEFECT
                                  Target milestone: milestone 1
      Assigned to:    guenter_herbert (guenter_herbert)
      QA Contact:     joga
       * Summary:     test queue settings not appropriate for running tight_integration_massive() on small clusters
   Status whiteboard:

     Issue 120 blocks:

   Opened: Fri Feb 16 06:37:00 -0700 2007 

The test strategy of tight_integration_massive() is to ramp up the load (i.e. the
number of tasks) to a hard-coded limit (currently 200) in increments of 10,
starting with 19 slaves and one master job. The test criterion is the successful
creation and completion of the requested number of tasks.

This test fails on a regular basis for the RPE testsuite in Frankfurt. The root
cause is a false assumption that a test queue (tight.q) with 10 slots will
suffice to run this test. With 10 slots per execd, the current implementation
requires 20 execd's to reach the 200 tasks. This is probably OK with the RGB
test site, but the Frankfurt grid currently comprises just 2 execd's, which
yields a total of 20 slots. That is not enough for a successful execution of
this test.
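
The capacity arithmetic above can be checked with a short sketch. It is
illustrative only and not part of the testsuite; the function names are
hypothetical, and the figures (10 slots per execd, 200 tasks, 2 execd's) are
the ones quoted in this report.

```python
import math

SLOTS_PER_EXECD = 10   # tight.q slots per execd, as quoted in this report
MAX_TASKS = 200        # hard-coded upper limit of the ramp-up


def execds_needed(max_tasks: int, slots_per_execd: int) -> int:
    """Smallest number of execd's whose combined slots cover max_tasks."""
    return math.ceil(max_tasks / slots_per_execd)


def total_slots(num_execds: int, slots_per_execd: int) -> int:
    """Total slots available across all execd's."""
    return num_execds * slots_per_execd


def can_run(num_execds: int, slots_per_execd: int, max_tasks: int) -> bool:
    """True if the cluster offers at least max_tasks slots."""
    return total_slots(num_execds, slots_per_execd) >= max_tasks


if __name__ == "__main__":
    print("execd's needed:", execds_needed(MAX_TASKS, SLOTS_PER_EXECD))
    print("Frankfurt slots:", total_slots(2, SLOTS_PER_EXECD))
    print("Frankfurt can run test:", can_run(2, SLOTS_PER_EXECD, MAX_TASKS))
```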

On the other hand, it does not make much sense to tie a functional test to a
particular collection of hardware resources. Yes, a two-node cluster will not
(and does not!) perform very fast, but it does the job. And this is what
functional testing is about.

Hence I've modified the failing test in such a way that the number of slots is
given by the ratio of TIGHT_INTEGRATION_MASSIVE_SLOTS to the number of
execd's. In this way the test is now decoupled from any hardware resources and
can run on a cluster of any size.
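
A minimal sketch of that ratio, for illustration only. The actual change lives
in the testsuite and is not shown in this report; the function name is
hypothetical, and rounding up when the division is not exact is an assumption
made here so that the total never falls below the requested slot count.

```python
import math

# Value taken from this report (the ramp-up limit); the constant name
# mirrors the one mentioned above.
TIGHT_INTEGRATION_MASSIVE_SLOTS = 200


def slots_per_execd(total_slots: int, num_execds: int) -> int:
    """Slots to configure per execd so that all execd's together
    provide at least total_slots.

    Rounding up is an assumption of this sketch, not something the
    report specifies.
    """
    if num_execds < 1:
        raise ValueError("at least one execd is required")
    return math.ceil(total_slots / num_execds)


if __name__ == "__main__":
    for n in (2, 3, 20):
        per_host = slots_per_execd(TIGHT_INTEGRATION_MASSIVE_SLOTS, n)
        print(f"{n} execd's: {per_host} slots each, {per_host * n} in total")
```

For the two-node Frankfurt grid this gives 100 slots per execd, which is why
the load on each target machine rises.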

As a side effect of the increased load on the target machines, the queue needs
to be defined without any load_threshold attributes. No adverse effects have
been observed with the modified test.
