Opened 10 years ago

Last modified 9 years ago

#787 new defect

IZ3249: PE mismatch between AR and the submitted job using it results in wrong allocation

Reported by: reuti
Owned by:
Priority: normal
Milestone:
Component: sge
Version: 6.2u5
Severity:
Keywords: Macintosh scheduling
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=3249]

        Issue #:      3249             Platform:     Macintosh   Reporter: reuti (reuti)
       Component:     gridengine          OS:        All
     Subcomponent:    scheduling       Version:      6.2u5          CC:    None defined
        Status:       NEW              Priority:     P3
      Resolution:                     Issue type:    DEFECT
                                   Target milestone: ---
      Assigned to:    andreas (andreas)
      QA Contact:     andreas
          URL:
       * Summary:     PE mismatch between AR and the submitted job using it results in wrong allocation
   Status whiteboard:
      Attachments:

     Issue 3249 blocks:
   Votes for issue 3249:


   Opened: Fri Mar 12 13:59:00 -0700 2010 
------------------------


$ qrsub -pe mpich 4 -d 3600
Your advance reservation 75 has been granted
$ qrstat -ar 75
...
granted_slots_list             all.q@pc15370=2,all.q@pc15381=2
granted_parallel_environment   mpich slots 4

Then submit a job with a different PE into this AR:

$ qsub -pe smp 2 -ar 75 test.sh
Your job 738 ("test.sh") has been submitted
$ qstat -g t
job-ID  prior   name       user         state submit/start at     queue                          master ja-task-ID
------------------------------------------------------------------------------------------------------------------
    738 0.75500 test.sh    reuti        r     03/12/2010 21:53:08 all.q@pc15370                  SLAVE
    738 0.75500 test.sh    reuti        r     03/12/2010 21:53:08 all.q@pc15381                  MASTER
                                                                  all.q@pc15381                  SLAVE

PE mpich has allocation_rule $round_robin, while PE smp has allocation_rule $pe_slots, which requires all slots of a job to be on a single host.
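
To make the $pe_slots constraint concrete: a two-slot smp request fits the reservation above only if a single entry of granted_slots_list offers at least two slots. A minimal sketch of that check follows, assuming the comma-separated `queue@host=slots` format shown in the qrstat output and at most one queue instance per host. The function name `fits_pe_slots` is hypothetical and not part of Grid Engine.

```shell
#!/bin/sh
# fits_pe_slots LIST N
# Hypothetical helper, not part of Grid Engine.
# Prints the first entry of LIST that offers at least N slots and
# returns 0; returns 1 if no single entry is large enough.
# LIST is assumed to look like: all.q@pc15370=2,all.q@pc15381=2
# Assumption: each host appears in at most one entry.
fits_pe_slots() {
    printf '%s\n' "$1" | tr ',' '\n' | awk -F= -v need="$2" '
        $2 + 0 >= need + 0 { print $1; found = 1; exit }
        END { exit !found }'
}
```

For the reservation in this report, `fits_pe_slots "all.q@pc15370=2,all.q@pc15381=2" 2` prints `all.q@pc15370`, so the smp request could have been placed on one host inside the AR.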

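This mismatch results in a wrong allocation: job 738 requested PE smp, yet it was given slots on both pc15370 and pc15381, which contradicts the $pe_slots rule. The job should either: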

- be rejected with something like: PE mismatch between AR and actual request

- or the PE from the qsub request should be allocated inside the granted slots of the AR
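
Until one of these behaviours exists in the scheduler, the first option can be approximated on the submit side. The sketch below reads `qrstat -ar <id>` output on stdin, extracts the PE name from the granted_parallel_environment line, and refuses a differing request. The function names `ar_pe_name` and `check_ar_pe` are hypothetical, and the parsing assumes the output layout shown above.

```shell
#!/bin/sh
# ar_pe_name: read `qrstat -ar <id>` output on stdin and print the PE
# name from the granted_parallel_environment line, e.g. "mpich".
# Hypothetical helper; assumes the line layout shown in this report.
ar_pe_name() {
    awk '$1 == "granted_parallel_environment" { print $2; exit }'
}

# check_ar_pe REQUESTED_PE: hypothetical pre-submission check.
# Reads qrstat output on stdin. Returns 1 and prints a message on
# stderr if the AR was granted a different PE than REQUESTED_PE.
# An AR without a granted PE line is not treated as a mismatch.
check_ar_pe() {
    granted=$(ar_pe_name)
    if [ -n "$granted" ] && [ "$granted" != "$1" ]; then
        echo "PE mismatch between AR and actual request: AR has '$granted', job requests '$1'" >&2
        return 1
    fi
    return 0
}
```

A possible use before submitting: `qrstat -ar 75 | check_ar_pe smp && qsub -pe smp 2 -ar 75 test.sh`. With the reservation from this report the check fails, so the job is never submitted.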

Change History (0)
