Opened 9 years ago

Last modified 14 months ago

#520 new defect

IZ2583: INT resources don't have integer behavior

Reported by: templedf
Owned by:
Priority: high
Milestone:
Component: sge
Version: 6.1u3
Severity: minor
Keywords: Sun scheduling
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=2583]

Issue #: 2583
Summary: INT resources don't have integer behavior
Component: gridengine
Subcomponent: scheduling
Version: 6.1u3
Platform: Sun
OS: All
Status: NEW
Priority: P2
Issue type: DEFECT
Target milestone: ---
Reporter: templedf
Assigned to: andreas
QA Contact: andreas
CC: None defined
Opened: Fri May 23 13:47:00 -0700 2008
------------------------


If I define a resource called ex and assign it to a queue with the value 1:

> qstat -f -q test1 -F ex
queuename                      qtype used/tot. load_avg arch          states
----------------------------------------------------------------------------
test1@ultra20                  BIP   0/4       0.12     sol-amd64
        qc:exclusive=1

I then submit a job against it that needs ex=0.4:

> qsub -l ex=0.4 $SGE_ROOT/examples/jobs/sleeper.sh 10
Your job 580 ("Sleeper") has been submitted
> qstat -f -q test1 -F ex
queuename                      qtype used/tot. load_avg arch          states
----------------------------------------------------------------------------
test1@ultra20                  BIP   1/4       0.13     sol-amd64
        qc:exclusive=0
    580 0.55500 Sleeper    dant         r     05/23/2008 13:44:11     1

qstat reports that the value of ex is now 0, which is exactly what I would
expect.  If, however, I submit a second job that needs 0.4 ex:

> qsub -l ex=0.4 $SGE_ROOT/examples/jobs/sleeper.sh 10
Your job 581 ("Sleeper") has been submitted
> qstat -f -q test1 -F ex
queuename                      qtype used/tot. load_avg arch          states
----------------------------------------------------------------------------
test1@ultra20                  BIP   2/4       0.13     sol-amd64
        qc:exclusive=0
    580 0.55500 Sleeper    dant         r     05/23/2008 13:44:11     1
    581 0.55500 Sleeper    dant         r     05/23/2008 13:44:15     1

both are scheduled, because the scheduler knows that I've really only used 0.4
ex, leaving 0.6 open for the second job.  That is floating point behavior, not
integer.  For integer resources, fractional usage must round up.  I would
suggest the caveat, though, that for parallel jobs, the usage from all the slave
tasks in a queue/host should be summed together before rounding up.
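
To make the requested behavior concrete, here is a minimal Python sketch of the
rounding rule suggested above. It is not Grid Engine code: int_debit and
try_schedule are hypothetical names, and requests are parsed as exact fractions
only so that binary float noise does not influence the rounding.

```python
import math
from fractions import Fraction


def int_debit(requests):
    """Debit for an INT consumable on one queue/host.

    `requests` holds the per-task requests of a single job on that
    queue/host. They are summed first and rounded up once, as proposed
    for parallel jobs in the report.
    """
    total = sum(Fraction(str(r)) for r in requests)
    return math.ceil(total)


def try_schedule(remaining, requests):
    """Return the new remaining capacity, or None if the job does not fit."""
    debit = int_debit(requests)
    if debit > remaining:
        return None  # job would stay pending
    return remaining - debit


if __name__ == "__main__":
    remaining = try_schedule(1, [0.4])   # first job: 0.4 rounds up to 1
    print(remaining)                     # 0
    print(try_schedule(remaining, [0.4]))  # None: second job must wait
```

Under this rule the second job in the transcript above would not start until
the first one finishes. Summing before rounding matters for parallel jobs:
three tasks of 0.4 each debit 2, whereas rounding each task separately would
debit 3.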

Change History (1)

comment:1 Changed 14 months ago by wish

  • Severity set to minor

The floating-point behavior doesn't seem to be consistent between the scheduler and the qmaster:
it appears the scheduler can issue instructions that the qmaster cannot fulfill because a fractional
part of a consumable is missing.

debiting 4294967296.000000 of memory on host node-u04a-005 for 16 slots would exceed remaining capacity of 68719476735.999992

(memory is a locally defined consumable of MEMORY type on our cluster).
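
The numbers in that message are consistent with a one-step floating-point
rounding difference: 16 x 4294967296 is exactly 68719476736, and the reported
remaining capacity appears to be the largest double below that value. The
Python sketch below shows how a strict comparison rejects such a request while
a tolerant one accepts it. Whether the qmaster compares this way is an
assumption, and fits_strict, fits_tolerant and the rel_tol value are
hypothetical, not taken from Grid Engine.

```python
import math


def fits_strict(per_slot, slots, remaining):
    """Exact comparison: any missing fraction of capacity causes a rejection."""
    return per_slot * slots <= remaining


def fits_tolerant(per_slot, slots, remaining, rel_tol=1e-9):
    """Accept requests that exceed capacity only by rounding noise."""
    needed = per_slot * slots
    return needed <= remaining or math.isclose(needed, remaining, rel_tol=rel_tol)


if __name__ == "__main__":
    per_slot = 4294967296.0           # value from the log message
    remaining = 68719476735.999992    # value from the log message
    print(fits_strict(per_slot, 16, remaining))
    print(fits_tolerant(per_slot, 16, remaining))
```

With the values from the log message, the strict check fails and the tolerant
check passes, while a request for 17 slots is still rejected by both.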
