Opened 13 years ago
Last modified 5 years ago
#520 new defect
IZ2583: INT resources don't have integer behavior
Reported by: | templedf | Owned by: | |
---|---|---|---|
Priority: | high | Milestone: | |
Component: | sge | Version: | 6.1u3 |
Severity: | minor | Keywords: | Sun scheduling |
Cc: |
Description
[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=2583]
Issue #: 2583
Platform: Sun
Reporter: templedf (templedf)
Component: gridengine
OS: All
Subcomponent: scheduling
Version: 6.1u3
CC: None defined
Status: NEW
Priority: P2
Resolution:
Issue type: DEFECT
Target milestone: ---
Assigned to: andreas (andreas)
QA Contact: andreas
URL:
* Summary: INT resources don't have integer behavior
Status whiteboard:
Attachments:
Issue 2583 blocks:
Votes for issue 2583:
Opened: Fri May 23 13:47:00 -0700 2008
------------------------

If I define a resource called ex and assign it to a queue with the value 1:

    > qstat -f -q test1 -F ex
    queuename                      qtype used/tot. load_avg arch          states
    ----------------------------------------------------------------------------
    test1@ultra20                  BIP   0/4       0.12     sol-amd64
            qc:exclusive=1

I then submit a job against it that needs ex=0.4:

    > qsub -l ex=0.4 $SGE_ROOT/examples/jobs/sleeper.sh 10
    Your job 580 ("Sleeper") has been submitted
    > qstat -f -q test1 -F ex
    queuename                      qtype used/tot. load_avg arch          states
    ----------------------------------------------------------------------------
    test1@ultra20                  BIP   1/4       0.13     sol-amd64
            qc:exclusive=0
        580 0.55500 Sleeper    dant         r     05/23/2008 13:44:11     1
    > qsub -l ex=0.4 $SGE_ROOT/examples/jobs/sleeper.sh 10
    Your job 581 ("Sleeper") has been submitted

qstat reports that the value of ex is now 0, which is exactly what I would expect. If, however, I submit a second job that needs 0.4 ex:

    > qstat -f -q test1 -F ex
    queuename                      qtype used/tot. load_avg arch          states
    ----------------------------------------------------------------------------
    test1@ultra20                  BIP   2/4       0.13     sol-amd64
            qc:exclusive=0
        580 0.55500 Sleeper    dant         r     05/23/2008 13:44:11     1
        581 0.55500 Sleeper    dant         r     05/23/2008 13:44:15     1

both are scheduled, because the scheduler knows that I've really only used 0.4 ex, leaving 0.6 open for the second job. That is floating point behavior, not integer. For integer resources, fractional usage must round up.
I would suggest the caveat, though, that for parallel jobs, the usage from all the slave tasks in a queue/host should be summed together before rounding up.
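The difference between the observed accounting and the rounding rule suggested above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Grid Engine code: the function names `debit_float`, `debit_int`, and `parallel_usage` are hypothetical, and exact fractions are used so that binary rounding of 0.4 does not distract from the point.

```python
import math
from fractions import Fraction


def debit_float(capacity, requests):
    """Observed behavior: each request is debited as-is, fractions included.

    Returns (number of jobs that fit, remaining capacity).
    """
    remaining = Fraction(capacity)
    scheduled = 0
    for request in requests:
        need = Fraction(request)
        if need <= remaining:
            remaining -= need
            scheduled += 1
    return scheduled, remaining


def debit_int(capacity, requests):
    """Suggested behavior: each job's usage is rounded up to a whole unit."""
    remaining = int(capacity)
    scheduled = 0
    for request in requests:
        need = math.ceil(Fraction(request))
        if need <= remaining:
            remaining -= need
            scheduled += 1
    return scheduled, remaining


def parallel_usage(per_task, tasks_on_host):
    """Caveat for parallel jobs: sum the tasks on one queue/host, then round up."""
    return math.ceil(Fraction(per_task) * tasks_on_host)


# Two jobs requesting ex=0.4 against a queue with ex=1, as in the report.
print(debit_float(1, ["0.4", "0.4"]))  # both jobs fit, 1/5 left over
print(debit_int(1, ["0.4", "0.4"]))    # only the first job fits
print(parallel_usage("0.4", 3))        # 1.2 rounds up to 2, not 3 * ceil(0.4) = 3
```

With float-style accounting both jobs are scheduled, which matches the qstat output in the report; with per-job rounding the first job consumes the whole unit and the second has to wait.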
The float behavior doesn't seem to be consistent between the scheduler and the qmaster. It appears the scheduler can issue instructions that the qmaster cannot fulfill because a fractional part of a consumable is missing:

    debiting 4294967296.000000 of memory on host node-u04a-005 for 16 slots would exceed remaining capacity of 68719476735.999992

(memory is a locally defined consumable of MEMORY type on our cluster.)
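The numbers in that message are consistent with a one-step floating-point shortfall: 16 slots at 4294967296 (2^32) each need exactly 2^36, and the reported remaining capacity prints the same as the largest double strictly below 2^36. The Python sketch below only checks that arithmetic; it assumes the values are IEEE 754 doubles and says nothing about how Grid Engine arrived at the remaining capacity. `math.nextafter` requires Python 3.9 or later.

```python
import math

per_slot = 4294967296.0        # 2**32, the per-slot amount from the message
slots = 16
requested = per_slot * slots   # exactly 2**36 in double precision

# Largest double strictly below 2**36: one unit in the last place short.
just_below = math.nextafter(2.0 ** 36, 0.0)

print(f"{requested:.6f}")      # 68719476736.000000
print(f"{just_below:.6f}")     # 68719476735.999992
print(requested > just_below)  # True: the debit "exceeds" capacity by about 7.6e-06
```

The gap is 2^-17 (about 0.0000076), so a request for exactly 64 GiB fails a strict comparison against a capacity that has lost a single rounding step.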