Opened 10 years ago

Last modified 9 years ago

#659 new enhancement

IZ2985: uniformly distributed hostlist for exclusive jobs

Reported by: lori Owned by:
Priority: normal Milestone:
Component: sge Version: 6.2u3
Severity: Keywords: scheduling
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=2985]

        Issue #:      2985             Platform:     All           Reporter: lori (lori)
       Component:     gridengine          OS:        All
     Subcomponent:    scheduling       Version:      6.2u3            CC:    None defined
        Status:       NEW              Priority:     P3
      Resolution:                     Issue type:    ENHANCEMENT
                                   Target milestone: ---
      Assigned to:    andreas (andreas)
      QA Contact:     andreas
          URL:
       * Summary:     uniformly distributed hostlist for exclusive jobs
   Status whiteboard:
      Attachments:

     Issue 2985 blocks:
   Votes for issue 2985:


   Opened: Wed Apr 8 06:35:00 -0700 2009 
------------------------


SGE 6.2u3 introduces a new feature for submitting a job with
exclusive access to a host (issue 2629). This is important for large
parallel jobs that span several compute nodes and need good
performance. However, jobs that request this new feature are not
automatically distributed uniformly across the nodes.

Example: An exclusive MPI job requests 9 slots on a cluster of
compute nodes with 4 slots each.
Allocation rule $fill_up will distribute the job as 4+4+1.
Allocation rule $round_robin may distribute the job as 1+1+1+1+1+1+1+1+1.
But the best solution would be 3+3+3 on equal nodes.
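
The three distributions in this example can be reproduced with a small sketch. It is not SGE scheduler code: the function names and the simplified model (identical nodes, enough free nodes, no other jobs) are assumptions made only for illustration.

```python
def fill_up(slots, node_capacity):
    """Fill each node completely before using the next one."""
    alloc = []
    remaining = slots
    while remaining > 0:
        take = min(node_capacity, remaining)
        alloc.append(take)
        remaining -= take
    return alloc


def round_robin(slots, node_capacity, num_nodes):
    """Hand out one slot per node in turn until all slots are placed."""
    if slots > node_capacity * num_nodes:
        raise ValueError("not enough slots in the cluster")
    alloc = [0] * num_nodes
    placed = 0
    i = 0
    while placed < slots:
        if alloc[i] < node_capacity:
            alloc[i] += 1
            placed += 1
        i = (i + 1) % num_nodes
    return [a for a in alloc if a > 0]


def even(slots, node_capacity):
    """Use as few nodes as possible with the same slot count on each."""
    for per_node in range(node_capacity, 0, -1):
        if slots % per_node == 0:
            return [per_node] * (slots // per_node)


print(fill_up(9, 4))         # 4+4+1
print(round_robin(9, 4, 9))  # 1 slot on each of 9 nodes
print(even(9, 4))            # 3+3+3
```

Note that round_robin depends on how many nodes are free: with only 3 free nodes it would also yield 3+3+3, which is why the result above is only what the rule "may" produce.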

Of course, a parallel environment with an allocation rule of 3 could
be defined. But on a cluster with different nodes (memory, number
of slots, ...) the user cannot decide at submit time which
allocation rule will be best without limiting the job to a part of
the whole cluster.

   ------- Additional comments from reuti Mon Dec 7 17:11:29 -0700 2009 -------
The syntax could be:

$even
$even:2 (give me a multiple of 2 per node, and the same on all machines)
$even:2,4,8,16 (give me only 2, 4, 8 or 16 per machine, and the same on all machines)

The case of exactly 2 on all machines is already covered by the fixed "allocation_rule 2".

   ------- Additional comments from reuti Mon Dec 7 17:14:31 -0700 2009 -------
This is related to http://gridengine.sunsource.net/issues/show_bug.cgi?id=2332

   ------- Additional comments from reuti Mon Dec 7 17:39:39 -0700 2009 -------
The order of the arguments could be honored:

$even:2,4,8,16

$even:16,8,4,2 (first try to collect 16 per node)
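
A minimal sketch of how the proposed syntax might be interpreted, including the ordering idea. The `$even` rule does not exist in SGE; the functions `parse_even_rule` and `slots_per_node`, and the assumption of identical nodes with `node_capacity` slots each, are hypothetical and only illustrate the proposal in these comments.

```python
def parse_even_rule(rule):
    """Split a proposed $even rule into (kind, values)."""
    if rule == "$even":
        return ("any", [])
    prefix = "$even:"
    if not rule.startswith(prefix):
        raise ValueError("not an $even rule: " + rule)
    values = [int(v) for v in rule[len(prefix):].split(",")]
    if any(v <= 0 for v in values):
        raise ValueError("slot counts must be positive")
    # A single argument means "any multiple of it"; several mean "only these".
    kind = "multiple" if len(values) == 1 else "list"
    return (kind, values)


def slots_per_node(rule, total_slots, node_capacity):
    """Return the per-node slot count the rule selects, or None."""
    kind, values = parse_even_rule(rule)
    if kind == "any":
        candidates = range(node_capacity, 0, -1)
    elif kind == "multiple":
        step = values[0]
        largest = node_capacity - node_capacity % step
        candidates = range(largest, 0, -step)
    else:
        candidates = values  # honor the order given by the user
    for per_node in candidates:
        if per_node <= node_capacity and total_slots % per_node == 0:
            return per_node
    return None


print(slots_per_node("$even", 9, 4))
print(slots_per_node("$even:2,4,8,16", 16, 16))
print(slots_per_node("$even:16,8,4,2", 16, 16))
```

In this sketch the ascending list settles on 2 slots per node for a 16-slot job, while the descending list first tries to collect 16 on one node. Preferring the largest value for plain `$even` and `$even:N` is a choice of this sketch, not something the comments specify.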
