Opened 11 years ago

Last modified 9 years ago

#571 new enhancement

IZ2719: allow wildcards and ranges for more flexible PE definitions

Reported by:  pollinger        Owned by:
Priority:     low              Milestone:
Component:    sge              Version:    6.2
Severity:                      Keywords:   qmaster
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=2719]

        Issue #:      2719             Platform:     All           Reporter: pollinger (pollinger)
       Component:     gridengine          OS:        All
     Subcomponent:    qmaster          Version:      6.2              CC: reuti
        Status:       NEW              Priority:     P4
      Resolution:                     Issue type:    ENHANCEMENT
                                   Target milestone: ---
      Assigned to:    ernst (ernst)
      QA Contact:     ernst
          URL:
       * Summary:     allow wildcards and ranges for more flexible PE definitions
   Status whiteboard:
      Attachments:

     Issue 2719 blocks:
   Votes for issue 2719:


   Opened: Thu Sep 4 09:45:00 -0700 2008 
------------------------


From a mail discussion:

A more flexible range specification would also be really useful for parallel
environments. For some calculations we need a specialized domain decomposition:
we might, for example, decompose the domain into 12, 8, 6, or 4 subdomains. In
that case one can't simply specify "-pe foo 4-12" and hope for the best; finer
grained control is needed, e.g. "-pe foo 4,6,8,12".

Of course, this only makes sense when we can simultaneously specify masterq
resources too.
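
Until something like this exists, the closest approximation is probably a job
submission verifier. A minimal sketch, assuming the JSV facility (which only
arrived with the 6.2u2 update, not plain 6.2) and the hypothetical PE name "foo"
from above; it forces users to request one exact, allowed size instead of a range:

#!/bin/sh
# pe_sizes.jsv - reject "-pe foo" requests that are not exactly 4, 6, 8 or 12
# (sketch only; the JSV facility needs SGE 6.2u2 or later)

jsv_on_start()
{
   return
}

jsv_on_verify()
{
   if [ "`jsv_get_param pe_name`" = "foo" ]; then
      min=`jsv_get_param pe_min`
      max=`jsv_get_param pe_max`
      # a range like 4-12 arrives as differing pe_min/pe_max
      if [ "$min" != "$max" ]; then
         jsv_reject "PE foo: request one exact size (4, 6, 8 or 12), not a range"
         return
      fi
      case "$min" in
         4|6|8|12) ;;
         *) jsv_reject "PE foo: $min slots not allowed; use 4, 6, 8 or 12"
            return ;;
      esac
   fi
   jsv_accept "Job is accepted"
}

. ${SGE_ROOT}/util/resources/jsv/jsv_include.sh
jsv_main

It would be activated per job with "qsub -jsv ./pe_sizes.jsv ..." or cluster-wide
via the jsv_url parameter in the global configuration. Of course this only
rejects bad sizes; unlike a real list request, the scheduler cannot fall back
from 12 to 8 slots on its own.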
----------------------------
And some more flexibility with wildcard PEs and ranges.

A customer told me that he would like to be able to ensure that, when a range is
requested for a PE job which can go to different machine types, no more than 16
slots are granted on machine type A, while up to 32 slots may be granted on
machine type B. Likely the only clean solution would be to offer true logical
operator support for any type of resource request:

qsub "-pe sw1_machA 4-16 | -pe sw1_machB 4-32"

An easier-to-implement solution would probably be to support a range
specification in the PE configuration:

qconf -sp sw1_machA
[...]
slot_range 4-16

qconf -sp sw1_machB
[...]
slot_range 4-32

and then do a wildcard submission:

qsub -pe sw1_mach* 4-32
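
For what it's worth, the wildcard request itself already works in 6.2; only the
slot_range field would be new. A sketch of the existing mechanics, with
hypothetical queue names a.q and b.q on the two machine types:

# attach each machine-type PE to the queue(s) on that machine type
qconf -aattr queue pe_list sw1_machA a.q
qconf -aattr queue pe_list sw1_machB b.q

# quote the pattern so the shell does not expand it
qsub -pe "sw1_mach*" 4-32 job.sh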

Nevertheless that's now really a different story than the original question
about context variables.
---------------------------
Yep, we spoke about this already. Although I see the appeal of putting the ranges
per PE into the PE definition, you end up with more and more (nearly) identical
PEs, especially if you additionally want to limit the ranges per user (group)
(hence using user_lists) - which in this case could potentially give four PEs.
With one change to start_proc_args it might also be necessary to adjust several
PEs in the same way.

Somehow I still have the feeling that putting it into one resource quota is
easier to handle - one resource quota with 4 rules instead of 4 PEs. But I also
see a chance to combine the two: the parser for quota-style limits is already in
SGE, and so is the interpreter. What about the following (replacing slots,
user_lists and xuser_lists):

$ qconf -sp smp
pe_name           smp
limits            { \
                 limit to slots=150                          # former slots setting for this PE; \
                 limit jobs {*} users @students to slots=4   # maximum slots per job for students; \
                 limit jobs {*} users @staff to slots=32     # in this case obsolete, as 32 is already the maximum set below; \
                 limit jobs {*} hosts @group_a to slot_range=4-16; \
                 limit jobs {*} hosts @group_b to slot_range=4-32   # with or without the trailing ; on the last line \
                 }
start_proc_args   /bin/true
stop_proc_args    /bin/true
allocation_rule   $round_robin
control_slaves    FALSE
job_is_first_task TRUE
urgency_slots     min
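
For comparison, a stock 6.2 PE definition - i.e. the slots, user_lists and
xuser_lists fields that the limits block above would replace (field list from
memory, may differ slightly between update releases):

$ qconf -sp smp
pe_name            smp
slots              150
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $round_robin
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE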

One side effect would be that binding jobs to nodes connected to one and the
same switch in the cluster could be coded as:

                 limit jobs {*} hosts @group_a to slot_range=4-16; \
                 limit jobs {*} hosts @group_b to slot_range=4-16; \
                 limit jobs {*} hosts @group_c to slot_range=4-16; \

hence avoiding the former wildcard-PE solution. If you don't care where the jobs
run, leave it out or use:

                 limit jobs {*} hosts @group_a,@group_b,@group_c to slot_range=4-16; \
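
The @group_a/@group_b/@group_c names above are ordinary SGE host groups, which
already exist today; for example (host names hypothetical):

$ qconf -shgrp @group_a
group_name @group_a
hostlist   node01 node02 node03 node04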

In contrast to resource quotas, where only the first matching rule in a set
applies, all limits in this set must be met.

-- Reuti
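
For reference, the one-resource-quota-with-4-rules alternative mentioned above
can be written in today's actual RQS syntax (rule-set name and values
hypothetical). The caveat - and the reason for the proposed "jobs {*}" filter -
is that these limits cap a user's or host group's total slots in the PE, not
the slots of a single job:

$ qconf -srqs pe_slot_limits
{
   name         pe_slot_limits
   description  per-group slot caps for PE smp
   enabled      TRUE
   limit        users {@students} pes smp to slots=4
   limit        users {@staff} pes smp to slots=32
   limit        hosts @group_a pes smp to slots=16
   limit        hosts @group_b pes smp to slots=32
}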

   ------- Additional comments from reuti Mon Sep 8 03:31:06 -0700 2008 -------
Added myself to cc.

Change History (0)
