Opened 16 years ago

Closed 9 years ago

#105 closed enhancement (invalid)

IZ615: Admin doc has to reflect changes of the complex matching

Reported by: sgrell Owned by:
Priority: normal Milestone:
Component: sge Version: current
Severity: minor Keywords: Sun SunOS doc
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=615]

        Issue #:      615              Platform:     Sun           Reporter: sgrell (sgrell)
       Component:     gridengine          OS:        SunOS
     Subcomponent:    doc              Version:      current          CC:    None defined
        Status:       NEW              Priority:     P3
      Resolution:                     Issue type:    ENHANCEMENT
                                   Target milestone: ---
      Assigned to:    markob (markob)
      QA Contact:     skonta
          URL:
       * Summary:     Admin doc has to reflect changes of the complex matching
   Status whiteboard:
      Attachments:

     Issue 615 blocks:
   Votes for issue 615:


   Opened: Wed Nov 12 00:46:00 -0700 2003 
------------------------


The matching of requested complexes from jobs with
the provided ones has been changed and a new
complex type has been introduced.

Short doc: (also in:
doc/devel/rfe/resource_attributes.txt)

1. Attributes:
--------------

   - We have the possibility to specify attributes on global level, host level,
     and queue level.
   - The values for an attribute can be fixed or changeable.
   - The values can be a load value, custom defined, or a resource limit.
   - Resource limits exist only on queue level.
   - Load values exist only on global or host level.
   - The type of an attribute can be: string, host, cstring, int, double,
     boolean, memory, and time.
   - A consumable can have a default value, thus the user can, but does not
     have to, specify the attribute.
   - An attribute can be requestable, has to be requested, or cannot be
     requested at all.
   - An attribute can be built-in or user-defined.
   - Most load values and all of the resource limits are built in.
   - A user can define consumables, load values, or fixed values.
   - An attribute has one of several relational operators: ==, !=, >=, <=, >, <

   - An attribute can be a per job attribute or a per slot attribute:
      - All load values and consumables are per job attributes, except string,
        cstring, and host values, which are per slot attributes.
      - All resource limits and user-defined fixed values are per slot
        attributes.
      - An attribute can only be a job attribute or a slot attribute, but not
        both.

   The user can change every aspect of an attribute at any time. The current
   definition of an attribute looks like:
   - Name
   - Shortcut
   - Type
   - Consumable
   - Relational operator
   - Requestable
   - Default value

   Using per slot attributes:
   --------------------------

      All fixed values in the system are per slot attributes. This means that
      a parallel job consisting of 4 parallel tasks (needing 4 slots to run)
      requires the attribute for each task.

      Example E1:
         One job j1 requests t01 = 10 with 4 slots.
         Queue q1 has 5 slots and t01 is set to <= 20.
         j1 fits on q1, since q1 has enough slots and t01 allows jobs with
         t01 requests between 0 and 20 to run.
         The configuration of q1 after j1 has started is slots = 1 and
         t01 <= 20.
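
      Example E1 can be sketched in Python as follows. The names Queue and
      start_per_slot are hypothetical and exist only for this illustration;
      they are not part of SGE. The sketch assumes the request is compared
      with <= against the fixed queue value.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    slots: int      # changeable value, debited when a job starts
    t01_limit: int  # fixed per slot value with relop <=


def start_per_slot(queue: Queue, slots_needed: int, t01_request: int) -> bool:
    """Start a job if every task satisfies the fixed per slot attribute."""
    if slots_needed > queue.slots:
        return False
    if not t01_request <= queue.t01_limit:
        return False
    queue.slots -= slots_needed  # only the slot count changes
    return True


q1 = Queue(slots=5, t01_limit=20)
started = start_per_slot(q1, slots_needed=4, t01_request=10)
print(started, q1)
```

      The fixed value t01 is checked for each task but is never reduced; only
      the slot count changes.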

   Using per job attributes:
   -------------------------

      All changeable values in the system are per job attributes, like the
      "slots" attribute in E1. This means that a parallel job requests a
      resource n times (n is the number of slots it will use).

      Example E2:
         One job j2 requests t02 = 20 with 4 slots.
         Queue q2 offers 5 slots with t02 <= 40.
         Queue q3 offers 1 slot with t02 <= 40.
         Queue q4 offers 3 slots with t02 <= 40.
      The system will start two instances of the job on q2, one on q3, and
      one on q4.
      The configuration of the queues afterwards looks like:
         q2 offers 3 slots with t02 <= 0  (q2.t02 - 2*j2.t02)
         q3 offers 0 slots with t02 <= 20 (q3.t02 - 1*j2.t02)
         q4 offers 2 slots with t02 <= 20 (q4.t02 - 1*j2.t02)

      This example shows how consumables are handled. With load values it
      will take some time before they are updated, depending on the
      load_value_report interval.
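
      The consumable bookkeeping of Example E2 can be sketched as follows.
      This is a minimal illustration, not the scheduler code: Queue and
      dispatch are hypothetical names, the greedy queue order is an
      assumption, and the request is assumed to be greater than zero.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Queue:
    name: str
    slots: int
    t02: int  # remaining capacity of the consumable


def dispatch(queues: List[Queue], slots_needed: int,
             request: int) -> Tuple[Dict[str, int], int]:
    """Greedily place job instances; return placements and unplaced slots."""
    placed: Dict[str, int] = {}
    remaining = slots_needed
    for q in queues:
        if remaining == 0:
            break
        # Limited by free slots, by consumable capacity, and by what is left.
        n = min(q.slots, q.t02 // request, remaining)
        if n > 0:
            q.slots -= n
            q.t02 -= n * request  # the consumable is debited once per slot
            placed[q.name] = n
            remaining -= n
    return placed, remaining


queues = [Queue("q2", 5, 40), Queue("q3", 1, 40), Queue("q4", 3, 40)]
placed, unplaced = dispatch(queues, slots_needed=4, request=20)
print(placed, unplaced)
for q in queues:
    print(q)
```

      q2 can host only two instances although it has five slots, because its
      t02 capacity of 40 covers two requests of 20.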

      It is possible to override attributes on a lower level or on the same
      level. More about this later.

      The fact that all consumables are per job attributes poses a problem
      when one wants to do license management with parallel jobs. Therefore it
      would be nice to have it configurable whether a consumable is a per job
      attribute or a per slot attribute.

      Work package:
         => Decide what the right way of doing it is.
         => Add a flag, settable by the user, that determines whether an
            attribute is a per slot or per job attribute.



2. Restrictions:
----------------

   Not all of the combinations described above make sense. Until now, there are
   no restrictions on how an attribute is defined, but the code working on them
   already has the restrictions built in. Therefore it will aid the user in
   configuring SGE when the system allows only valid specifications:

   Name       : has to be unique
   Shortcut   : has to be unique
   Type       : every type from the list (string, host, cstring, int, double,
                boolean, memory, time, restring)
   Consumable : can only be defined for: int, double, memory, time
                If a consumable is not requestable, it has to have a default
                value.
                If a consumable is forced, it must not have a default value.

   Relational operator:
      - for consumables:              only <=
      - for non consumables:
         - string, host, cstring:     only ==, !=
         - boolean:                   only ==
         - int, double, memory, time: ==, !=, <=, <, >=, >

   Requestable   : for all attributes
   Default value : only for consumables (a default value is a default request)

   The qmon interface should only provide valid options. The choice of the
   type limits the choice of operators and whether it can be a consumable or
   not. Having a consumable also limits the relational operators to one. This
   makes it easier and more convenient for the user to add new attributes.
   Default values can only be added when it makes sense (for consumables).
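
   The restrictions above can be expressed as a small validation routine. The
   following is only a sketch: Attribute, allowed_relops, and validate are
   hypothetical names, the requestable values "yes", "no", and "forced" are
   an assumed encoding, and restring is treated like the other string types
   for the operator check, which the text above does not state. Uniqueness of
   name and shortcut is not checked here.

```python
from dataclasses import dataclass
from typing import List, Optional

NUMERIC = {"int", "double", "memory", "time"}
STRING_LIKE = {"string", "host", "cstring", "restring"}  # restring: assumption
ALL_RELOPS = {"==", "!=", "<=", "<", ">=", ">"}


@dataclass
class Attribute:
    name: str
    shortcut: str
    type: str
    relop: str
    requestable: str               # "yes", "no", or "forced" (assumed encoding)
    consumable: bool = False
    default: Optional[str] = None


def allowed_relops(attr: Attribute) -> set:
    """Return the operators permitted for this attribute."""
    if attr.consumable:
        return {"<="}
    if attr.type in STRING_LIKE:
        return {"==", "!="}
    if attr.type == "boolean":
        return {"=="}
    return ALL_RELOPS


def validate(attr: Attribute) -> List[str]:
    """Return a list of rule violations; an empty list means valid."""
    errors: List[str] = []
    if attr.type not in NUMERIC | STRING_LIKE | {"boolean"}:
        return ["unknown type"]
    if attr.consumable and attr.type not in NUMERIC:
        errors.append("consumable requires int, double, memory, or time")
    if attr.relop not in allowed_relops(attr):
        errors.append("relational operator not allowed")
    if attr.consumable and attr.requestable == "no" and attr.default is None:
        errors.append("non-requestable consumable needs a default value")
    if (attr.consumable and attr.requestable == "forced"
            and attr.default is not None):
        errors.append("forced consumable must not have a default value")
    if not attr.consumable and attr.default is not None:
        errors.append("default value is only valid for consumables")
    return errors


ok = Attribute("licenses", "lic", "int", "<=", "yes", consumable=True,
               default="0")
print(validate(ok))
```

   A front end such as qmon could use the same rules to offer only the
   operators returned by allowed_relops for the chosen type.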

Built-in values:
----------------

   Besides the overall attribute restrictions, there is one additional one for
   the system built-in attributes: one cannot change the type of a built-in
   attribute. The system relies on the type and will no longer function if
   the type is changed. A built-in value also cannot be deleted.

   The only exception is strings. A string can be changed into a cstring
   or restring and back.

3. Overriding attributes:
-------------------------

   In general an attribute can be overridden on a lower level
   - global by hosts and queues
   - hosts by queues
   and load values or resource limits on the same level. Overriding a per slot
   attribute with a per slot attribute, or a per job attribute with a per job
   attribute, is no problem. By specification, a per slot attribute is never
   overridden with a per job attribute. But a per job attribute can be
   overridden with a per slot attribute. In this case the per slot attribute
   changes into a per job attribute. This happens when a load value is
   overridden with a fixed value.

   There is one limitation for overriding attributes, based on the relational
   operator:
   - Attributes with !=, == operators can only be overridden on the same level,
     but not on a lower level.
   - Attributes with >=, >, <=, < operators can only be overridden when the new
     value is more restrictive than the old one.

   Examples:
   1. We have a load value arch on host level. One can override it in the host
      definition with another value, but not in a queue.
   2. We have a load value mem_free with a relop <= on host level. One can
      override it on host or queue level with a value that is smaller than
      the reported one.
      mem_free: custom / load report / result:
               1 GB    /    4 GB     /    1 GB
               1 GB    /    0.9 GB   /    0.9 GB
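
   The mem_free table corresponds to taking the smaller of the two values.
   A minimal sketch, with effective_mem_free as a hypothetical helper name:

```python
def effective_mem_free(custom_gb: float, load_report_gb: float) -> float:
    """For a relop <= attribute, the more restrictive (smaller) value wins."""
    return min(custom_gb, load_report_gb)


print(effective_mem_free(1.0, 4.0))
print(effective_mem_free(1.0, 0.9))
```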

   The reason for the override limitation lies in the algorithm that matches
   the job requests with available resources. The algorithm is strictly
   hierarchical, which means that if it finds an attribute on one level that
   does not match, the other levels are not evaluated any further. It starts
   with the global host and ends with a queue. When an attribute is missing on
   one level, it will go on with the next levels. But an existing attribute
   that does not match results in an abort.
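
   The hierarchical evaluation can be sketched as follows. This is an
   illustration under assumptions, not the SGE implementation: the function
   name request_matches is hypothetical, each level is modeled as a plain
   dictionary, and a request for an attribute that is defined on no level
   is assumed to fail.

```python
import operator
from typing import Any, Dict, List

RELOPS = {"==": operator.eq, "!=": operator.ne, "<=": operator.le,
          "<": operator.lt, ">=": operator.ge, ">": operator.gt}


def request_matches(name: str, relop: str, requested: Any,
                    levels: List[Dict[str, Any]]) -> bool:
    """Check a request against the levels in order: global, host, queue."""
    compare = RELOPS[relop]
    found = False
    for level in levels:
        if name not in level:
            continue          # missing on this level: go on with the next one
        found = True
        if not compare(requested, level[name]):
            return False      # existing attribute does not match: abort
    return found              # assumption: undefined everywhere means no match


levels = [{}, {"mem_free": 4.0}, {"mem_free": 1.0}]  # global, host, queue
print(request_matches("mem_free", "<=", 0.5, levels))
print(request_matches("mem_free", "<=", 2.0, levels))
```

   Because the first mismatch aborts the evaluation, a less restrictive
   value on a lower level could never rescue a request that already failed
   on a higher level, which is why such overrides are not allowed.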


5. Scheduler attribute matching:
--------------------------------

   As written before, the attribute requests of a job are matched in
   a strict hierarchy. When a match fails, the underlying levels are not
   evaluated any further. Right now, this is done for every job, even though
   the jobs might be in the same job category, which means that they have the
   same requests.
   To speed up this process, one can store the information about which queue
   cannot run which job category. When this is known, the jobs are only tested
   against the queues that were capable of running the jobs from the previous
   dispatch cycle. The list of queues to test will get shorter and shorter
   while jobs are dispatched.

   The same is true for soft requests. Once all queues are validated and the
   number of mismatches is computed, they are the same for all other jobs in
   the same job category. This saves a lot of matching time with the soft
   requests.
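
   The shrinking candidate list can be sketched like this. CategoryCache is
   a hypothetical class for illustration, and the fits callback stands in
   for the real matching code; when the cache would have to be reset is
   not covered by this sketch.

```python
from typing import Callable, Dict, Hashable, List, Set


class CategoryCache:
    """Remember, per job category, which queues already failed to match."""

    def __init__(self) -> None:
        self._rejected: Dict[Hashable, Set[str]] = {}

    def candidates(self, category: Hashable, queues: List[str],
                   fits: Callable[[str], bool]) -> List[str]:
        rejected = self._rejected.setdefault(category, set())
        result = []
        for queue in queues:
            if queue in rejected:
                continue              # known mismatch: skip the expensive test
            if fits(queue):
                result.append(queue)
            else:
                rejected.add(queue)   # remember for later jobs of the category
        return result


calls: List[str] = []


def fits(queue: str) -> bool:
    calls.append(queue)               # record how often matching really runs
    return queue != "q2"


cache = CategoryCache()
first = cache.candidates("cat-a", ["q1", "q2", "q3"], fits)
second = cache.candidates("cat-a", ["q1", "q2", "q3"], fits)
print(first, second, len(calls))
```

   The second job of the same category is no longer tested against q2.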


   String matching:
   ----------------

      The string matching has some specialties. A string can have one of three
      different types:
         - plain string
         - caseless string
         - regular expression string

      1. Plain strings (STRING):
         Matches only when the requested and the provided string are exactly
         the same.

      2. Caseless strings (CSTRING):
         The upper- and lowercase of the characters in a string is ignored.

      3. Regular expression string (RESTRING):
         The user can use a regular expression to ask for a resource. The
         syntax follows these rules:
         - "*"   : matches any sequence of characters (zero or more
                   characters).
         - "?"   : matches exactly one arbitrary character. It cannot match
                   no character.
         - "."   : is the character ".". It has no other meaning.
         - "\"   : escape character. "\\" = "\", "\*" = "*", "\?" = "?"
         - "[xx]": specifies a set or a range of allowed characters
         - "|"   : logical "or". Can only be used on the highest level and
                   cannot be escaped.

         Not supported:
         - "x+"      : to specify that the character "x" has to appear at least
                       once
         - "[xx|yy]" : to specify xx or yy

         Examples:
            -l arch="linux|solaris"    : results in "arch=linux" OR
                                         "arch=solaris"
            -l arch="[linux|solaris]"  : results in "arch=[linux" OR
                                         "arch=solaris]"

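      The RESTRING rules above can be approximated by translating a pattern
      into a Python regular expression. This is a sketch, not how SGE
      implements the matching: restring_matches and _translate are
      hypothetical names, and treating an unclosed "[" as a literal
      character is an assumption derived from the second example.

```python
import re


def _translate(alternative: str) -> str:
    """Translate one alternative (no top-level "|") into regex syntax."""
    out = []
    i = 0
    while i < len(alternative):
        c = alternative[i]
        if c == "\\" and i + 1 < len(alternative):
            out.append(re.escape(alternative[i + 1]))  # escaped literal
            i += 2
            continue
        if c == "*":
            out.append(".*")       # zero or more arbitrary characters
        elif c == "?":
            out.append(".")        # exactly one arbitrary character
        elif c == "[":
            end = alternative.find("]", i + 1)
            if end in (-1, i + 1):
                out.append(re.escape(c))   # assumption: unclosed "[" is literal
            else:
                body = alternative[i + 1:end]
                # Keep "-" for ranges, escape everything else in the set.
                out.append("[" + "".join(
                    ch if ch == "-" else re.escape(ch) for ch in body) + "]")
                i = end + 1
                continue
        else:
            out.append(re.escape(c))       # "." and others are plain characters
        i += 1
    return "".join(out)


def restring_matches(pattern: str, value: str) -> bool:
    """Match value against a RESTRING pattern; "|" splits at the top level."""
    alternatives = pattern.split("|")      # "|" cannot be escaped
    regex = "|".join("(?:%s)" % _translate(a) for a in alternatives)
    return re.fullmatch(regex, value) is not None


print(restring_matches("linux|solaris", "solaris"))
print(restring_matches("[linux|solaris]", "[linux"))
```

      Splitting on "|" before anything else reproduces the behavior of the
      second example, where the brackets end up in different alternatives.
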
   Result caching:
   ---------------

   Whenever resource matching is done with jobs that have pre-calculated
   job categories, the matching results will be stored in the job categories.
   This can be done because all jobs in the same category have the same
   requests, the same user, the same department, ...

   What is cached depends on the kind of job (whether it is a job with only
   hard requests, or one with hard, soft, pe, and other requests).

   Jobs with only hard requests:
      All queues and hosts on which the job cannot run are stored in the job
      category. This information is used to limit the possible target queues.

   Other jobs:
      All unfitting queues and the soft violation results are stored in the job
      category. This means that the soft violations are only computed once and
      reused for all other jobs in the same category. The queue information
      limits the possible target queues.
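
   Caching the soft violation results per category can be sketched as
   follows. The function soft_violations and the dictionary layout are
   hypothetical, and plain equality stands in for the real attribute
   matching.

```python
from typing import Dict, Hashable

QueueAttrs = Dict[str, Dict[str, str]]  # queue name -> attribute values


def soft_violations(category: Hashable, soft_requests: Dict[str, str],
                    queues: QueueAttrs,
                    cache: Dict[Hashable, Dict[str, int]]) -> Dict[str, int]:
    """Count unmet soft requests per queue, computed once per job category."""
    if category not in cache:
        cache[category] = {
            name: sum(1 for attr, wanted in soft_requests.items()
                      if attrs.get(attr) != wanted)
            for name, attrs in queues.items()
        }
    return cache[category]


cache: Dict[Hashable, Dict[str, int]] = {}
queues = {"q1": {"arch": "linux"}, "q2": {"arch": "solaris"}}
first = soft_violations("cat-a", {"arch": "linux"}, queues, cache)
print(first)
```

   A second job of category "cat-a" receives the stored result without any
   new comparison.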

   ------- Additional comments from andreas Thu Oct 21 07:44:22 -0700 2004 -------
Assign Admin/User/Install guide related issues to Mark O'Brien.

   ------- Additional comments from surajp Mon Mar 23 01:47:11 -0700 2009 -------
Reassigning issue to Sandra.

Change History (1)

comment:1 Changed 9 years ago by dlove

  • Resolution set to invalid
  • Severity set to minor
  • Status changed from new to closed

Can't fix manual bugs.
