#245 new defect

IZ1609: qsub/qrsh -w e validation should merely warn about load values being ignored

Reported by: agrajag
Owned by:
Priority: low
Milestone:
Component: sge
Version: 6.0u3
Severity:
Keywords: clients
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=1609]

    Issue #:           1609
    Platform:          All
    OS:                All
    Reporter:          agrajag (agrajag)
    Component:         gridengine
    Subcomponent:      clients
    Version:           6.0u3
    CC:                None defined
    Status:            REOPENED
    Priority:          P4
    Resolution:
    Issue type:        DEFECT
    Target milestone:  ---
    Assigned to:       andreas (andreas)
    QA Contact:        roland
    URL:
    Summary:           qsub/qrsh -w e validation should merely warn about load values being ignored
    Status whiteboard:
    Attachments:

    Issue 1609 blocks:
    Votes for issue 1609:

   Opened: Fri May 6 11:32:00 -0700 2005 
------------------------


If you have a qsub or qrsh command and request any resource that's reported as a
"load" value (except for 'arch'), validation on your job will fail.  When doing
the verbose validation, it claims that the resource is unknown, even though
'qhost -F' and 'qstat -F' show it.  This causes a major problem for sites that
want to put '-w e' in their system-wide sge_request file.
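
For illustration, such a site-wide default would typically be placed in the
cell-wide default request file; the path below assumes a standard installation:

   # $SGE_ROOT/$SGE_CELL/common/sge_request
   # make error-level job verification the default for every submission
   -w e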

[sean@ymir sean]$ qrsh -w v -l mf=50M
error: Job 434 (-l mem_free=50M) cannot run in queue instance "lowprio.q@node1"
because job requests unknown resource (mem_free)
error: Job 434 (-l mem_free=50M) cannot run in queue instance "lowprio.q@node4"
because job requests unknown resource (mem_free)
error: Job 434 (-l mem_free=50M) cannot run in queue instance "lowprio.q@node5"
because job requests unknown resource (mem_free)
error: Job 434 (-l mem_free=50M) cannot run in queue instance "lowprio.q@node7"
because job requests unknown resource (mem_free)
error: Job 434 (-l mem_free=50M) cannot run in queue instance "lowprio.q@node8"
because job requests unknown resource (mem_free)
error: verification: no suitable queues
[sean@ymir sean]$ qhost -F mf -h node1
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
node1                   lx24-x86        2  0.00 1002.1M  132.3M    4.0G     0.0
    Host Resource(s):      hl:mem_free=869.824M

   ------- Additional comments from andreas Mon May 9 01:48:16 -0700 2005 -------
That is an old problem, due to load values not being considered when this
checking is done. Even though this may sound strange to you, I still believe that
behaviour is correct: load values always change, so it isn't possible to use them
for *positively* validating that jobs will be able to run.

The workaround is to configure a maximum mem_free value as a complex_values
entry on your hosts. That value *is* considered by Grid Engine.
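
As a minimal illustration of that workaround, a host's configuration (edited
with "qconf -me <hostname>") might carry an entry roughly like the following,
with 1000M standing in for the host's actual physical memory:

   complex_values        mem_free=1000M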

Closing this issue.

   ------- Additional comments from agrajag Mon May 9 04:22:55 -0700 2005 -------
I can understand your argument that it's hard to check against load values.

As such, if a load value is requested, the validation code should ignore the
load value instead of failing the validation.  Causing a job to fail to be
submitted when it's a valid job that can run is a very serious bug!

One suggestion might be to print a warning about load values if one is
requested, but let the job be submitted anyway.

mem_free is not my primary concern.  Some load sensors that we've added are, and
this also affects them.  For various reasons (including ease of configuration),
it's much preferred to use load sensors rather than trying to set the value in
several hundred host configs.

   ------- Additional comments from andreas Mon May 9 05:24:31 -0700 2005 -------
You observe the behaviour because load values *are* ignored. Without these load
values, however, Grid Engine has no information at all about resource availability
and thus cannot positively validate that your jobs will become dispatchable :-(

Nevertheless, I understand your desire. I believe you wish, e.g., capacity-based
resources to be fully checked and used as a *hard* criterion, and *concurrently*
warnings to be emitted in cases where the load value information isn't sufficient
to positively validate that jobs will be dispatchable.

While this sounds like a reasonable request, I still do not know how Grid Engine
should treat such jobs if an execution host is down for a while and its load
values have timed out. The problem is that this case can't be distinguished from
the case of an execution host for which a certain load value has *never* been
reported. The point is: if non-existing load values always cause a waiver of the
hard resource-request checking with "-w e", what do you actually expect to be
validated?

Do you think a "-w e" mechanism would be sufficient if it merely used existing
complex_values information to rule out jobs being wrongly accepted when it is
known they cannot be dispatched?

Possibly it's an intermediate solution for you to run

 # qconf -mattr exechost complex_values mem_free=2180M <hostname>

once during load sensor start-up to ensure that maximum resource values are
known to Grid Engine. Also, maybe "-w w" helps somewhat.
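
A minimal sketch of that start-up step, assuming a Linux execution host, a
shell-script load sensor, and a host that is permitted to run qconf (the
follow-up comments discuss that caveat); reading /proc/meminfo is just one way
of obtaining the total memory:

   # run once when the load sensor starts, before its report loop
   MEM_TOTAL_KB=`awk '/^MemTotal:/ {print $2}' /proc/meminfo`
   qconf -mattr exechost complex_values mem_free=${MEM_TOTAL_KB}K `hostname`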

Note that this is related in content to issue #772.

Cheers, Andreas

   ------- Additional comments from agrajag Mon May 23 14:06:52 -0700 2005 -------
Another suggestion is to have an optional configuration parameter specifying how
verification works.  That way you can set it to be 'strict' or non-strict.  If
it's strict, it will only accept a job if it knows for certain that the job
will be able to run.  If it's non-strict, it will deny any job that it is certain
won't be able to run.  There is a subtle and important difference there.

You asked a good question about what I expect to be validated with the loose
or non-strict version.  You are right, it would waive or ignore any request
dealing with a load sensor value.  However, it would enforce requests dealing
with all other complex values.

My cluster has some complex values 'hard-coded' in the queue config, and others
are produced by load sensors.  We've had some users request one of those
hard-coded values that means their job can only run on a small number of nodes,
none of which they have permission to run jobs on.  If we were using loose
verification, it would reject their job.  Right now, with no verification, it
accepts their job, and it sits in the 'qw' state until the user notices that it's
not running for some reason.  Likewise, we've had cases where a user requested a
resource that limits them to a small number of queues, then requested a parallel
environment which only uses another small number of queues, with no queues in
common.  Such a job will also sit in the 'qw' state forever.  But with
loose verification, it wouldn't be accepted.

Unfortunately, '-w w' won't help, as it'll give too many false positives and
confuse my users.

I will, however, consider your load sensor script suggestion, although it means
making all the compute nodes admin hosts, which I was hoping to avoid.

   ------- Additional comments from andreas Thu Jun 16 09:37:08 -0700 2005 -------
Though I agree it would be a possibility to have the administrator decide how
Grid Engine should behave if those hosts are down, I believe it is a dead end.

It seems the only reasonable solution would be to allow consumable capacities to
be specified in a more flexible manner, e.g.

      :
   complex_values mem_free=$mem_total
      :

Would you agree to changing this into an RFE? ;-)

   ------- Additional comments from andreas Fri Jun 17 06:39:04 -0700 2005 -------
The RFE that would adress this already exists as issue 373.

Lowering priority.

Though the workaround requires execution hosts to become admin hosts
but this seems to be not unreasonable.

Also it is anyways possible to run the qconf command doing the

   :
complex_values mem_free=$mem_total
   :

for all execution nodes within the cluster from a single admin host. E.g. the
qhost output could be used to iterate over all execution hosts and do the change.

   ------- Additional comments from agrajag Fri Jun 17 06:42:50 -0700 2005 -------
I'm not sure I see how 373 is the same thing.

As far as making compute hosts admin hosts.. with the default install of no
security enabled, allowing a box normal users can login to to be an admin host
is opening up a root vulnerability on any compute node.  I would consider this
to be a major issue.

   ------- Additional comments from andreas Fri Jun 17 06:58:19 -0700 2005 -------
I agree making execution hosts admin hosts isn't a good solution. But you don't
need to do this. All you need is a script that uses qhost output (columns #1 and
#5) and runs for each line the corresponding

   qconf -mattr exechost complex_values mem_free=#5 #1

that script need not be run from an execution host. If it is run each time
execution hosts change you have a reasonable workaround without opening any
vulnerability issues at all.
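
A rough sketch of such a script, assuming the plain "qhost" output format shown
in the description above (hostname in column 1, MEMTOT in column 5, two header
lines plus a "global" pseudo-host row to skip), and using the total memory as
the mem_free capacity:

   #!/bin/sh
   # set complex_values mem_free=<MEMTOT> for every execution host;
   # run from any admin host, e.g. the qmaster host
   qhost | awk 'NR > 3 && $1 != "global" && $5 != "-" {print $1, $5}' |
   while read host memtot; do
       qconf -mattr exechost complex_values mem_free=$memtot "$host"
   done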

   ------- Additional comments from sgrell Tue Dec 6 03:42:49 -0700 2005 -------
Changed subcomponent.

Stephan
