[GE users] Runtime Design Automation?

Olesen, Mark Mark.Olesen at emcontechnologies.com
Fri Jul 20 20:23:58 BST 2007



Hi Andreas and Rod,

> > For translation purposes:
> >  rc_limit  => max
> 
> the local max.

This would seem to correspond to 'assigned' in your scheduler code.

> >  rc_intern => utilized
> 
> I'm not clear about meaning of extern/intern.
> 
> Utilized is simply the number of licenses in use according to
> "qstat -s rs -r -xml"

Exactly. In the qlicserver, there is a distinction between 'internal'
resources (i.e., resources being actively managed by the GridEngine) and
'external' resources (i.e., resources that we would like to manage with the
GridEngine but that are currently being used by someone else).
The external usage could come from a rogue user who has grabbed a license
resource, or from a legitimate user (e.g., shared gui/simulation tokens).

> License juggler "demand" is the amount of missing licenses
> according to "qstat -s p -r -xml" output.

Okay.

> Writing it into a file means it cannot be guaranteed that the
> adjustment really is in effect. Ideally the "assign" should be
> synchronous

My idea was that the juggler 'assigned' value would be used as a limiter for
the qlicserver. The qlicserver uses the min(total-extern, limit).
Using -mattr directly would bypass the checking mechanism.
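
A minimal sketch of that limiter rule, in Python for brevity. The function
and parameter names are made up for illustration, and clamping the result
at zero is an assumption, not something the qlicserver is known to do:

```python
def managed_limit(total, extern, limit):
    """Licenses the GridEngine side may use under the limiter rule.

    total  - licenses that exist in total (hypothetical parameter name)
    extern - licenses currently used outside the GridEngine
    limit  - the juggler's 'assigned' value for this cluster
    """
    # min(total - extern, limit), clamped at zero (assumption) in case
    # external usage temporarily exceeds the known total.
    return max(0, min(total - extern, limit))


print(managed_limit(total=10, extern=3, limit=8))  # 7: external usage is the bound
print(managed_limit(total=10, extern=1, limit=6))  # 6: the juggler limit is the bound
```

Writing the value with -mattr directly would skip the total-extern term
entirely, which is the check I would like to keep.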


Since the last email, I have rewritten the scheduler core in Perl and
started examining how it works, or should work, based on your test
suite.

I think, however, cross-site license sharing is too complex for the juggler
approach. Here are a few possible issues, in no particular order:

Using the demand alone is insufficient. Even after filtering out jobs
submitted with a hold or jobs slated for later execution, we cannot be
certain that the lack of license resources is what is holding up a
particular job. It could be that the job requires a particular bit of
hardware, or is waiting for a node with sufficient memory, etc., and simply
assigning more licenses to the cluster won't really help.
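
To make the point concrete, here is a rough sketch of such a filtered
demand count. The PendingJob record and its fields are hypothetical
stand-ins for whatever gets extracted from the qstat output, not an
actual GridEngine data structure:

```python
from dataclasses import dataclass


@dataclass
class PendingJob:
    job_id: int
    licenses: int             # number of licenses the job requests
    on_hold: bool = False     # submitted with a hold
    start_after: float = 0.0  # earliest start time (epoch seconds), 0 = now


def naive_demand(jobs, now):
    """Sum the license requests of pending jobs that could start now."""
    return sum(
        j.licenses
        for j in jobs
        if not j.on_hold and j.start_after <= now
    )


jobs = [
    PendingJob(1, licenses=4),
    PendingJob(2, licenses=2, on_hold=True),         # filtered: hold
    PendingJob(3, licenses=8, start_after=2_000.0),  # filtered: starts later
    PendingJob(4, licenses=1),  # counted, but may really be waiting for memory
]
print(naive_demand(jobs, now=1_000.0))  # 5
```

Job 4 still contributes to the demand even if its real blocker is
something other than a license, and nothing in these fields can tell us
that.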

License granularity is also a bit of a problem. A demand of 8 could mean one
job with 8 slots or 2 jobs with 4 slots each. Should the cluster with the
highest demand or the one with the longest wait get the licenses first?
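
A small illustration of the granularity issue, under the assumption
that a job can only start once its full license request is met (the
helper name and the smallest-first order are made up for the example):

```python
def startable(requests, free):
    """Count the jobs that could start if `free` licenses were granted.

    requests - license counts requested by the pending jobs
    free     - licenses that could be handed to this cluster
    Jobs are tried smallest first; each needs its full request.
    """
    started = 0
    for need in sorted(requests):
        if need <= free:
            free -= need
            started += 1
    return started


# Both clusters report a demand of 8, but 4 free licenses help only one.
print(startable([8], free=4))     # 0: the single 8-slot job cannot start
print(startable([4, 4], free=4))  # 1: one of the two 4-slot jobs can start
```

The aggregate demand figure is identical in both cases, so it cannot by
itself decide where the licenses would do any good.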

In summary, the scheduling issues are complex enough and require much, much
more information than conveyed by simple metrics such as usage/demand etc.
We need a scheduler akin to that used within the GridEngine itself.

From what I can see, project Hedeby might be exactly the solution for this
problem, and it looks like it might be getting closer to release: the specs
were just uploaded as a PDF today
  http://hedeby.sunsource.net/files/documents/73/148/HedebyBook.pdf

In this case, I guess the Hedeby "resource provider" would be in charge of
granting the license resources to the individual clusters. From page 75,
however,
    Type: Represents the type of a resource, such as "host" or "license"
    (Currently only the type "host" is supported).
it looks like there is still some work to be done there.


I was also considering whether there could be a way of using the GridEngine
to manage the problem indirectly, by introducing an extra GridEngine instance
'Provider' to coordinate the compute clusters 'Site1' and 'Site2'.

Here's the general idea: 

'Site1' and 'Site2' manage for the most part their own license pools.

'Provider' shadows the Site1 and Site2 license usage (similar to the
juggler), but would theoretically have access to both pools. 'Provider' has
an unlimited number of slots, but only allows jobs of type 'acquire'.

At regular intervals (as a daemon or as part of the load sensor) a
supervisor script scours the list of pending jobs and finds those that have
some form of a license request.
Since we can't tell if a missing license or some other restriction is
keeping the job pending, we attempt to find out by letting another job
test the waters.
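
A rough sketch of the first step only: picking out pending jobs with a
license request and pairing each with an 'acquire' probe for 'Provider'.
The job records, the license resource names and the probe layout are all
assumptions for illustration; how the probes would be submitted and
evaluated is not covered here:

```python
# Hypothetical names of the license resources being shared.
LICENSE_RESOURCES = {"lic_a", "lic_b"}


def license_requests(job):
    """Return only the license-type resource requests of a job."""
    return {
        name: count
        for name, count in job["requests"].items()
        if name in LICENSE_RESOURCES
    }


def plan_probes(pending_jobs):
    """Describe one 'acquire' probe per pending job that requests a license."""
    probes = []
    for job in pending_jobs:
        wanted = license_requests(job)
        if wanted:
            probes.append({
                "type": "acquire",
                "site": job["site"],
                "for_job": job["id"],
                "licenses": wanted,
            })
    return probes


pending = [
    {"id": 101, "site": "Site1", "requests": {"lic_a": 4, "mem_free": 8}},
    {"id": 102, "site": "Site2", "requests": {"mem_free": 16}},  # no license
]
for probe in plan_probes(pending):
    print(probe)
```

Only job 101 yields a probe, and the probe carries just the license part
of its request, so whether it can run on 'Provider' says something about
licenses and nothing about memory or hardware.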


