[GE users] How is it possible for a consumable resource to be over-consumed.

Andy Schwierskott andy.schwierskott at sun.com
Tue Feb 1 08:37:23 GMT 2005


There are a couple of possibilities:

    - a "simple" bug - I think there was a bug in 5.3 where this could
      happen, but I don't recall exactly
    - the value of "mti_lic" was changed to a higher value and then set back
      to 8
    - a user found a hole that lets him cheat the system with the qalter
      command.
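
To illustrate the second possibility, here is a minimal sketch of consumable bookkeeping. It is a toy model written for this explanation, not Grid Engine code: the `Consumable` class and its `try_dispatch` method are hypothetical names, and the only assumption is that the remaining amount is computed as the configured capacity minus what running jobs currently hold. With the numbers from the question (limit 8, 14 jobs running), it happens to yield -6, though that alone does not prove this is what occurred.

```python
from dataclasses import dataclass, field


@dataclass
class Consumable:
    """Toy model of a global consumable such as mti_lic (hypothetical, not SGE code)."""
    capacity: int
    running: list = field(default_factory=list)  # amounts held by running jobs

    @property
    def available(self) -> int:
        # Remaining amount = configured capacity minus what running jobs hold.
        return self.capacity - sum(self.running)

    def try_dispatch(self, request: int = 1) -> bool:
        # A job is started only if the remaining amount covers its request.
        if self.available >= request:
            self.running.append(request)
            return True
        return False


lic = Consumable(capacity=8)

# With the limit at 8, only 8 of 14 submitted jobs are started.
started_at_8 = sum(lic.try_dispatch() for _ in range(14))

# The limit is raised to 14: the 6 jobs that were waiting can now start.
lic.capacity = 14
started_after_raise = sum(lic.try_dispatch() for _ in range(6))

# The limit is set back to 8 while all 14 jobs are still running.
lic.capacity = 8
print(lic.available)  # -6
```

In this model nothing is over-scheduled at any single moment; the negative value appears only because the capacity was lowered underneath jobs that were already running, and no new job can start until enough of them finish.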

You are using SGE 5.3 - which version is it exactly?

If the loglevel is set to "log_info", you should check the qmaster and
schedd "messages" files for messages which indicate a problem.


> We are doing a simple license management scheme by defining a consumable resource
> that any job that runs that software, requests.  The resource is defined like this:
> #name            shortcut   type   value           relop requestable consumable default
> #--------------------------------------------------------------------------------------
> mti_lic          mtil       INT    0               <=    YES         YES        0
> We have one queue per host that is configured like this
> complex_values       mti_lic=1,cynth_lic=4,gridtest=1,coware_lic=0
> And we have set a global limit using qconf -me global, that is configured like this.
> complex_values             coware_lic=0,bg_lic=1,syn_lic=1,ncvlog_lic=1,mti_lic=8,cynth_lic=100,gridtest=100
> We have a total of 30 nodes, and when we launch one of our regression runs, we sometimes end up in a situation
> where we have 14 jobs that request mti_lic running at the same time.  If I do qstat -f <jobnum> on one of the jobs still
> waiting in the queue, it reports that gc:mti_lice=-6.  How is it possible for a job to get scheduled when the resource has
> already reached its limit?
> How can this happen?
> Thanks,
> GT
