[GE users] Consumable complex not being consumed?

Daire Byrne Daire.Byrne at framestore-cfc.com
Mon Jun 11 17:58:07 BST 2007



Andy/Reuti,

Okay, I may have been smoking crack and/or am a complete idiot. I expected that the complex count shown by "qconf -se global" would give me the total number of licenses available AND also take into account the licenses currently consumed. It does not: it only shows the configured total.

I think I got myself into this confused state by using the "Olsen" FlexLM license daemon and having it periodically update the license counts. For some reason I then thought that submitting with "-hard -l license_mayaT=1" would also update the count in a similar way. So I've basically been looking at the output of "qconf -se global" each time I submit a job, expecting the value to decrement.

Either way, I've just tested what I was trying to do again and it is correctly blocking jobs if I oversubscribe the licenses. Then, if some licenses free up outside of SGE, the FlexLM script updates the total count and a few more jobs launch. That is just what I wanted all along. I was simply looking at the wrong metric to verify it was working.
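
For anyone else who trips over the same thing, here is a toy model in Python of the distinction I was missing. It is only a sketch and not SGE code: the class and method names are made up for illustration, and it assumes the configured total and the amount held by running jobs are tracked separately, which is how the behaviour looks from the outside.

```python
class ConsumableModel:
    """Toy model of one consumable complex (hypothetical, not SGE code)."""

    def __init__(self, configured_total):
        # The figure an admin (or an external script) sets; this is the
        # number I kept staring at in "qconf -se global".
        self.configured_total = configured_total
        # Amount held by each running job, keyed by a made-up job id.
        self.in_use = {}

    def available(self):
        # What is left to hand out: configured total minus what jobs hold.
        return self.configured_total - sum(self.in_use.values())

    def try_start(self, job_id, request):
        # A job starts only if its request fits into what is left.
        if request > self.available():
            return False
        self.in_use[job_id] = request
        return True

    def finish(self, job_id):
        # A finished job gives back whatever it was holding.
        self.in_use.pop(job_id, None)

    def set_total(self, new_total):
        # Models an external license script rewriting the configured total.
        self.configured_total = new_total


if __name__ == "__main__":
    lic = ConsumableModel(configured_total=5)
    print(lic.try_start("job1", 5))  # True: 5 of 5 requested
    print(lic.configured_total)      # 5: the configured total never moves
    print(lic.available())           # 0: but nothing is left to hand out
    print(lic.try_start("job2", 1))  # False: blocked, as in my tests
    lic.set_total(6)                 # a license frees up outside the model
    print(lic.try_start("job2", 1))  # True: one more job launches
```

The point of the sketch is that the configured total stays at 5 while job1 runs, so watching that number tells you nothing about whether the consumable is being consumed. Only the derived "available" figure changes.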

I apologise for the email spam to the list and shall endeavour to stay away from the crack pipe from now on... Thanks for the help. The more everyone said it worked fine for them, the more I knew I was probably being an idiot!

Regards,

Daire

----- "Andy Schwierskott" <andy.schwierskott at sun.com> wrote:
> Daire,
> 
> > Andy,
> >
> > I am using SGE 6.1 on a mix of x86 and x86_64 servers/clients.
> > Downloaded and installed last week.
> 
> I can confirm this is working on my 6.1 cluster as well (even with your
> original more complicated command line where the job requests 5
> licenses).
> Are there any errors in your qmaster/scheduler messages files?
> 
> I think I don't quite understand what you were saying here:
> 
> --------------------------
> Typical output is:
> 
> HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
> -------------------------------------------------------------------------------
> global                  -               -     -       -       -       -       -
>      Host Resource(s):      gc:license_mayaT=5.000000
> lust1b.prod.local       lx24-amd64      2  0.24    5.8G    2.2G    2.0G    8.0K
>      Host Resource(s):      gc:license_mayaT=5.000000
> 
> etc. for all hosts. It is definitely there, as asking for 6 licenses
> blocks the job from running. But once it does run the value remains
> the same. Maybe I should just try recreating the complex?
> --------------------------
> 
> What job is asking for 6 licenses? Can you check with
> 
>     % qstat -r
> 
> what your running and pending jobs do request?
> 
> Andy
> 
> > Daire
> >
> > ----- "Andy Schwierskott" <andy.schwierskott at sun.com> wrote:
> >> Daire,
> >>
> >> I think you have not told us which version you are using?
> >>
> >> Andy
> >>
> >>> Daniel,
> >>>
> >>> ----- "Daniel Templeton" <Dan.Templeton at Sun.COM> wrote:
> >>>> You're using the BOOL syntax when requesting an INT consumable. You
> >>>> have to give it the number of licenses that you want, like Andy and
> >>>> Reuti said.  I'm actually surprised that you're not seeing more of an
> >>>
> >>> Sorry for the confusion, I think you are referring to the incomplete
> >>> command I used in the last email. I am in fact using "-hard -l
> >>> license_mayaU=1". It's really bugging me now. I have tried restarting
> >>> all daemons etc. but no change. I think I may have to resort to a
> >>> complete reinstall from scratch. The license complex works fine in
> >>> that it correctly blocks jobs when there are not enough licenses
> >>> available, but the automatic decrement/increment (i.e. consumable)
> >>> attribute is being ignored.
> >>>
> >>> The only slightly non-standard thing about my setup is that
> >>> /opt/sge/default/common is a symlink via NFS to the Qmaster server on
> >>> all exec hosts. And then /opt/sge/default/spool dirs are local on all
> >>> machines. I see no reason why this should be related to my problem.
> >>>
> >>> Regards,
> >>>
> >>> Daire
> >>>
> >>>> error.  The code in the qmaster looks like it's supposed to treat a
> >>>> valueless INT consumable as an error.
> >>>>
> >>>> Daniel
> >>>>
> >>>> Andy Schwierskott wrote:
> >>>>> Daire,
> >>>>>
> >>>>> I can confirm what Reuti was saying. After submitting the job
> >>>>>
> >>>>>   qsub -hard -l license_mayaU=1 ....
> >>>>>
> >>>>> everything works as expected. I had the counters set to 0 first,
> >>>>> submitted the jobs and then increased the counter of one attribute
> >>>>> to 10 in the global host config.
> >>>>>
> >>>>> I tried it with 6.0u10.
> >>>>>
> >>>>> Andy
> >>>>>
> >>>>>> Reuti,
> >>>>>>
> >>>>>> Well I've tried deleting the complexes and recreating them but still
> >>>>>> no joy. Here are the commands used to create the complexes:
> >>>>>>
> >>>>>> qconf -mc (then add the following)
> >>>>>> license_mayaC       l_mC       INT         <=    YES         YES        0        0
> >>>>>> license_mayaT       l_mT       INT         <=    YES         YES        0        0
> >>>>>> license_mayaU       l_mU       INT         <=    YES         YES        0        0
> >>>>>>
> >>>>>> qconf -mattr exechost complex_values license_mayaC=0,license_mayaT=0,license_mayaU=0 global
> >>>>>>
> >>>>>> Then I give one of the licenses some value and submit a job with
> >>>>>> "-hard -l license_mayaU" (say) and I see that the total count is
> >>>>>> unchanged while the job runs or quits. Perhaps I can turn up the
> >>>>>> debugging to see where things are failing? There is nothing of
> >>>>>> interest in any of the "messages" files as far as I can make out.
> >>>>>>
> >>>>>> I haven't found anything else wrong with the installation, just
> >>>>>> consumables for now. I've not had this issue before so I'm not ruling
> >>>>>> out a bad install on my part. Any more info I can give?
> >>>>>>
> >>>>>> Regards,
> >>>>>>
> >>>>>> Daire
> >>>>>>
> >>>>>> ----- "Reuti" <reuti at staff.uni-marburg.de> wrote:
> >>>>>>> Am 08.06.2007 um 17:09 schrieb Daire Byrne:
> >>>>>>>
> >>>>>>>> Reuti,
> >>>>>>>>
> >>>>>>>>> what is:
> >>>>>>>>>
> >>>>>>>>> qhost -F license_mayaT
> >>>>>>>>>
> >>>>>>>>> showing?
> >>>>>>>>
> >>>>>>>> Typical output is:
> >>>>>>>>
> >>>>>>>> HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
> >>>>>>>> -------------------------------------------------------------------------------
> >>>>>>>> global                  -               -     -       -       -       -       -
> >>>>>>>>     Host Resource(s):      gc:license_mayaT=5.000000
> >>>>>>>> lust1b.prod.local       lx24-amd64      2  0.24    5.8G    2.2G    2.0G    8.0K
> >>>>>>>>     Host Resource(s):      gc:license_mayaT=5.000000
> >>>>>>>>
> >>>>>>>> etc. for all hosts. It is definitely there, as asking for 6 licenses
> >>>>>>>> blocks the job from running. But once it does run the value remains
> >>>>>>>> the same. Maybe I should just try recreating the complex?
> >>>>>>>
> >>>>>>> I don't see this, for me it's working perfectly, even if I create a
> >>>>>>> complex with name license_mayaT (also SGE 6.1). - Reuti
> >>>>>>>
> >>>>>>>
> >>>>>>>> Regards,
> >>>>>>>>
> >>>>>>>> Daire
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> ----- "Reuti" <reuti at staff.uni-marburg.de> wrote:
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> Am 08.06.2007 um 16:25 schrieb Daire Byrne:
> >>>>>>>>>
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> I set up some consumable complexes for licenses but they aren't
> >>>>>>>>>> getting decremented when a job requests a "license".
> >>>>>>>>>>
> >>>>>>>>>> # qconf -sc | grep license
> >>>>>>>>>> #name               shortcut   type        relop requestable consumable default  urgency
> >>>>>>>>>> #----------------------------------------------------------------------------------------
> >>>>>>>>>> license_mayaC       l_mC       INT         <=    YES         YES        0        0
> >>>>>>>>>> license_mayaT       l_mT       INT         <=    YES         YES        0        0
> >>>>>>>>>> license_mayaU       l_mU       INT         <=    YES         YES        0        0
> >>>>>>>>>>
> >>>>>>>>>> # qconf -se global | grep license
> >>>>>>>>>> license_mayaC=0,license_mayaT=5,license_mayaU=5
> >>>>>>>>>>
> >>>>>>>>>> If I submit a job using:
> >>>>>>>>>>   qrsh -V -now n -q qapp.q -l hostname=$HOSTNAME,license_mayaT=5 -b y glxgears
> >>>>>>>>>>
> >>>>>>>>>> Checking with "qconf -se" I still have license_mayaT=5 afterwards.
> >>>>>>>>>> It's not being "consumed". Have I missed a config option somewhere?
> >>>>>>>>>> If I request license_mayaT=6 the job doesn't run because I haven't
> >>>>>>>>>> got enough licenses, which suggests SGE is recognising the complex,
> >>>>>>>>>> just not consuming it while the job is running. I am using SGE 6.1
> >>>>>>>>>> on a mix of x86 and x86_64 servers/clients. This used to work for
> >>>>>>>>>> me last time I installed and configured SGE. Have I found a bug?
> >>>>>>>>>
> >>>>>>>>> what is:
> >>>>>>>>>
> >>>>>>>>> qhost -F license_mayaT
> >>>>>>>>>
> >>>>>>>>> showing?
> >>>>>>>>>
> >>>>>>>>> -- Reuti
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
