AW: [GE users] resource allocation and race condition

Melvin Koh melvin at apstc.sun.com.sg
Sat Oct 16 13:50:06 BST 2004


On Fri, 15 Oct 2004 Rod.Rebello at Microchip.com wrote:

> Although I can think of one situation where using a load sensor may be
> useful for flexlm license tracking, can you use a hard counter for the
> number of flexlm licenses?
>
> For example, in our situation, we just set a global consumable counter to
> the number of flex licenses we have available and let SGE count up/down as
> resource requests are made.  Never had a race condition with this method.
> Only drawback I see is that you now have to remember to update the
> counter resource whenever new licenses are added instead of automatically
> getting the info from flexlm.

You also have to make sure that the application can only be executed
through SGE. If not, the real license count and the consumable counter get
out of sync.
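
To make the drift concrete, here is a toy model in Python. It does not
talk to SGE or FlexLM at all; the class, method and attribute names are
invented for this sketch, and it assumes a job started through SGE simply
fails when the license server has no seat left.

```python
class LicensePool:
    """Toy model: real license seats versus SGE's consumable counter."""

    def __init__(self, total):
        self.total = total        # seats the license server really owns
        self.real_in_use = 0      # seats actually checked out
        self.sge_free = total     # what the consumable counter believes is free

    def run_via_sge(self):
        if self.sge_free < 1:
            return "queued"
        self.sge_free -= 1        # SGE books the seat at dispatch time
        if self.real_in_use >= self.total:
            return "license denied"   # counter said yes, server says no
        self.real_in_use += 1
        return "running"

    def run_outside_sge(self):
        if self.real_in_use >= self.total:
            return "license denied"
        self.real_in_use += 1     # the consumable counter never sees this
        return "running"


pool = LicensePool(total=1)
outside = pool.run_outside_sge()   # someone starts the tool by hand
via_sge = pool.run_via_sge()       # SGE still believes one seat is free
print(outside, via_sge)
```

The second job is dispatched because the counter still shows a free seat,
and only then finds that the license is gone.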

This issue has been heavily discussed before, and as long as you're using
a load sensor, the race condition will always exist.
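
A simplified sketch of why, again in Python and again with made-up names:
it models a single scheduling run in which the load sensor value stays
fixed until the next load report, while a consumable (if one is
configured) is decremented at every dispatch. When both are present it
takes the lower of the two values, as the HOWTO quoted further down
describes. This is only an illustration of the idea, not how the
scheduler is actually implemented.

```python
def dispatch(pending, reported_free, consumable_free=None):
    """Return how many of `pending` one-seat jobs a scheduling run starts.

    reported_free   -- last load sensor value; stale until the next report
    consumable_free -- SGE's internal counter, or None if no consumable is set
    """
    started = 0
    for _ in range(pending):
        if consumable_free is None:
            available = reported_free   # never changes within this run
        else:
            available = min(reported_free, consumable_free)
        if available < 1:
            break
        started += 1
        if consumable_free is not None:
            consumable_free -= 1        # booked immediately at dispatch
    return started


load_only = dispatch(pending=2, reported_free=1)
with_consumable = dispatch(pending=2, reported_free=1, consumable_free=1)
print(load_only, with_consumable)
```

With the load sensor alone, both waiting jobs are started against a single
free license. Adding the consumable stops that particular case, but the
load value can still be stale with respect to license use outside SGE, so
the window gets smaller rather than disappearing.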

> The only problem we've had with this is when multiple grids all are 
> accessing the same set of flexlm licenses.  This is where a load sensor 
> may help to keep all the grids in sync.
> 
> Just curious, but is there any other reason why you need to use a load 
> sensor for license tracking?
> 
>         To:     users at gridengine.sunsource.net
>         cc: 
>         Subject:        Re: AW: [GE users] resource allocation and race condition
> 
> Hello,
> 
> Please see this HOWTO on tracking licenses with GE:
> http://bioteam.net/dag/sge-flexlm-integration/
> 
> The HOWTO has all the details, but basically, you track a license with 
> *both* a load sensor *and* a consumable resource simultaneously.  The 
> GE master will then use whichever is the lower of the two values in 
> order to avoid oversubscribing a license.  The HOWTO talks about how 
> there's still the possibility of a race condition, and ways to deal 
> with it.
> 
> Regards,
>                  Charu
> 
> On Oct 15, 2004, at 8:19 AM, Olesen, Mark wrote:
> 
> >>> Assuming that I only have a single float license 'foo', I can
> >>> 'qsub -l foo=1' a job.  After a while I submit two (2) new jobs with
> >>> the same resource requirement(s). Both these jobs wait politely in the
> >>> queue, since the resource 'foo' is unavailable.  After the first job
> >>> finishes, and the load reports get correctly updated, *both* of the
> >>> jobs in the queue try to grab the 'foo' resource (almost)
> >>> simultaneously.
> >>> How can I circumvent such a race condition?
> >>
> >> Could you use a SGE consumable in addition to your load sensor?
> >> - Reuti
> >
> >
> > Based on what I can read from host_conf(5) about 'complex_values', I'd
> > have to alter the load sensor so that it only tracks non-SGE license use
> > rather than reporting the number of licenses currently available for use.
> >
> > This means that the load sensor needs to distinguish between
> > applications that were started with/without SGE. If accomplished, this
> > would make the load sensor anything other than lightweight.
> >
> > Is there a direct way, or a backdoor, to determine how many resources
> > SGE believes are still free and/or have been allocated?  Perhaps this
> > could be a means of adjusting the load sensor values.
> >

-- 
-----------------------------------------
Melvin Koh Chee Kian
Grid Research Engineer
Asia Pacific Science & Technology Center
Sun Microsystems Inc.
Website: http://apstc.sun.com.sg/
-----------------------------------------

