AW: [GE users] resource allocation and race condition

Rod.Rebello at Microchip.com
Mon Oct 18 16:17:40 BST 2004


Right.  We've trained our users to be sensitive to running outside the
queue :).  We've also set up scripts that put placeholders in SGE to
adjust the counters for those who need to run interactively instead of
in batch mode.  We don't put purely interactive applications in SGE - those
run on the users' workstations instead of our compute farm.

----------------------------------------
Rod Rebello
Microchip Technology Inc.





Melvin Koh <melvin at apstc.sun.com.sg>
10/16/2004 05:50 AM
Please respond to users

 
        To:     users at gridengine.sunsource.net
        cc: 
        Subject:        Re: AW: [GE users] resource allocation and race condition



On Fri, 15 Oct 2004 Rod.Rebello at Microchip.com wrote:

> Although I can think of one situation where using a load sensor may be
> useful for flexlm license tracking, can you use a hard counter for the
> number of flexlm licenses?
>
> For example, in our situation, we just set a global consumable counter to
> the number of flex licenses we have available and let SGE count up/down as
> resource requests are made.  Never had a race condition with this method.
> Only drawback I see is that you now have to remember to update the
> counter resource whenever new licenses are added instead of automatically
> getting the info from flexlm.

You also have to make sure that the application can only be executed
through SGE. If not, the real license count and the consumable counter get
out of sync.

This issue has been heavily discussed before, and as long as you're using
a load sensor, the race condition will always exist.
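
To make the two failure modes concrete, here is a minimal Python sketch of
a single scheduling pass. It is a toy model, not SGE's actual scheduler
code: the function name schedule_pass, the policy labels, and the
assumption that the load sensor value stays frozen for the whole pass are
all illustrative assumptions.

```python
def schedule_pass(pending_jobs, sensor_free, consumable_free, policy):
    """Return how many pending jobs get dispatched in one scheduling pass.

    Toy model (assumption, not SGE internals):
      - sensor_free is the last value the load sensor reported; it is NOT
        refreshed during the pass, so it can be stale.
      - consumable_free is the scheduler's own counter; it is decremented
        immediately at every dispatch.
    """
    started = 0
    for _ in range(pending_jobs):
        if policy == "sensor":
            free = sensor_free
        elif policy == "consumable":
            free = consumable_free
        elif policy == "both":
            # Use the lower of the two values, as the HOWTO describes.
            free = min(sensor_free, consumable_free)
        else:
            raise ValueError(f"unknown policy: {policy}")

        if free < 1:
            break
        started += 1
        consumable_free -= 1  # bookkeeping happens at dispatch time
        # sensor_free is left untouched: stale until the next load report
    return started


if __name__ == "__main__":
    # One license free, two jobs waiting.
    for policy in ("sensor", "consumable", "both"):
        print(policy, schedule_pass(2, 1, 1, policy))

    # One license taken outside SGE: sensor sees 0 free, counter says 1.
    for policy in ("consumable", "both"):
        print("outside-SGE use,", policy, schedule_pass(1, 0, 1, policy))
```

With one free license and two pending jobs, the sensor-only policy starts
both jobs, while the consumable and combined policies start one. When a
license has been taken outside SGE, the consumable-only policy still starts
a job that cannot get its license; the combined policy starts none.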

> The only problem we've had with this is when multiple grids all are 
> accessing the same set of flexlm licenses.  This is where a load sensor 
> may help to keep all the grids in sync.
> 
> Just curious, but is there any other reason why you need to use a load 
> sensor for license tracking?
> 
>         To:     users at gridengine.sunsource.net
>         cc: 
>         Subject:        Re: AW: [GE users] resource allocation and race condition
> 
> Hello,
> 
> Please see this HOWTO on tracking licenses with GE:
> http://bioteam.net/dag/sge-flexlm-integration/
> 
> The HOWTO has all the details, but basically, you track a license with 
> *both* a load sensor *and* a consumable resource simultaneously.  The 
> GE master will then use whichever is the lower of the two values in 
> order to avoid oversubscribing a license.  The HOWTO talks about how 
> there's still the possibility of a race condition, and ways to deal 
> with it.
> 
> Regards,
>                  Charu
> 
> On Oct 15, 2004, at 8:19 AM, Olesen, Mark wrote:
> 
> >>> Assuming that I only have a single float license 'foo', I can
> >>> 'qsub -l foo=1' a job.  After a while I submit two (2) new jobs with
> >>> the same resource requirement(s). Both these jobs wait politely in the
> >>> queue, since the resource 'foo' is unavailable.  After the first job
> >>> finishes, and the load reports get correctly updated, *both* of the
> >>> jobs in the queue try to grab the 'foo' resource (almost)
> >>> simultaneously.
> >>> How can I circumvent such a race condition?
> >>
> >> Could you use a SGE consumable in addition to your load sensor? - Reuti
> >
> >
> > Based on what I can read from host_conf(5) about 'complex_values', I'd
> > have to alter the load sensor so that it only tracks non-SGE license use
> > rather than reporting the number of licenses currently available for use.
> >
> > This means that the load sensor needs to distinguish between
> > applications that were started with/without SGE. If accomplished, this
> > would make the load sensor anything other than lightweight.
> >
> > Is there a direct way, or a backdoor, to determine how many resources
> > SGE believes are still free and/or have been allocated?  Perhaps this
> > could be a means of adjusting the load sensor values.
> >

-- 
-----------------------------------------
Melvin Koh Chee Kian
Grid Research Engineer
Asia Pacific Science & Technology Center
Sun Microsystems Inc.
Website: http://apstc.sun.com.sg/
-----------------------------------------

