[GE users] SGE6.0u3 global consumable resource - applies to all queues

Reuti reuti at staff.uni-marburg.de
Wed Feb 2 22:43:56 GMT 2005



Hello Walt,

with "transfer-queue" do you mean a setup according to the appropriate Howto? I 
think this way you don't need a global resource at all. Just set up one queue 
for the local execution with the value of the resource set to 1 and e.g. a 
sequence number of 10 (the other local queues don't need it set to 0, because 
it's not globally available). The transfer-queue will get a higher sequence 
number, a resource count of 5 (with enough slots in this queue), a 
load_thresholds entry for this resource which should be 5, and a modified load 
sensor, which will also track the remaining count of this license at the 
remote site (this remote queue will look at both resource restrictions: the 
one in the queue (which is set up with 5) and the one from the execution host 
(i.e. the remote cluster), which will be set by the load sensor).
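
The setup sketched above could be configured roughly like this (the queue 
names, the resource name "licA", and all counts are illustrative assumptions, 
not taken from the thread):

```shell
# Local queue: preferred (low sequence number), one license attached.
qconf -mattr queue seq_no 10 local.q
qconf -mattr queue complex_values licA=1 local.q

# Transfer queue: only considered after the local queue (higher
# sequence number), five licenses attached, and a load threshold so
# the queue goes into alarm state once the remote pool is exhausted.
qconf -mattr queue seq_no 20 transfer.q
qconf -mattr queue complex_values licA=5 transfer.q
qconf -mattr queue load_thresholds licA=5 transfer.q

# The scheduler must sort queues by sequence number for this to work:
qconf -msconf    # set: queue_sort_method seqno
```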

This way you will still need only one resource for the complete setup, and 
not two. Also, the users can just request one resource and they are done. To 
force remote execution for any reason, they can specify the queue name in 
addition to the resource request.

If the local queue is running out of licenses, the transfer-queue will be 
checked, and if a license is available there (hence the queue is not put into 
alarm state), the job will be transferred to the remote cluster.
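
A minimal sketch of such a modified load sensor (the resource name "licA" and 
the fixed free count are placeholders; a real sensor would query the remote 
site's license server):

```shell
#!/bin/sh
# Placeholder: a real sensor would ask the remote FlexLM server
# (e.g. via lmstat) how many licenses are still free.
free_remote_licenses() {
    echo 5
}

# One load report in the format SGE expects from a load sensor.
report() {
    echo "begin"
    echo "$(hostname):licA:$(free_remote_licenses)"
    echo "end"
}

# SGE drives the sensor over stdin: one report per input line,
# stop when it sends "quit".
sensor_loop() {
    while read -r line; do
        [ "$line" = "quit" ] && break
        report
    done
}
```

With this value reported per host, the queue's load_thresholds entry can be 
compared against it and put the transfer-queue into alarm state.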

Cheers - Reuti


Quoting Walt Minkel <wminkel at latticesemi.com>:

> Hi Reuti,
> 
> You did have it right.  The complex "run_licA" defines the machines, or in 
> my case a machine with several transfer queues, where the job can run.
> 
> The site and WAN licenses are on two different FlexLM servers.  Jobs are 
> submitted to SGE and are dispatched to a transfer queue when all conditions 
> are correct (primarily, when there is a license available).  The transfer 
> queue is selected by "run_licA" (a better name might be "run_at_siteX").  
> When a license is available, the job is transferred to siteX.  In my 
> situation, the license in question is in high demand from multiple sites.  
> The challenge is to somehow have SGE understand that a local site license 
> is available before consuming a WAN license.
> 
> One solution I think will work is:  If I have 1 site license and 5 WAN 
> licenses, make 6 licenses requestable.  If a license is available, the job 
> is put into a transfer queue, and a prolog (or the executable script) 
> determines whether the available license is a site license and sets the 
> appropriate environment variable.  I can also use the number of CPUs to 
> help control how many jobs are available to each transfer queue.
> 
>     -Walt
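
The prolog Walt describes could be sketched like this (the server addresses, 
the free-license check, and the variable name are all assumptions; a real 
check would have to parse lmstat output for the feature in question):

```shell
#!/bin/sh
# Hypothetical prolog: prefer the site license, fall back to the WAN
# license, and export the choice for the job script.
SITE_SERVER="1700@site-flexlm"   # assumed license server addresses
WAN_SERVER="1700@wan-flexlm"

site_free() {
    # Placeholder: a real check would parse something like
    #   lmstat -c "$SITE_SERVER" -f featureA
    # and print the number of unused site licenses.
    echo 1
}

if [ "$(site_free)" -gt 0 ]; then
    LICENSE_SERVER="$SITE_SERVER"
else
    LICENSE_SERVER="$WAN_SERVER"
fi
export LICENSE_SERVER
echo "using license server: $LICENSE_SERVER"
```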
> 
> Reuti wrote:
> 
> >Walt,
> >
> >so I got one point wrong: I thought "run_licA" are the machines on which 
> >it could run. So you are looking for some kind of resource staging. 
> >Although this is not directly implemented, in some way it could be 
> >simulated by setting up two cluster queues and giving an order by setting 
> >a sequence number. But first: how will you track the usage of the 
> >world-wide license? Do you have a central machine for this somewhere 
> >around?
> >
> >Quoting Walt Minkel <wminkel at latticesemi.com>:
> >
> >>Hi Reuti,
> >>
> >>Thank you for deepening my understanding of the global consumables.
> >>
> >>Your description of what I am trying to do is correct.  To fine-tune my 
> >>need, I should add: in addition to licenses being consumed for different 
> >>tools, in one tool's case we have world-wide WAN licenses as well as 
> >>site licenses.  My goal was to use queue sequencing to look first at the 
> >>site license.  For example, a user could queue into "run_licA" without 
> >>needing to specify the consumable license "licA_us".
> >>
> >>Suggestions are welcome, but based on your comments, I think I will look 
> >>for any available license (site+WAN) and have my execution script sort 
> >>out which to use.
> >>
> >>     -Walt
> >>
> >>Reuti wrote:
> >>
> >>>Hi there,
> >>>
> >>>what you observe is the correct behavior of SGE. The complex "licA_us" 
> >>>isn't requestable, and so it is attached to all jobs you submit to SGE. 
> >>>You have two in total, so you can only run two jobs of any type in the 
> >>>whole cluster. Where is the connection between "licA_us" and "run_licA" 
> >>>for now?
> >>>
> >>>If I understand you in the right way, you want on the one hand a global 
> >>>limit which is two, and want to specify at the same time the nodes on 
> >>>which this type of job may run at all.
> >>>
> >>>You can achieve this when you make the complex requestable; then the 
> >>>user can request it for this type of job. But with the current setup of 
> >>>a second complex "run_licA", you would have to request both resources 
> >>>to get the desired behavior.
> >>>
> >>>1. way)
> >>>
> >>>It's easier when you disregard "run_licA" completely and attach 
> >>>"licA_us" also to the nodes on which the jobs may run, set to the 
> >>>number of CPUs in these machines. The request of the job has to fulfill 
> >>>both restrictions.
> >>>
> >
> >Here I have to correct myself: for nodes not eligible for this type of 
> >job, it must also be defined and set to 0.
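
The first way, including this correction, could look like the following 
sketch (host names and CPU counts are illustrative):

```shell
# Attach "licA_us" to every execution host: the CPU count on eligible
# machines, 0 everywhere else, plus the global limit of 2.
qconf -mattr exechost complex_values licA_us=2 global
qconf -mattr exechost complex_values licA_us=4 eligible-host1
qconf -mattr exechost complex_values licA_us=4 eligible-host2
qconf -mattr exechost complex_values licA_us=0 other-host
```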
> >
> >Cheers - Reuti
> >
> >  
> >
> >>>2. way)
> >>>
> >>>Make one cluster queue for this type of job and set up a hostgroup with 
> >>>the eligible machines. For this queue set "licA_us" to the number of 
> >>>CPUs, and in all other queues set it to 0. Also here, no "run_licA" is 
> >>>needed.
> >>>
> >>>In both cases, the user has to specify only the request of the resource 
> >>>"licA_us".
> >>>
> >>>Cheers - Reuti
> >>>
> >>>Quoting Walt Minkel <wminkel at latticesemi.com>:
> >>>
> >>>>Hi All,
> >>>>
> >>>>I am trying to use global consumable resources to manage a variety of 
> >>>>licenses using SGE6.0u3.  I'm not seeing the behavior I expect...  As 
> >>>>soon as I modify the global execution host consumable form in qmon, 
> >>>>every queue in my grid is forced to find that resource available 
> >>>>before a job is run.  In some cases I only have two licenses.  
> >>>>Defining a global resource for these two licenses limits the total 
> >>>>number of jobs capable of running at one time to two for the entire 
> >>>>grid.
> >>>>
> >>>>My complex looks like this:
> >>>>#name      shortcut  type  relop  requestable  consumable  default  urgency
> >>>>lic_bv_us  licA_us   INT   <=     NO           YES         1        0
> >>>>
> >>>>I am using the following to execute my script:
> >>>>
> >>>>qsub -l run_licA=1 myScript.csh
> >>>>
> >>>>("run_licA" is a complex I use to direct where the job can run.)
> >>>>
> >>>>Queuing more than  two jobs (total for all queues), the pending jobs 
> >>>>show this message:
> >>>>
> >>>>       (-l run_licA)  cannot run globally because for default request 
> >>>>it offers only gc:licA_us=0.0000
> >>>>
> >>>>Maybe I'm missing something, or my approach is wrong.  Any suggestions 
> >>>>would be greatly
> >>>>appreciated.
> >>>>
> >>>>Thanks,
> >>>> Walt
> >>>>
> >>>>
> >>>>---------------------------------------------------------------------
> >>>>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> >>>>For additional commands, e-mail: users-help at gridengine.sunsource.net
> >>>>
> >>>
> >
> 
> 






