[GE users] stupid software queues for license, was: weird "orders queue version is not uptodate" messages in qmaster log

Xavier MACHENAUD xavier.machenaud at st.com
Fri Mar 11 08:40:37 GMT 2005


I was able to make some progress on my problem.

Multiple calls to "qconf -mattr exechost complex_values lic_XXX=<value>" 
in a short amount of time produce the qmaster messages (orders 
queue version is not uptodate) and lead, after a while, to a blocked 
situation where no jobs are dispatched anymore.

I made the following changes:
   * use one script to set all lic_XXX complexes instead of multiple 
scripts (one per vendor), to avoid parallel calls that set complex 
values.
   * add a sleep between two consecutive calls that set complex values. 
My guess is that this gives time for any triggered event to be dispatched.
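
For illustration, here is a minimal Python sketch of the two changes above: 
one script that sets every lic_XXX complex, strictly one call at a time, 
with a pause between calls. It is not my actual script. The function names, 
the 5 second delay and the "global" host argument are assumptions made for 
the example; adjust them to your setup.

```python
import subprocess
import time


def build_qconf_command(complex_name, value, host="global"):
    """Build the qconf call that sets one license complex on an exec host.

    The host name "global" is an assumption for this sketch.
    """
    return ["qconf", "-mattr", "exechost", "complex_values",
            f"{complex_name}={value}", host]


def set_license_complexes(values, host="global", delay_seconds=5.0,
                          run=subprocess.run, sleep=time.sleep):
    """Set all license complexes serially, sleeping between two calls.

    values maps complex names (e.g. "lic_XXX") to the value to set.
    run and sleep are injectable so the logic can be tried without qconf.
    """
    commands = []
    for index, (name, value) in enumerate(sorted(values.items())):
        if index > 0:
            # Pause so qmaster can process the previous change
            # before the next one arrives.
            sleep(delay_seconds)
        command = build_qconf_command(name, value, host)
        run(command, check=True)  # raises if qconf returns an error
        commands.append(command)
    return commands
```

The point of the design is that a single process issues all the calls, so 
two qconf invocations can never overlap, and the delay is easy to tune.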

Before the changes, "orders queue version is not uptodate" messages were 
added to the qmaster log every minute.
After the changes, the message is added only every couple of hours. So 
far, my grid hasn't been blocked.

About your idea of using Flexlm reservations: I tried it, but my idea was 
to reserve a minimum set of licences and to be able to consume more if 
some are available. But Flexlm doesn't properly handle 
reservations of queueable licences if you are consuming more than the 
reserved amount. In such a case, queued licence requests 
will ALWAYS request non-reserved licences. I just love Flexlm :-(

Regarding the consumable+wrapper idea, the complex is already 
consumable and I can't enforce running all licence requests through SGE. 
SGE contains the production tools available to the company's 
designers, while lots of developers need access to licences 
from their own environment (which is usually not yet available in the 
production environment).

Xavier

Magnus Söderberg wrote:

> Xavier MACHENAUD wrote:
>
>> There are a few reasons I proceed this way:
>>  * The code which calls the licensed tool is embedded in in-house 
>> developed code which has been through a validation cycle, and I don't 
>> want to change the code.
>>  * The code is also part of a makefile and thus submitted using 
>> qrsh. As far as I know, qrsh doesn't rerun jobs that do an exit 99.
>>
>> Xavier
>>
> Hmmm, I see your point.
> I still think you will have a hard time fixing it with SGE only though.
> Flexlm allows you to reserve licenses for particular users or groups 
> of users. If you have a reasonable amount of licenses, reserving some 
> for SGE and some for non-SGE work could perhaps work. Then you could 
> tie a consumable complex to the SGE-reserved ones. Judging from the 
> domain in your email you ought to have lots of licenses....
> Or perhaps change that complex into a consumable and make sure that 
> all requests for that particular software are run through SGE, perhaps 
> through a wrapper script.
>
> /Magnus
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>


