[GE users] Please Review: Non-Multiplied Consumable Requests for Parallel Jobs

Reuti reuti at staff.uni-marburg.de
Thu Sep 25 13:55:56 BST 2008


Hi Andy,

On 25.09.2008 at 13:34, Andy Schwierskott wrote:

> Reuti,
>
>>> as Andy already mentioned in our Release Plan, we intend to change the
>>> multiplication-by-slots behavior for consumable requests with PE jobs.
>>> The spec for how we suggest addressing this issue is attached. Any
>>> feedback or comments are appreciated.
>>
>> this is great and sounds reasonable. It will cover a few RFEs in
>> Issuezilla at once.
>
> One RFE is not covered: exclusive node allocation. A host can accept
> multiple jobs to run, but a job requesting a host exclusively cannot run
> when there are other jobs already running on that host. As soon as an
> exclusive job is running, the host would not accept any other jobs.
>
> We've been discussing it, but time will likely be too limited to
> implement this for Urubu. The rule would be as follows:
>
> E  = exclusive job
> NE = non exclusive job
>
> job  host status     action
> ------------------------------------
> NE   empty           schedule
> NE   NE running      schedule
> NE   E running       do not schedule
> E    empty           schedule
> E    NE running      do not schedule
> E    E running       do not schedule
>
> To say it in other words:
>
>   if (NE)
>      if (E running)
>         do not schedule
>      else
>         schedule
>   else
>      if (empty)
>         schedule
>      else
>         do not schedule
>
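The rule table and pseudocode above can be written as a small predicate. This is only a sketch to make the six cases checkable, not scheduler code: the function name `can_schedule` and the representation of a host as a list of exclusivity flags are assumptions made for illustration.

```python
def can_schedule(job_exclusive, running_exclusive_flags):
    """Decide whether a job may be placed on a host.

    job_exclusive: True for an exclusive (E) job, False for NE.
    running_exclusive_flags: one bool per job already running on the
    host (True = E, False = NE). An empty list means an empty host.
    Both names are hypothetical, chosen for this sketch.
    """
    if not job_exclusive:
        # NE job: blocked only if an exclusive job holds the host.
        return not any(running_exclusive_flags)
    # E job: needs the host to be completely empty.
    return len(running_exclusive_flags) == 0
```

For example, `can_schedule(False, [False])` is the "NE / NE running" row and `can_schedule(True, [False])` is the "E / NE running" row.
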
> In addition, it should be possible to mark a host to accept exclusive
> jobs only, to avoid setting up a special queue for those hosts.
>
> In principle I think it should not be limited to one exclusive job but
> allow up to N jobs. Is there a realistic use case for that scenario?
>
>> One small question: someone on the mailing list mentioned that his
>> software needs one license per host, independent of the number of
>> processes/jobs running there (host-locked floating license). As far as I
>> understand the complex attached as a HOST consumable right now, it's per
>> job on a node. So to cover his request we would need another type like
>> HOSTLOCKED / HOSTONCE or so? The total amount could then be specified by
>> setting it in global.
>
> Do you think you can find the RFE in issuezilla?
>
> Well, if you manage this license as follows:
>
> global_host:   complex_values host-locked-float-lic=1
> <hosts>:       complex_values host-locked-float-lic=1
>
> at most 1 job could run in the whole cluster requesting the
> host-locked-float-lic license. Or do I misunderstand the use case?

I fear so - and in fact there are two flavours, as I now realize. Only
one was entered as an RFE.

a)

http://gridengine.sunsource.net/servlets/ReadMsg?list=users&msgNo=23664

http://gridengine.sunsource.net/issues/show_bug.cgi?id=1276 (at the end)

I'm not sure whether this should be covered by an RQS as suggested
(also by myself at that time) or by an entry in the suggested
complex extension, HOSTLOCKED (the latter would be more consistent,
as it means requesting a resource rather than submitting into a special
queue [which would be more PBS-style]).
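
To make flavour a) concrete, here is a rough sketch of how a HOSTLOCKED consumable might be accounted: the global count is debited once per host that runs at least one requesting job, no matter how many such jobs run there. The class `HostLockedPool` and its methods are hypothetical and only illustrate the intended semantics; nothing like this exists in SGE today.

```python
from collections import Counter


class HostLockedPool:
    """Hypothetical accounting for a HOSTLOCKED consumable."""

    def __init__(self, total):
        self.total = total              # licenses defined in global
        self.jobs_per_host = Counter()  # requesting jobs per host

    def in_use(self):
        # One license is held per host with at least one requesting job.
        return sum(1 for n in self.jobs_per_host.values() if n > 0)

    def try_start(self, host):
        # Only the first job on a host needs a fresh license.
        needs_license = self.jobs_per_host[host] == 0
        if needs_license and self.in_use() >= self.total:
            return False
        self.jobs_per_host[host] += 1
        return True

    def finish(self, host):
        if self.jobs_per_host[host] == 0:
            raise ValueError("no requesting job running on " + host)
        # The license is freed implicitly when the count drops to 0.
        self.jobs_per_host[host] -= 1
```

With `total=2`, any number of jobs can start on two hosts, while a job on a third host has to wait until one of the first two hosts has drained.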

b)

http://gridengine.sunsource.net/servlets/ReadMsg?listName=users&msgNo=11319

If a job is scheduled to the Mac nodes, it shouldn't use a floating
license. Hence this would need an entry "JOB#HOST", which means JOB
*or* HOST is considered, but not both for a job. The Mac nodes could
simply get a count like the number of cores installed (or the special
value "infinite", although it was already discussed that this makes no
sense for consumables in an RQS. Here it could additionally mean "don't
decrease the global count"). For the other nodes the floating
licenses should be decreased from the global amount as usual.
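
As a rough sketch of the "JOB#HOST" idea in b): a job is debited against the host's own count if the host defines one, and against the global count otherwise, never both. The function `debit` and the dict-based bookkeeping are assumptions for illustration only; `math.inf` stands in for the special value "infinite".

```python
import math


def debit(global_free, host_free, host):
    """Hypothetical JOB#HOST accounting for one job requesting 1 license.

    global_free: remaining floating licenses in the global pool.
    host_free:   dict host -> remaining count, only for hosts that
                 define their own value (e.g. the Mac nodes).
    Returns the updated (global_free, host_free), or None if the job
    cannot start on that host right now.
    """
    if host in host_free:
        # Host-level value present: the global count stays untouched.
        if host_free[host] < 1:
            return None
        updated = dict(host_free)
        updated[host] -= 1  # math.inf - 1 is still math.inf
        return global_free, updated
    # No host-level value: consume from the global pool as usual.
    if global_free < 1:
        return None
    return global_free - 1, dict(host_free)
```

A Mac node entered as `{"mac1": math.inf}` then never reduces the global count, while a job on any other node takes one floating license.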


-- Reuti
