[GE users] consumable license complexes and preemption
reuti at staff.uni-marburg.de
Wed Nov 17 13:15:35 GMT 2010
On 16.11.2010 at 16:55, jtseng_sf wrote:
> More refinement:
> A variation of the workaround would be to submit a dummy job that does
> not directly go into a master/subordinate queue.
> Rather the dummy job would just simply do a qmod suspension of the job.
> When the master job finishes, it would qmod resume the job.
> This would simplify the queue configuration and remove the one-to-one
> master/slave job mapping that would need to happen inside of SGE.
> (I'm not sure slotwise preemption would work here - it's not clear to me
> how to specify which "slot" is chosen for preemption)
This would be like the `qforce` I suggested some time ago: free whatever resource is necessary to run this job *now*.
In your implementation, maybe a forced advance reservation would do. The reservation suspends the running job(s), but consumes nothing itself beyond reserving the resources. You can then submit the job into this AR. To make this easier: I filed an RFE some time ago for a one-shot AR.
Of course, for now you have the same problem as with a normal job: the AR wouldn't be scheduled because of missing resources. Still, I find an AR nicer than a dummy job.
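For illustration, the qmod-based dummy job from the quoted message above could look roughly like the sketch below. It relies only on `qmod -sj` (suspend job) and `qmod -usj` (unsuspend job). The function names, the `master_done` callback and the polling interval are assumptions made for this example; how to detect that the master job has finished is left to the caller.

```python
import subprocess
import time


def qmod_suspend_cmd(job_id):
    # `qmod -sj <job_id>` suspends a running job.
    return ["qmod", "-sj", str(job_id)]


def qmod_resume_cmd(job_id):
    # `qmod -usj <job_id>` unsuspends a suspended job.
    return ["qmod", "-usj", str(job_id)]


def dummy_job(slave_id, master_done, run=subprocess.check_call,
              poll_seconds=30, sleep=time.sleep):
    """Suspend the slave job, wait for the master job, then resume the slave.

    master_done is a hypothetical callable returning True once the master
    job has finished; run and sleep are injectable so the logic can be
    tried out without a live cluster.
    """
    run(qmod_suspend_cmd(slave_id))
    try:
        while not master_done():
            sleep(poll_seconds)
    finally:
        # Resume the slave even if the wait loop is interrupted.
        run(qmod_resume_cmd(slave_id))
```

The `finally` clause is there so that the slave job is not left suspended if the dummy job itself gets killed or fails while waiting.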
> A second workaround is to have an external load sensor do the suspen
> On 11/16/2010 7:43 AM, John Tseng wrote:
>> Hi Reuti, your comments are very valid.
>> As both sgenedharvey and yourself have pointed out, the preemption of
>> licenses (as opposed to machines) is not a "simple" operation.
>> The "dont count certain consumables when a job is suspended" patch is
>> only a building block to get to a "look-ahead" like feature.
>> In the past, I've artificially increased the number of licenses to allow
>> a "master" job to subordinate/preempt a "slave" job.
>> This requires a lot of complexity to make sure that the "one" master job
>> preempts the one slave job and not to allow a rogue job through.
>> This can be done using per host per cpu slot "host queues" and using
>> queue thresholds to open/close specific host queues.
>> However, the complexity is enormous.
>> I'd like to simplify.
>> One workaround is to have a "dummy" job actually do the "slave"
>> subordination instead of the actual "master" job.
>> An external load sensor would determine when to allow
>> subordination/preemption and submit the dummy job.
>> Since the dummy job would cause the license to be freed, then the master
>> job can take the license.
>> The dummy job would quit after the master job has finished. The slave
>> job would then resume.
>> In this scenario, it is not necessary that the "master" job run on the
>> same machine as the "slave" job.
>> The only issue left is that SGE sees the "slave" job still consuming a
>> sgenedharvey points out the amount of complexity that can occur in
>> determining which "slave" job to subordinate, but that is left as an
>> exercise to the reader :)
>> If the "don't count certain consumables when a job is suspended" patch
>> is implemented, then the workaround is much more straightforward.
>> Future patches can build upon this.
>> Perhaps the scheduler can account for subordination like it does for
>> reservation - but I haven't reviewed the code nor do I understand the
>> code yet.
>> On 11/16/2010 3:47 AM, reuti wrote:
>>> On 16.11.2010 at 00:15, jtseng_sf wrote:
>>>> Hi Everyone, I'm thinking of patching 6.2u5 to allow certain
>>>> consumables to be NOT counted by SGE when the job is preempted.
>>> The problem is that your patch frees (or rather ignores) the consumables used by the subordinated job only *after* the new job has been dispatched to a node: it is this dispatch that suspends the job to be preempted, and only then is its resource consumption ignored.
>>> The real solution would be some kind of look-ahead feature: suspend a job first (even though no job is running in the superordinated queue yet) so that your patch gives its resources back, and dispatch the superordinated job to the node(s) only after all the resources it needs have been collected.
>>> -- Reuti
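The ordering problem described above can be illustrated with a toy model. This is not SGE's scheduler code; the job dictionaries, the `preemptible` flag and all function names are assumptions made for the example. It only shows that "don't count suspended jobs" does not help while the job to be preempted is still running, and that suspending first (the look-ahead) makes the dispatch possible.

```python
def free_licenses(total, jobs):
    # Patched accounting: only running jobs count, suspended ones are ignored.
    return total - sum(j["lic"] for j in jobs if j["state"] == "running")


def try_dispatch(total, jobs, need):
    # Plain dispatch: the new job starts only if enough licenses are free now.
    return free_licenses(total, jobs) >= need


def dispatch_with_lookahead(total, jobs, need):
    # Look-ahead: suspend preemptible jobs first, until enough licenses
    # are free, and only then decide whether the new job can start.
    for job in jobs:
        if free_licenses(total, jobs) >= need:
            break
        if job["state"] == "running" and job["preemptible"]:
            job["state"] = "suspended"
    return free_licenses(total, jobs) >= need


jobs = [{"lic": 1, "state": "running", "preemptible": True}]
blocked = not try_dispatch(1, jobs, need=1)        # slave still holds the license
started = dispatch_with_lookahead(1, jobs, need=1)  # slave suspended first
```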
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].