[GE users] consumable license complexes and preemption

sgenedharvey sge at nedharvey.com
Wed Nov 17 14:12:30 GMT 2010


> From: reuti [mailto:reuti at staff.uni-marburg.de]
> 
> maybe a forced advance reservation would do. The
> resource reservation suspends the running job(s), but doesn't consume

An advance reservation ... not a bad idea ...

I talked earlier about the complexities of resuming a suspended job:
making sure it only resumes at the right time ... not before its priority
allows, and not stuck waiting behind things of lower priority, etc.

Perhaps you could clone the resource requirements of the suspended job, and
submit a job just like it, which is only a "resume" job.  The job in the
queue will then obtain the necessary resources at the right time, at which
point the suspended job resumes and the "resume" job disappears.  Perfect.
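
To make the idea concrete, here is a minimal sketch in Python.  It is only a
toy data model, not anything Grid Engine provides: the names Job,
make_resume_job, and on_dispatch are made up for illustration, and I'm
assuming each job needs exactly one unit of every resource it lists.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Job:
    """Hypothetical job record; not a Grid Engine structure."""
    name: str
    priority: int
    host: str
    resources: Tuple[str, ...]      # one unit of each named consumable
    resumes: Optional[str] = None   # set only on a "resume" placeholder


def make_resume_job(suspended: Job) -> Job:
    """Clone priority, host, and resources into a placeholder job."""
    return replace(suspended,
                   name="resume-" + suspended.name,
                   resumes=suspended.name)


def on_dispatch(placeholder: Job, suspended_jobs: Dict[str, Job]) -> Job:
    """When the placeholder is granted its resources, hand them to the
    real job: return the job to resume and forget the placeholder."""
    return suspended_jobs.pop(placeholder.resumes)
```

The point of cloning rather than inventing new requirements is that the
placeholder competes in the queue exactly as the original job would have.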

The only problem I can think of is ... 

Suppose you've got a medium-priority job running on SystemX, which consumes
ResourceA and ResourceB.  You suspend it in order to make way for a high
priority job.  You create a medium-priority "resume" job which requires a
slot on SystemX, one ResourceA, and one ResourceB.  You queue up a million
low-priority jobs that require ResourceA, and another million low-priority
jobs that require ResourceB.  Now the problem is ... the "resume" cannot
happen until there is a coincidence: SystemX, ResourceA, and ResourceB must
all be available in the same dispatch interval.  If these resources are
freed up one at a time ... then the low-priority jobs will keep grabbing
whichever one is available, preventing the coincidence of all three ... and
thus preventing the medium-priority resume from taking place.
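
The scenario above can be played out in a toy simulation.  This is a sketch
under loud assumptions, not a model of the real scheduler: one unit of each
resource, one dispatch pass per tick, every low-priority job runs for
exactly 2 ticks, and ResourceA and ResourceB first come free one tick apart.
The function name ticks_until_resume is made up.

```python
def ticks_until_resume(ticks, low_jobs_per_resource):
    """Return the tick at which the "resume" job starts, or None if it
    never gets all three resources within `ticks` dispatch passes."""
    # Tick at which each resource next becomes free (assumed start state).
    busy_until = {"SystemX": 0, "ResourceA": 1, "ResourceB": 2}
    low_left = {"ResourceA": low_jobs_per_resource,
                "ResourceB": low_jobs_per_resource}
    for t in range(ticks):
        free = {r for r, until in busy_until.items() if until <= t}
        # The medium-priority resume job is considered first, but it can
        # only start when all three resources are free at the same time.
        if free == set(busy_until):
            return t
        # Otherwise low-priority jobs grab whatever single unit is free.
        for r in ("ResourceA", "ResourceB"):
            if r in free and low_left[r] > 0:
                low_left[r] -= 1
                busy_until[r] = t + 2
    return None


print(ticks_until_resume(1000, 1_000_000))  # None: starved
print(ticks_until_resume(1000, 5))          # 12: only once the queue drains
```

Under these assumptions the resume job starts only after the low-priority
queue is empty, even though it is always looked at first.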

But let's not merge two separate problems.  Even today, if you have a
medium-priority job requiring 2 different resources, and a million
low-priority jobs which each consume only one ... the low-priority jobs will
also prevent the medium-priority job from running.  So that issue is really
separate and independent from the idea of resuming jobs.  It's already
present; it always has been; I don't hear anybody complaining about it.

But I still agree, "reserving" resources is a good idea.  Although nobody's
complaining about the above resource contention problem yet ... suspending &
resuming jobs could be the ingredient which tips the scales.
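
Here is a rough Python sketch of why reserving helps.  It is a toy model
with made-up names (resume_tick, the reserve flag), not Grid Engine's actual
reservation logic.  Assumptions: one unit each of SystemX, ResourceA, and
ResourceB; they first come free at ticks 0, 1, and 2; each low-priority job
holds its one resource for 2 ticks.

```python
def resume_tick(reserve, low_jobs_per_resource, ticks=1000):
    """Tick at which the medium-priority job gets all it needs, or None."""
    needed = ("SystemX", "ResourceA", "ResourceB")
    busy_until = {"SystemX": 0, "ResourceA": 1, "ResourceB": 2}
    low_left = {"ResourceA": low_jobs_per_resource,
                "ResourceB": low_jobs_per_resource}
    for t in range(ticks):
        free = [r for r in needed if busy_until[r] <= t]
        if len(free) == len(needed):
            return t
        if reserve:
            # Freed units are held back for the medium-priority job, so
            # they accumulate instead of being handed out one at a time.
            continue
        for r in free:
            if low_left.get(r, 0) > 0:
                low_left[r] -= 1
                busy_until[r] = t + 2
    return None


print(resume_tick(reserve=True, low_jobs_per_resource=1_000_000))   # 2
print(resume_tick(reserve=False, low_jobs_per_resource=1_000_000))  # None
```

The cost is visible in the model too: with reserve=True, ResourceA sits idle
for a tick while waiting for ResourceB, which is the usual trade-off between
utilization and letting the bigger job start.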

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=296402
