[GE users] Releasing requested consumable resources before job completion

Reuti reuti at staff.uni-marburg.de
Mon Sep 1 13:44:36 BST 2008


On 30.08.2008 at 19:52, Dioktos wrote:

> On Sat, Aug 30, 2008 at 5:10 PM, Reuti <reuti at staff.uni-marburg.de> wrote:
>> Hi,
>>
>> On 30.08.2008 at 16:45, Dioktos <dioktos at gmail.com> wrote:
>>
>>> I'm trying to find a way to effectively request the use of a CPU for a
>>> job and release it before the job terminates.
>>>
>>> We have two types of jobs: database and disk I/O heavy jobs with very
>>> low CPU usage (type 1), and jobs that use 1 cpu heavily for a while,
>>> then drop down to low CPU usage for a time (type 2). What I want to
>>> control is the number of type 2 jobs running in the CPU-heavy stage.
>>> If a job could release requested consumable resources, then each type
>>> 2 job would request 1 cpu_thing and release it when no longer needed;
>>> each host could have as many cpu_things as it had CPUs, and (roughly)
>>> twice as many slots as CPUs, and every host would be fully used, both
>>> CPU- and I/O-wise. However, jobs can't release resources.
>>>
>>> load_short or custom load sensors such as idle CPU counts don't work
>>> for two reasons: first, the initial distribution of jobs fills all
>>> slots regardless of load thresholds, and then the hosts are
>>> "overloaded"; second, using suspend thresholds means that hosts will
>>> accumulate suspended jobs that could have waited and been deployed on
>>> free hosts.
>>>
>>> Is there an effective workaround to this problem?
>>
>> Did you already try job_load_adjustments in the scheduler
>> configuration?
>
> Ah, no. I missed that. I suppose that's generally what I'm looking for
> in this case. Thank you.
>
> ... After a bit of experimentation, I find that I need to use
> load_short (or something even faster) to get a decent response time.
>
> It's still not what I'm really after, though, because I want to be
> able to submit more low-CPU jobs and not have them bump the load
> adjustment.

But you could bump up some other complex instead, one which you assign
only to the jobs with the high resource request at the beginning.
Hence there are virtually fewer available. The bumped-up complex can
then be used to put the queue for this type of job into "alarm state",
so no further jobs of this type would go to this node until the value
is above a certain level again (whether load_thresholds is triggered
by "below" or "above" depends on the relation defined for the
complex).

Normal jobs won't be affected this way, as they run in a different
queue and don't request this special resource.
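
To make the bookkeeping concrete, here is a minimal Python sketch of
the idea. It is a toy model, not Grid Engine's scheduler code. It
assumes that each start of a CPU-heavy job bumps a per-host value by a
fixed amount, that the bump decays linearly to zero over a fixed time,
and that the queue for these jobs counts as being in alarm state while
the bumped value is at or above a threshold. The class name HostModel
and all numbers are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HostModel:
    """Toy model of one execution host (hypothetical, not SGE internals)."""

    threshold: float            # assumed alarm threshold for the bumped complex
    adjustment: float = 1.0     # assumed bump added per CPU-heavy job start
    decay_time: float = 300.0   # assumed seconds until a bump has decayed to 0
    starts: List[float] = field(default_factory=list)  # heavy-job start times

    def start_heavy_job(self, now: float) -> None:
        """Record the dispatch of a CPU-heavy job at time `now` (seconds)."""
        self.starts.append(now)

    def virtual_value(self, now: float) -> float:
        """Sum of all bumps, each decayed linearly from 100% to 0%."""
        total = 0.0
        for t0 in self.starts:
            age = now - t0
            if 0.0 <= age < self.decay_time:
                total += self.adjustment * (1.0 - age / self.decay_time)
        return total

    def in_alarm(self, now: float) -> bool:
        """Alarm while the bumped value is at or above the threshold.

        This assumes a complex for which "above" is the overload
        direction; with the opposite relation the comparison flips.
        """
        return self.virtual_value(now) >= self.threshold


if __name__ == "__main__":
    host = HostModel(threshold=2.0)
    host.start_heavy_job(0.0)
    host.start_heavy_job(0.0)
    for t in (0.0, 150.0, 300.0):
        print(t, host.virtual_value(t), host.in_alarm(t))
```

Low-CPU jobs never call start_heavy_job in this model, which mirrors
the point that they do not bump the complex and so are not held back
by it.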

-- Reuti


> Still, thanks for your help!
>
> Cheers,
> Dioktos
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net

