[GE users] Can I stop backfilling?

Daniel Templeton Dan.Templeton at Sun.COM
Tue May 20 20:13:30 BST 2008


Depends on the size of your cluster and what your priorities are.

Daniel

Kevin Doman wrote:
> max_reservation = 30. Is this sensible?
>
>
> On Tue, May 20, 2008 at 10:25 AM, Reuti <reuti at staff.uni-marburg.de> wrote:
>   
>> Am 20.05.2008 um 17:15 schrieb Daniel Templeton:
>>
>>     
>>> If your jobs are starving, what you're seeing is not backfilling. :)  What
>>> version of SGE are you using?  There was (is?) a bug where the first RR job
>>> was ignored.  Submitting a second identical RR job, in that case, would then
>>> cause the scheduler to take notice and actually do the RR properly.
>>>
>>> By definition, backfilling cannot cause starvation, unless a backfilled
>>> job runs forever.
>>>       
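To make the backfilling definition above concrete, here is a minimal sketch of the admission test in Python. It is a toy model, not the actual SGE scheduler code: the names Job, estimated_runtime and can_backfill are hypothetical, times are in minutes, and it assumes a job without an h_rt request is estimated at default_duration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    name: str
    slots: int
    h_rt: Optional[float]  # requested hard run time limit in minutes, None if not requested


def estimated_runtime(job: Job, default_duration: float) -> float:
    # Assumption: without an h_rt request the scheduler falls back to default_duration.
    return job.h_rt if job.h_rt is not None else default_duration


def can_backfill(job: Job, now: float, free_slots: int,
                 reservation_start: float, default_duration: float) -> bool:
    # A job may be backfilled only if it fits into the currently free slots
    # and is expected to finish before the reserved job is due to start.
    if job.slots > free_slots:
        return False
    return now + estimated_runtime(job, default_duration) <= reservation_start


short = Job("short", slots=1, h_rt=20.0)
long_job = Job("long", slots=1, h_rt=45.0)
no_limit = Job("no_limit", slots=1, h_rt=None)

for job in (short, long_job, no_limit):
    print(job.name, can_backfill(job, now=0.0, free_slots=4,
                                 reservation_start=30.0, default_duration=10.0))
```

With a reservation starting in 30 minutes, the 20-minute job is admitted and the 45-minute job is not. The job without h_rt is admitted on the strength of the 10-minute estimate alone, whatever its real run time turns out to be.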
>> Good point. What h_rt is requested by these short jobs? Otherwise the
>> default_duration will be taken (but not enforced) and this might lead to a
>> roll-over from one extending job (running longer than the estimated default
>> 10 minutes) to the next one and so on, I fear.
>>
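The effect of an unenforced estimate can be illustrated with a small Python sketch. It is only a toy model under stated assumptions, not how the SGE scheduler is implemented: the function name planned_and_actual_start is hypothetical, times are in minutes, and each tuple describes one job occupying a slot the parallel job is waiting for.

```python
from typing import List, Optional, Tuple

# (start time, requested h_rt or None, actual run time), all in minutes
RunningJob = Tuple[float, Optional[float], float]


def planned_and_actual_start(running: List[RunningJob],
                             default_duration: float) -> Tuple[float, float]:
    # Planned start: when the scheduler expects the last needed slot to be free,
    # using h_rt if requested and default_duration otherwise.
    planned = max(start + (h_rt if h_rt is not None else default_duration)
                  for start, h_rt, _ in running)
    # Actual start: when the last needed slot really becomes free.
    actual = max(start + real for start, _, real in running)
    return planned, actual


# No h_rt requested: estimates come from a 10-minute default_duration.
without_limits = [(0.0, None, 20.0), (5.0, None, 15.0)]
# Same jobs, but each requests h_rt of 20 minutes.
with_limits = [(0.0, 20.0, 20.0), (5.0, 20.0, 15.0)]

print(planned_and_actual_start(without_limits, default_duration=10.0))
print(planned_and_actual_start(with_limits, default_duration=10.0))
```

In the first case the reservation is planned for minute 15 but the slots are only free at minute 20, so the parallel job slips. In the second case the plan (minute 25) is conservative and the slots are free on time.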
>>     
>>>  BTW, when you say your jobs run 15-20 minutes, are they setting soft or
>>> hard run time limits?  If not, what is your default_duration?
>>>
>>> Daniel
>>>
>>> Kevin Doman wrote:
>>>       
>>>> We have a very busy cluster that always has thousands of short jobs
>>>> (15-20 minutes) in the queue. Occasionally, a user comes in and submits a
>>>> 20-processor parallel job with h_rt=100 hours. While reservation is
>>>> enabled (-R y) and priority set to 1024, we continue to experience job
>>>>         
>> max_reservation is also set up to a sensible value?
>>
>> -- Reuti
>>
>>
>>     
>>>> backfills which resulted in the same 'parallel job starvation' issue.
>>>>
>>>> Is it possible for me to stop backfilling altogether and let the
>>>> parallel jobs go first?
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>>
>>>>




