[GE users] Resource Reservation Issue

Reuti reuti at staff.uni-marburg.de
Tue Sep 16 12:52:37 BST 2008


Hi Mat,

On 14.09.2008 at 12:21, Bradford, Matthew wrote:

> Reuti,
>
> No, we used qalter once the jobs were running; we manually reduced
> their h_rt to see whether this changes which nodes the scheduler
> thinks will become available first, and therefore uses those in the
> reservation for the reserving job.
>
> I think my understanding of the use of h_rt was wrong. If I now
> understand correctly, it is part of the request stating that the job
> requires this much run time rather than an attribute of the job itself.
> The scheduler will then use the h_rt value to select the appropriate
> queue. Once the job has started, I assume that altering this value has
> no effect.

Exactly. It is also set as a ulimit for the process, hence it can't
be changed later on.
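
For illustration, something along these lines (the job script name and
the job id are just placeholders): h_rt is part of the resource request
at submission time, and qalter can only change it while the job is
still pending:

   # request a hard run-time limit of 2 hours at submission
   qsub -l h_rt=2:00:00 job.sh

   # lower the requested limit while job 12345 is still pending
   qalter -l h_rt=1:00:00 12345

Once the job is running, the limit is already imposed on its processes,
so a later qalter won't shorten or extend it.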

> Maybe that isn't the best way of testing this. We have tried a similar
> test, with a cluster full of jobs and a pending job requesting a
> resource reservation of 4 slots. Even after killing some of the
> running jobs in the cluster to free up 4 slots, the scheduler still
> doesn't alter the reservation to use them, and jobs lower down in the
> pending queue can jump in and start using the newly available slots.
>
> Basically, we have users complaining that their jobs, which are
> sitting at the top of the pending queue with reservations set, are
> not the next jobs to execute.

Jobs with a reservation are not by default considered to be the most
important ones - they are just in the queue like any other job. When
they are the next ones to be executed, the reservation should start.
You might want to add a complex with an urgency value for this type of
job to push it up in the pending list.
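
As a sketch (the complex name "important" and the urgency value 1000
are only examples, adjust to taste):

   # qconf -mc, add a line like:
   #name       shortcut  type  relop  requestable  consumable  default  urgency
   important   imp       BOOL  ==     YES          NO          0        1000

   # submit the high-priority job requesting it, together with -R y:
   qsub -R y -l important=1 job.sh

The urgency of the requested complex is added to the job's priority, so
such jobs move towards the top of the pending list (provided
weight_urgency in the scheduler configuration is not zero).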

-- Reuti


> Any thoughts would be most helpful.
>
> Cheers,
>
> Mat
>
>
>> -----Original Message-----
>> From: Reuti [mailto:reuti at staff.uni-marburg.de]
>> Sent: 12 September 2008 23:02
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] Resource Reservation Issue
>>
>> Hi,
>>
>> On 11.09.2008 at 13:21, Bradford, Matthew wrote:
>>
>>> We are running SGE 6.0u8 and have a problem with how resource
>>> reservation works.
>>>
>>> For example:
>>>
>>> We have a full cluster running mainly parallel jobs of various sizes
>>> from 1 node to 16 nodes.
>>> We allow only 2 jobs to be run with a Reserve flag on them.
>>> Users aren't specifying an h_rt, and the default runtime is set to
>>> 480 hours.
>>> Occasionally, we set important jobs with a reserve flag to prevent
>>> resource starvation, and we push the job to the top of the pending
>>> queue.
>>>
>>> What we appear to get when we switch on monitoring is the scheduler
>>> selecting the nodes for the reserved job, but then not amending that
>>> selection even when there are free nodes in the system.
>>>
>>> During testing of this we noticed no difference when we explicitly
>>> set the h_rt to 480 hours for all jobs, and then reduced the h_rt for
>>> specific jobs. We thought that the scheduler would recalculate and
>>> select nodes where it knows that jobs are nearly finished.
>>>
>> What do you mean by "reducing" - the waiting jobs? With qalter
>> while they are waiting?
>>
>> -- Reuti
>>


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



