[GE users] CPU time limit

Reuti reuti at staff.uni-marburg.de
Wed May 24 10:29:10 BST 2006

On 24.05.2006 at 11:01, Rui Ramos wrote:

>  Hi,
>  thanks for the quick reply
> On Wed, 24 May 2006 10:40:26 +0200
> Reuti <reuti at staff.uni-marburg.de> wrote:
>> Hi,
>> On 24.05.2006 at 10:35, Rui Ramos wrote:
>>>  Hi there,
>>>  I'm setting CPU time limits in a queue. But I don't want the limit
>>> to kill the job, only to suspend it. Is there a way to change the
>>> signal?
>> If you did this, SGE would never get the chance to unsuspend the
>> job, because from SGE's point of view the lifetime of the job is
>> already over.
>> What setup do you wish to have? Maybe a second subordinated queue,
>> which will suspend the jobs therein, would be the more appropriate
>> setup.
>  I wish to do the following.
>   all.q
>    |- short.q
>    |- medium.q
>    |- long.q
>  The short and medium have different CPU time limits.
>  When a job reaches its limit, suspend it and migrate it to a lower
>  queue.
>  Also set different priorities in each of the queues. (not done yet!)

To get this, you would need to use the checkpointing support in SGE.
In this case, a suspend can trigger a reschedule of the job to a
different queue. But you would need to define the new queue in a
qalter statement before the job gets suspended, and you would also
need some checkpointing in your application, as the job will
otherwise start from the beginning again. SGE supports checkpointing
if the application supports it, but does not provide any
checkpointing library for your application to use.
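
As a rough illustration of what "checkpointing in your application"
means here, the following is a minimal sketch in Python. It is not
part of SGE; the file name, the state layout and the stop_after
parameter (which only simulates an interruption) are all assumptions
made for the example. The idea is simply that the job saves its
progress after each unit of work and picks it up again when it is
restarted.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as fh:
            return json.load(fh)
    return {"next_step": 0, "total": 0}


def save_checkpoint(path, state):
    """Write the state to a temp file, then rename it over the old one,
    so an interruption in mid-write cannot leave a corrupt checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump(state, fh)
    os.replace(tmp, path)


def run(path, n_steps, stop_after=None):
    """Do n_steps units of toy work, resuming from the checkpoint.

    stop_after is a hypothetical knob that simulates the job being
    interrupted after that many units in this invocation.
    Returns (state, finished).
    """
    state = load_checkpoint(path)
    done_now = 0
    while state["next_step"] < n_steps:
        if stop_after is not None and done_now >= stop_after:
            return state, False
        state["total"] += state["next_step"]  # the "work": sum the step numbers
        state["next_step"] += 1
        save_checkpoint(path, state)
        done_now += 1
    return state, True
```

A job restarted in another queue would call run() with the same
checkpoint path and continue from the last saved step instead of
from step 0. The checkpoint file has to live on a filesystem that
both execution hosts can see.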

A feature whereby a job in short.q is suspended, moved to medium.q,
and then unsuspended inside medium.q is not available.

In addition, suspending a job after a certain time is not
implemented. But it might be possible to get this if you a) submit
the job with -notify, or b) use a soft limit for the queue, so that
the job gets warned before it is killed, and then take some
appropriate action in your job script on your own. There the job has
to suspend itself, so that the scenario described above will work.
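
A minimal sketch of the "get warned and act on your own" part, again
in Python and under some assumptions: as far as I recall from the
qsub and queue_conf man pages, -notify delivers SIGUSR1 ahead of a
SIGSTOP and SIGUSR2 ahead of a SIGKILL, and a soft CPU limit (s_cpu)
is announced with SIGXCPU; please check this for your version. The
names work_loop, do_unit and on_stop are hypothetical. What on_stop
actually does (write a checkpoint, arrange for the job to be
suspended, etc.) is up to your job.

```python
import signal

# Set by the handler; the main loop polls it between units of work.
warned = {"signum": None}


def on_warning(signum, frame):
    """Only record which warning arrived; the real work happens
    later in the main loop, outside the signal handler."""
    warned["signum"] = signum


def install_handlers():
    """Catch the warning signals assumed above (verify for your setup)."""
    for sig in (signal.SIGUSR1, signal.SIGUSR2, signal.SIGXCPU):
        signal.signal(sig, on_warning)


def work_loop(units, do_unit, on_stop):
    """Run do_unit(i) for each unit until a warning has been seen.

    When a warning is pending, call on_stop(i) with the index of the
    first unit that was NOT run, and return that index.
    Returns units if all the work completed.
    """
    for i in range(units):
        if warned["signum"] is not None:
            on_stop(i)
            return i
        do_unit(i)
    return units
```

Keeping the handler down to setting a flag means the job only stops
at a unit boundary, where its state is consistent enough to save.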

>  I've set short.q, medium.q and long.q as subordinated queues of
>  all.q. Is that what you suggested?

The subordinate feature just suspends the queue, along with the jobs
inside it, when the superordinated queue gets full, and it continues
(i.e. unsuspends) the queue when the superordinated queue gets empty
again.

-- Reuti

>  Regards
> -- 
> ============================================
>  Rui Manuel dos Santos Ramos
>  Instituto de Recursos e Iniciativas Comuns
>  Praca Gomes Teixeira, 4099-002 Porto, Portugal
>  phone : +351 223 401 571
>  e-mail: rramos[at]iric.up.pt
>     web: http://ruiramos.homeip.net
> ============================================
