[GE users] Specifying max retry on a job?

Reuti reuti at staff.uni-marburg.de
Sat Aug 26 11:23:34 BST 2006


On 26.08.2006 at 10:30, Peter Bowmar wrote:

> Hi,
>       Short time listener, first time poster :) I'm using SGE 6 to
> render frames of animations, it works very well for that. I'm curious,
> since I can't find it mentioned in the docs or the Qmon interface, how
> do I specify a max limit on the number of times a job is re-run?
>       If I get a bad frame for some reason that exits with error code
> greater than 0, I exit with 99, which nicely triggers a re-run.
> However, in some cases, it will just keep crashing (bug in renderer, or
> my shaders or whatever) and I don't want it to keep re-running,
> essentially plugging up the queue with failing tasks.

What about:

qalter -v COUNT=$value $JOB_ID

before the exit 99. The variable COUNT (or any name you like) can then be
tested in the script. The execution nodes must be submit hosts in this
case, because qalter is called from inside the running job.
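As a rough sketch of how that could look inside the job script: only COUNT,
JOB_ID, qalter -v and the exit code 99 come from the above. MAX_RETRIES,
retry_decision, render_frame and run_job are made-up names for illustration,
and the sketch assumes one job per frame, so adapt it to your setup.

```shell
#!/bin/sh
# Sketch only: MAX_RETRIES and the function names are hypothetical,
# they are not SGE features.
MAX_RETRIES=3

# Prints "retry" while the attempt count is below the limit, else "give_up".
retry_decision() {
    count=$1
    max=$2
    if [ "$count" -lt "$max" ]; then
        echo retry
    else
        echo give_up
    fi
}

# Placeholder for the real render command; replace with your renderer call.
render_frame() {
    false
}

run_job() {
    count=${COUNT:-0}          # COUNT is unset on the first run
    render_frame && return 0
    if [ "$(retry_decision "$count" "$MAX_RETRIES")" = retry ]; then
        # Store the incremented counter in the job's environment for the next run.
        qalter -v COUNT=$((count + 1)) "$JOB_ID"
        return 99              # 99 asks SGE to re-run the job
    fi
    return 1                   # limit reached: fail normally, no re-run
}

# Only execute when running under SGE, where JOB_ID is set.
if [ -n "${JOB_ID:-}" ]; then
    run_job
    exit $?
fi
```

With MAX_RETRIES=3 the job would be re-run at most three times after the
first failure, then exit with 1 instead of 99 so it no longer blocks the queue.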

>       Also, if there is a better source of docs than the ones shipped
> with SGE (they're bare-bones to say the least :) please point me at
> them. This list seems to be the definitive source of knowledge but I
> don't want to keep posting questions that have answers elsewhere.

You can use the SUN docs:


  - Reuti
