[GE users] FW: my sge system is not working with the fault tolerance

reuti reuti at staff.uni-marburg.de
Mon Mar 2 10:45:11 GMT 2009


On 02.03.2009 at 00:53, tamara wrote:

> Did you submit the jobs as rerunnable? Either with "qsub -r y" or by
> setting it in the queue configuration?
> -- Reuti
> Thanks to you, the jobs are now being rescheduled, but I still have
> two questions.
> 1- How can I make jobs rerunnable from the queue configuration, so
> that I don't have to type "qsub -r y" whenever I submit jobs?

You could either define it in $SGE_ROOT/default/common/sge_request or
set the entry "rerun TRUE" in the queue definition.
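
As a rough sketch of both options (the queue name all.q is only an
assumed example, and the cell name "default" is taken from the path
above): the default request file holds qsub options, so a line like
the one below makes submitted jobs rerunnable unless the user
overrides it on the command line.

```text
# $SGE_ROOT/default/common/sge_request
# cluster-wide default qsub options (sketch, adjust to your site)
-r y
```

For the queue variant, "qconf -mq all.q" opens the queue definition in
an editor, where the rerun entry can be changed from FALSE to TRUE.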

> 2- When the exec host is disconnected, the job is not rescheduled
> automatically; I have to press the Reschedule button to make it
> happen. How can I have it rescheduled automatically?

This should work with the above setting and proper entries for  
reschedule_unknown in SGE's configuration.
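
A minimal sketch of such an entry, assuming a five-minute timeout (the
value is only an illustration, not a recommendation):
reschedule_unknown takes a time in hh:mm:ss format and can be edited
in the global configuration with "qconf -mconf". A value of 00:00:00
leaves automatic rescheduling switched off.

```text
# assumed example value; this annotation line is not part of the
# configuration itself
reschedule_unknown           00:05:00
```

With a non-zero value, rerunnable jobs on a host that stays in the
unknown state for longer than the given time should be rescheduled
without pressing the button.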

> Also, where can I find all this information?

This is a third question ;-)

man sge_conf (also -r y is mentioned there)

-- Reuti

