[GE users] restart job

Shannon V. Davidson svdavidson at swbell.net
Mon Nov 22 20:21:13 GMT 2004


John,

Take a look at the reschedule_unknown parameter in the sge_conf(5) man page.
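
As a rough sketch of what that might look like, going by my reading of sge_conf(5): the parameter lives in the global cluster configuration (editable with "qconf -mconf") and takes an hh:mm:ss delay after which jobs on a host in unknown state are rescheduled. The five-minute value below is only an illustrative assumption, not a recommendation, so please check it against the man page shipped with your 5.3 installation.

```
# Entry in the global cluster configuration (open it with: qconf -mconf)
# Format is hh:mm:ss; 00:00:00 means jobs are never rescheduled.
# 00:05:00 is an assumed example value, tune it for your site.
reschedule_unknown   00:05:00
```

According to the man page, rescheduling is only initiated for jobs that have the rerun flag set (the '-r y' option of qsub, or the rerun setting in the queue configuration), and checkpointing jobs are only rescheduled if the "when" setting of their checkpointing environment allows it (see checkpoint(5)). So the '-r y' switch is still needed alongside this parameter.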

Shannon


John Sikorski wrote:

> I'm using SGE 5.3 and would like to configure it to restart jobs when
> an execution host goes down.  I'm not concerned about resuming where
> the job left off, so restarting from the beginning is fine.  On my
> current installation, if a node goes down, a job running on that node
> restarts only when the node itself restarts.  Is it possible to
> have the master detect when a node goes down and reschedule any running
> jobs?  This would have to work for both planned and unplanned
> outages, such as a node crash.  I tried using the '-r y' switch in
> qsub and setting up checkpointing, but neither seemed to make a difference.
>
> Thanks
>
> John


-- 
___________________________________________

Shannon V. Davidson <svdavidson at swbell.net>
Senior Software Engineer           Raytheon
636-479-7465 office        443-383-0331 fax
___________________________________________




