[GE users] controlling jobs on failed nodes

snosov serge.nosov2 at gmail.com
Thu Aug 13 00:10:06 BST 2009


Hi,

Is there a way to make GE terminate or reschedule a job when the node it is running on stops responding for a specified period of time? In my current setup with 6.1u5, if a node goes down in the middle of running a job, the job stays in the "running" state forever.

Thank you,
Serge.


