[GE users] Restarting sge_execd does not clear hung job status
michael.coffman at avagotech.com
Thu Oct 7 14:10:55 BST 2010
On Thu, Oct 7, 2010 at 3:45 AM, reuti <reuti at staff.uni-marburg.de> wrote:
On 07.10.2010 at 00:08, coffman wrote:
> I recently moved from 6.0u8 to 6.2u5 and am noticing a change in behavior that I could use some help with. On the previous version, we would occasionally have a grid system hang in such a way that it needed to be rebooted. When this happened, the information for the job running on that system was cleared from the scheduler.
> Version 6.2u5 does not behave the same way. The system running a particular job has been rebooted, so the job is definitely no longer running. When the system comes back up, sge_execd is started on the exec host, but qstat still shows the job as running on the host that was rebooted. Any clues as to why it does not get cleaned up?
Is the (local) spool directory of the node removed when the node is rebooted?
Yes. All that is left is the following: