[GE users] powered off nodes and SGE

kisielk kamil at kamilkisiel.net
Thu Aug 12 19:04:52 BST 2010

> > 2) When a node is powered off, does the scheduler ignore that node or still schedule jobs on it?
> No, it won't schedule any job to it. OTOH, when a node shuts down while it's running a job, SGE can be configured to reschedule the job to a different node to restart from the beginning.
> -- Reuti

Can you give an example of how to set this up? 

I've had problems getting SGE to handle jobs on nodes that go down. The queues go into the "au" state but the jobs remain there. Sometimes after the node reboots, SGE still reports the jobs as running in those queues even though they clearly are not.
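
To make the symptom concrete, here is a rough Python sketch of how one could list the jobs that are still attached to queue instances in an unknown ("u") state. It only parses text in the style of "qstat -f" output that is passed in as a string. The column layout (state as the sixth column of a queue line, job lines indented beneath their queue) and the function name stuck_jobs are my assumptions, not something SGE guarantees, so adjust it for your version.

```python
def stuck_jobs(qstat_f_output):
    """Map each queue instance whose state contains 'u' to the job IDs listed under it.

    Assumes "qstat -f" style text: queue lines start in column 0 and carry the
    state in the sixth whitespace-separated field; job lines are indented.
    """
    result = {}
    current = None  # queue instance currently being read, if it is in state 'u'
    for line in qstat_f_output.splitlines():
        if not line.strip():
            continue
        if line.startswith("#"):
            # Separator before the pending-jobs section: no queue applies any more.
            current = None
            continue
        if line.startswith("-") or line.startswith("queuename"):
            continue  # header line or dashed separator
        fields = line.split()
        if not line[0].isspace() and "@" in fields[0]:
            # Queue instance line; the state column is absent when the queue is healthy.
            state = fields[5] if len(fields) > 5 else ""
            current = fields[0] if "u" in state else None
            if current is not None:
                result[current] = []
        elif line[0].isspace() and current is not None and fields[0].isdigit():
            # Indented job line belonging to the unreachable queue instance.
            result[current].append(fields[0])
    return result
```

Feeding it the captured output of "qstat -f" while a node is down should return the queue instances in the "au" state together with the job IDs SGE still shows in them.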

My nodes use tmpfs mounted spool directories, so the spool is empty after they reboot. Perhaps this has something to do with it?

In any case, what I would like is that when a node becomes unavailable, the jobs on it either fail or, if they are eligible, are restarted on another node.

