[GE users] Fwd: subnode with empty slots but jobs in queue

jlforrest jlforrest at berkeley.edu
Mon Dec 6 19:29:39 GMT 2010


On 12/6/2010 11:02 AM, reuti wrote:

> When the local spool directory exists after the reboot, the
> execd would inform the qmaster about the failed jobs. When there is
> no information on the node about the last running jobs, the execd
> won't tell anything to the qmaster, and on its own it's waiting for
> the jobs to reappear.

I was thinking about this, and I wonder if it
is the right thing to do. If the local spool
directory is empty, or its contents differ
from what the qmaster expects, then what point
is there in the qmaster continuing to think
that the jobs exist, or will ever come back?
In other words, shouldn't the contents
of the local spool directory determine
the qmaster's conception of reality?
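
To make the question concrete, here is a rough
sketch in Python of the kind of reconciliation
I have in mind. This is not how Grid Engine
actually works, and reconcile() and its arguments
are names I made up for illustration. It just
compares the job IDs the qmaster believes are on
a node with the job IDs found in that node's
local spool directory.

```python
# Hypothetical sketch only: reconcile() and its arguments are invented
# for illustration and are not part of Grid Engine.

def reconcile(qmaster_jobs, spool_jobs):
    """Compare the job IDs the qmaster expects on a node with the
    job IDs recorded in that node's local spool directory."""
    expected = set(qmaster_jobs)
    found = set(spool_jobs)
    return {
        # Both sides agree the job belongs to this node.
        "still_running": sorted(expected & found),
        # The qmaster expects the job, but the node has no record of it.
        "lost": sorted(expected - found),
        # The node has a record the qmaster does not know about.
        "unknown": sorted(found - expected),
    }

# A node that rebooted and came back with an empty spool directory:
result = reconcile(qmaster_jobs=[101, 102], spool_jobs=[])
print(result["lost"])  # [101, 102]
```

Under this view, anything in the "lost" list
would be reported as failed right away rather
than left for the qmaster to wait on.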

-- 
Jon Forrest
Research Computing Support
College of Chemistry
173 Tan Hall
University of California Berkeley
Berkeley, CA
94720-1460
510-643-1032
jlforrest at berkeley.edu

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=302559