[GE users] Jobs stuck in delete status

rayson rayrayson at gmail.com
Tue Jun 9 17:33:31 BST 2009

As the cluster admin user, issue:

% qdel -f <job id>
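
For context, the -f flag asks the qmaster to drop the job from its job list even if the execd controlling it never acknowledges the delete request, which is the situation a job stuck in "dr" is in. Below is a minimal sketch for finding such jobs first. The helper name list_dr_jobs is hypothetical, and the sketch assumes the default qstat layout (two header lines, job ID in column 1, state in column 5); adjust it if your output differs.

```shell
# Hypothetical helper: read qstat output on stdin and print the IDs of jobs
# whose state column is exactly "dr".
# Assumes the default qstat layout: two header lines, job ID in column 1,
# state in column 5.
list_dr_jobs() {
    awk 'NR > 2 && $5 == "dr" { print $1 }' | sort -u
}

# Usage on a live cluster, as the cluster admin user:
#   qstat -u '*' | list_dr_jobs
#   qdel -f <job id>
```

Listing the IDs before running qdel -f keeps the forced delete limited to jobs that are actually stuck, since a forced delete does not wait for the execution host to confirm that the job's processes are gone.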


On 6/9/09, seandavi <seandavi at gmail.com> wrote:
> I'm using 6.2 and have managed to get a couple of jobs stuck in "dr"
> status.  Both were parallel jobs running across multiple machines, but
> both appear to have the "master" task running on the same machine.  I
> have restarted the qmaster and the execd on the machine on which the
> jobs appear to have had the "master" task.  Here is what I have in the
> execd messages file:
> 06/09/2009 12:18:46|  main|pressa|I|controlled shutdown 6.2
> 06/09/2009 12:18:53|  main|pressa|I|starting up SGE 6.2 (lx24-amd64)
> 06/09/2009 12:18:53|  main|pressa|W|reaping job "28147" ptf complains: Job does not exist
> Any ideas as to what is going on, or how to go further with diagnosing
> the problem?  The cluster has been up and running for months without
> problems.  The only new addition is openmpi integration; it turns out
> that one of the jobs stuck in "dr" status is an mpirun job.
> Thanks,
> Sean
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=201328
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

