[GE users] Jobs remaining in d state

Ron Chen ron_chen_123 at yahoo.com
Mon May 8 12:01:08 BST 2006



If the node the job runs on is not reachable by the qmaster, you will
see exactly that: the job stays in the d state because the deletion
cannot be completed on the execution host. You can use "qdel -f" to
force a cleanup.

 -Ron
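
As a rough illustration of the advice above, the sketch below scans
qstat-style output for jobs whose state column contains "d" and builds
the corresponding "qdel -f" command without running it. The function
names are hypothetical, and the column layout (job ID first, state
fifth) is an assumption based on the qstat line quoted in this thread;
job names containing spaces would break it. Depending on the cluster
configuration, ordinary users may not be permitted to use "-f".

```python
def jobs_stuck_in_deletion(qstat_output):
    """Return IDs of jobs whose state column contains 'd' (deletion pending).

    Assumes the default qstat layout seen in this thread:
    job-ID, prior, name, user, state, submit/start date, time, queue.
    """
    stuck = []
    for line in qstat_output.splitlines():
        fields = line.split()
        # Job lines start with a numeric job ID; skip headers and separators.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        state = fields[4]
        if "d" in state:
            stuck.append(fields[0])
    return stuck


def forced_delete_command(job_ids):
    """Build (but do not run) the qdel -f command for the given job IDs."""
    return ["qdel", "-f", *job_ids]


if __name__ == "__main__":
    # Sample line taken from the question quoted in this thread.
    sample = (
        "11025 0.00581 run.para hermet  dr 05/05/2006 09:40:43 "
        "all.q@lmexec-82\n"
    )
    stuck = jobs_stuck_in_deletion(sample)
    if stuck:
        print(" ".join(forced_delete_command(stuck)))
```

Keeping the command as a list makes it easy to hand to subprocess.run
later, but it is probably wise to confirm first that the execution host
really is unreachable and that no job processes are left on it.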


--- Jean-Paul Minet <minet at cism.ucl.ac.be> wrote:
> Hi,
> 
> Regularly, I see jobs deleted by users (qdel) remaining in the d
> state. For example, the qmaster messages file contains:
> 
> 05/05/2006 14:12:55|qmaster|lmsp|I|hermet has registered the job 11025 for deletion
> 
> and three days later, qstat shows
> 
> 11025 0.00581 run.para hermet  dr 05/05/2006 09:40:43 all.q@lmexec-82
> 
> 
> There is no user process left running on the mpich head/master node
> nor on the child/slave nodes. On the head node, the rsh link and
> machine file generated by the startmpi.sh script have been removed
> from the /tmp/11025.1.all.q directory, but a qrsh_client_cache file
> remains there.
> 
> Any clue where to look for additional info (what prevents SGE from
> completing the job deletion)?
> 
> Thanks
> 
> Jean-Paul
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
> 
> 






