[GE users] reports running job (xxxx.xx/master) in queue "xyz.q at myhost" that was not supposed to be there

reuti reuti at staff.uni-marburg.de
Fri Aug 27 17:43:35 BST 2010


Hi,

On 27.08.2010 at 17:16, l_heck wrote:

> Some time ago, one user flooded the system with several tens of thousands of jobs 
> and brought the batch server to its knees. I tried to qdel but without success. 
> In the end, in desperation, I deleted the job scripts, which did the trick. 
> However, there are two legacy ghost jobs for which I get error messages of the 
> kind
> 
> reports running job (123456.16/master) in queue "xyz.q at myhost" that was not 
> supposed to be there
> 
> I tried to kill it, but the job does not exist anymore. Is there a way to tidy 
> this up?
> 
> I have restarted the server etc, but this information is kept in the qmaster's 
> memory beyond it being restarted.

Please check the spool directory of the node:

/var/spool/sge/<node>/jobs

or

$SGE_ROOT/default/spool/<node>/jobs

The complete 00/0000/... hierarchy in there can be removed (while the node is empty, i.e. no jobs are running on it). I would assume that entries for some jobs still exist there.
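
To see what is still lying around before removing anything, a small script along the following lines could list every entry below the jobs directory. This is only a sketch: the function name list_spool_entries is made up for illustration, the default path uses "myhost" from the error message as a placeholder, and the script only reads the directory, it deletes nothing.

```python
import os
import sys


def list_spool_entries(jobs_dir):
    """Return the sorted relative paths of everything below jobs_dir.

    Hypothetical helper: it only walks the directory tree and
    reports what it finds, it does not remove anything.
    """
    entries = []
    for root, dirs, files in os.walk(jobs_dir):
        for name in dirs + files:
            full_path = os.path.join(root, name)
            entries.append(os.path.relpath(full_path, jobs_dir))
    return sorted(entries)


if __name__ == "__main__":
    # Assumed default location; pass the real spool path as an argument.
    jobs_dir = sys.argv[1] if len(sys.argv) > 1 else "/var/spool/sge/myhost/jobs"
    for entry in list_spool_entries(jobs_dir):
        print(entry)
```

If this prints anything while no jobs are running on the node, those are the leftover entries in question.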

-- Reuti


> Lydia Heck
> 
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=277458
> 
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=277488
