[GE users] Strange SGE behavior

dom marco.donauer at sun.com
Thu May 14 16:17:12 BST 2009


You can have a look in the qmaster spooling directory and look for a
jobs directory.
This directory contains the spooled jobs. Look there for the job ID of
the job which cannot be deleted.
Remove that job's directory and it should work again.
How did this problem appear? Did you have any problems with your hosts
or network?
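
To locate the spooled entry before removing anything, a small script along these lines may help. It is only a sketch: the function name find_spooled_job is made up for illustration, and it assumes that with classic spooling each job is stored under the jobs directory in a file or directory whose name is the zero-padded job ID. The layout can differ between versions, so treat the results as candidates to inspect by hand. The script only lists paths and deletes nothing.

```python
import os
import sys


def find_spooled_job(jobs_dir, job_id):
    """Return paths under jobs_dir whose name, read as a number, equals job_id.

    Assumption (not confirmed in this thread): spooled jobs are stored in
    entries named after the zero-padded job ID, e.g. ".../0042" for job 42.
    """
    matches = []
    for root, dirs, files in os.walk(jobs_dir):
        for name in dirs + files:
            # Compare numerically so that leading zeros do not matter.
            if name.isdigit() and int(name) == job_id:
                matches.append(os.path.join(root, name))
    return sorted(matches)


if __name__ == "__main__":
    # Usage: python find_job.py <path-to-qmaster-spool>/jobs <job-id>
    for path in find_spooled_job(sys.argv[1], int(sys.argv[2])):
        print(path)
```

Running it against the jobs directory with the ID of the stuck job should print the candidate path or paths, which can then be checked and removed manually.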

Marco

allantran wrote:
> Thanks for your response, Marco.
> I'm using classic spooling. Is there a way to remove that old job
> object? Everything else seems to be working fine, so I hesitate to
> reinstall the qmaster.
> Any input would be appreciated.
> Allan
>
> On Wed, May 13, 2009 at 10:46 PM, dom <marco.donauer at sun.com> wrote:
>
>     Hi,
>
>     What kind of spooling do you use, and what is your SGE version?
>     It looks like an old job object is spooled which is somehow
>     broken, and the qmaster is not able to remove it.
>
>     Marco
>
>
>     allantran wrote:
>     > I noticed that it's not the reboot itself: every time sgemaster
>     > is restarted, the old job is put back into the queue and stays in
>     > the "t" state. Does anyone know how to remove it permanently so it
>     > won't come back? No matter how many times I qdel it, it only stays
>     > away until the machine reboots or sgemaster is restarted.
>     > Thanks
>     >
>     >
>     > On Tue, May 12, 2009 at 3:09 PM, Allan Tran
>     > <tran.v.allan at gmail.com> wrote:
>     >
>     >     Hi group,
>     >     I installed a new SGE on a new cluster and everything seems to be
>     >     working. However, every time I reboot the master node (which has
>     >     qmaster and sgeexecd running), an old job is put back in the
>     >     queue in the "t" state. This causes all jobs submitted after that
>     >     to stay in the "qw" state, unable to run.
>     >     Does anyone know why the old job is put back in the queue? I have
>     >     even deleted this job twice before, but it never seems to go away
>     >     after a reboot.
>     >     Thanks for the help
>     >
>     >
>
>     ------------------------------------------------------
>     http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=195332
>
>     To unsubscribe from this discussion, e-mail:
>     [users-unsubscribe at gridengine.sunsource.net].
>
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=195564