[GE users] How to remove "E" in the queue status

Yusuf Sun yusuf.sun at gmail.com
Fri Jun 16 13:04:19 BST 2006


Thanks!
I'll see whether it comes back.
I'm not sure yet what caused the error.

Y.Sun

On 6/16/06, Chris Dagdigian <dag at sonsorol.org> wrote:
>
>
> The "E" error state usually means that a job died in a spectacular
> manner (possibly taking down the sge_shepherd with it).
>
> SGE persists the E state until it is manually cleared, to prevent a
> "black hole" effect whereby all your pending jobs drain into a
> potentially "bad" machine and all exit quickly with some type of error.
>
> The first thing you should do is find the cause of the E state.
> If it was a transient error, or something that you do not think will
> repeat, then you can clear the error state. It is not good to clear
> the E state if it is just going to come back again.
>
> The clear command is "qmod -c", and you can clear your whole cluster
> with "qmod -c '*'".
>
> Regards,
> Chris
>
>
>
> On Jun 16, 2006, at 5:57 AM, Yusuf Sun wrote:
>
> > Dear SGE users,
> >
> > We installed SGE on a small cluster. Recently, "qstat -f" shows
> > one node in state "E". I guess this means there was some error on
> > that node. I rebooted the node and restarted sge_execd on it, but
> > the "E" is still there. How can I find the cause of the error and
> > get rid of the "E"?
> >
> > Thanks
> > Y.Sun
>
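For anyone finding this thread later, the "check first, then clear" advice quoted above can be sketched roughly as below. This is only an illustration, not part of Chris's reply: the helper name list_error_queues is made up, and it assumes the usual "qstat -f" layout in which queue instance lines look like "queue@host qtype used/tot. load_avg arch states", with the state flags in the sixth column. Check it against your own qstat output before relying on it.

```shell
#!/bin/sh
# Hypothetical helper (not an SGE command): read `qstat -f` output on stdin
# and print the queue instances whose state column contains "E".
# Assumption: queue instance lines contain "@" in the first field and carry
# the state flags in field 6; lines with no state flags have only 5 fields.
list_error_queues() {
    awk '$1 ~ /@/ && NF >= 6 && $6 ~ /E/ { print $1 }'
}

# Example usage on a live cluster (not executed here):
#   qstat -f | list_error_queues    # which queue instances are in E?
#   qstat -f -explain E             # ask SGE why, if your version supports -explain
#   qmod -c 'all.q@node02'          # clear a single queue instance once the cause is fixed
```

Clearing one queue instance at a time, instead of using '*', keeps a node that is still broken from silently rejoining the pool. The queue name all.q@node02 is just a placeholder.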


