[GE users] one node keeps going into error state

Andreas Haas Andreas.Haas at Sun.COM
Tue Nov 23 17:05:12 GMT 2004


Ah ... actually I meant the administrator abort mail.
That one is more detailed than the user mail.

Andreas

On Tue, 23 Nov 2004, David Mathog wrote:

>
>
> > On Mon, 22 Nov 2004, Chris Dagdigian wrote:
> >
> > > Hi David,
> > >
> > > Anything informative in the spool log files?
> > >
> > > /usr/SGE/default/spool/qmaster/messages
> > > /usr/SGE/default/spool/qmaster/schedd/messages
> > >
> > > And especially:
> > >
> > > /usr/SGE/default/spool/mendel/messages
> >
> > Or try user abort mail as described in the "Trouble Shooting" HOWTO:
> >
> > http://gridengine.sunsource.net/project/gridengine/howto/troubleshooting.html
> >
> Pretty much the same thing as in the log files:
>
> Job 4347 caused action: All Queues on host "mendel" set to ERROR
> User        = safrun
> Queue       = testm
> Host        = mendel
> Start Time  = <unknown>
> End Time    = <unknown>
> failed before prolog:shepherd exited with exit status 7
> Shepherd pe_hostfile:
> mendel 1 testm UNDEFINED
>
> Thanks,
>
> David Mathog
> mathog at caltech.edu
> Manager, Sequence Analysis Facility, Biology Division, Caltech
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>
