[GE users] one node keeps going into error state

David Mathog mathog at mendel.bio.caltech.edu
Tue Nov 23 16:28:38 GMT 2004



> On Mon, 22 Nov 2004, Chris Dagdigian wrote:
> 
> > Hi David,
> >
> > Anything informative in the spool log files?
> >
> > /usr/SGE/default/spool/qmaster/messages
> > /usr/SGE/default/spool/qmaster/schedd/messages
> >
> > And especially:
> >
> > /usr/SGE/default/spool/mendel/messages
> 
> Or try user abort mail as described in "Trouble Shooting" HOWTO
> 
> http://gridengine.sunsource.net/project/gridengine/howto/troubleshooting.html
> 
>
The abort mail says pretty much the same thing as the log files:

Job 4347 caused action: All Queues on host "mendel" set to ERROR
User        = safrun
Queue       = testm
Host        = mendel
Start Time  = <unknown>
End Time    = <unknown>
failed before prolog:shepherd exited with exit status 7
Shepherd pe_hostfile:
mendel 1 testm UNDEFINED
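
For anyone checking the same thing on their own cluster, here is a minimal sketch for pulling the lines that mention a job id out of the log files listed earlier in this thread. It assumes a POSIX shell and grep. The name find_job_in_logs is a hypothetical helper, not part of Grid Engine, and the paths in the usage comment are simply the ones quoted above.

```shell
# Hypothetical helper (not a Grid Engine command): print every line that
# mentions a job id in the given log files, prefixed with file name and
# line number.
find_job_in_logs() {
    job_id="$1"
    shift
    # -n: show line numbers
    # -s: no error message for missing or unreadable files
    # /dev/null: extra operand so grep prints the file name even when
    #            only one log file is given
    # Note: this is a plain substring match, so 4347 also matches 14347.
    grep -n -s -- "$job_id" "$@" /dev/null
}

# Usage with the paths quoted in this thread (an assumption about the
# local install; adjust to your own SGE root and cell):
# find_job_in_logs 4347 \
#     /usr/SGE/default/spool/qmaster/messages \
#     /usr/SGE/default/spool/qmaster/schedd/messages \
#     /usr/SGE/default/spool/mendel/messages
```

The execution host's messages file is the one most likely to say more about why the shepherd exited before the prolog ran.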
 
Thanks,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
