[GE users] SGE error ?

In-Saeng Suh isuh at nd.edu
Tue Oct 26 19:47:34 BST 2004


Hi Reuti,

Thanks for your reply.

0. There are no errors in the job's output file, and nothing in the PE output file.

1. There are no messages related to this job in /qmaster/messages.

2. There are no messages related to this job in /<nodename>/messages.

3. Thanks, I will look for the Howto.
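
For checks 1 and 2, a small script can search both messages files for the job number. The following is only a sketch under assumptions: the spool layout is taken from the paths quoted below, while the job number 1234, the node name node01, the fallback root /usr/sge, and the helper names are made up for illustration. It also assumes the job number appears in the log lines as a standalone number.

```python
import os
import re
from pathlib import Path


def spool_messages(sge_root, name, cell="default"):
    """Build <sge_root>/<cell>/spool/<name>/messages (layout as quoted in this thread)."""
    return Path(sge_root) / cell / "spool" / name / "messages"


def find_job_lines(messages_path, job_id):
    """Return the lines of a messages file that mention job_id as a whole number."""
    path = Path(messages_path)
    if not path.is_file():
        # A missing file is reported as "no matching lines" rather than an error.
        return []
    pattern = re.compile(rf"\b{re.escape(str(job_id))}\b")
    with path.open(errors="replace") as fh:
        return [line.rstrip("\n") for line in fh if pattern.search(line)]


if __name__ == "__main__":
    # Hypothetical values: adjust the job number and node name for your cluster.
    sge_root = os.environ.get("SGE_ROOT", "/usr/sge")
    job_id = 1234
    for name in ("qmaster", "node01"):
        path = spool_messages(sge_root, name)
        hits = find_job_lines(path, job_id)
        print(f"{path}: {len(hits)} matching line(s)")
        for line in hits:
            print("   ", line)
```

An empty result for both files would match what I see here: nothing logged for the job on either the qmaster or the master node.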

Thanks,

On Tue, 26 Oct 2004, Reuti wrote:

> > I have a weird problem when I submit a job to our Linux cluster.
> > We are using Grid Engine Enterprise Edition, version 5.3p2.
> > When I submit a parallel job using MPICH, SGE correctly assigns and
> > transfers the job to the right nodes. But SGE terminates right after the
> > job begins, and the job's running status in "qstat" disappears.
> > The job is still running on the nodes, however, and "qstat" only
> > shows the "load_avg" value for each node.
> >
> > Any ideas?
>
> 0. Anything in any output file of the job or the PE?
>
> 1. Check what is in the messages file on the qmaster:
> $SGE_ROOT/default/spool/qmaster/messages
>
> 2. The same for the assigned masternode:
> $SGE_ROOT/default/spool/<nodename>/messages
>
> 3. Read the appropriate Howto at the sunsource.net site for tight integration.
>
> Cheers - Reuti
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>




