[GE users] SGE error ?

Sean Dilda agrajag at dragaera.net
Wed Oct 27 21:07:27 BST 2004


On Tue, 2004-10-26 at 14:16, In-Saeng Suh wrote:
> Hi SGE users,
> 
> I have a weird problem when I submit a job to our Linux cluster.
> We are using Grid Engine Enterprise Edition 5.3p2.
> When I submit a parallel job using MPICH, SGE correctly assigns and
> transfers the job to the right nodes. But SGE terminates just after the job
> begins, and the job's running status in "qstat" disappears.
> The job is still running on the nodes, though, and "qstat"
> only shows the "load_avg" value for each node.
> 
> Any ideas?

When you call mpirun, did you end the line with '&'?
If so, the MPI job goes into the background, so the submit script
exits while the MPI job continues to run.  As soon as the submit
script exits, SGE registers the job as having completed.
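
To illustrate, here is a rough sketch of a submit script that keeps the
background job but still stays alive until it finishes. The function
fake_mpirun is a hypothetical stand-in for your real mpirun line (it only
sleeps and prints a message, so the sketch runs without MPI or SGE), and the
temporary file is just there to capture its output. I am assuming a plain
POSIX shell; your actual script and mpirun arguments will differ.

```shell
#!/bin/sh
# Hypothetical stand-in for the real "mpirun ..." command line.
fake_mpirun() {
    sleep 1
    echo "mpi job finished"
}

out=$(mktemp)

# Started with '&': without the 'wait' below, the script would reach its
# end while this is still running, and SGE would consider the job done.
fake_mpirun > "$out" &
job_pid=$!

# Block until the background job exits, and keep its exit status.
wait "$job_pid"
status=$?

result=$(cat "$out")
rm -f "$out"
echo "job output: $result (exit status $status)"
```

The simpler fix is usually to drop the '&' altogether so mpirun runs in the
foreground; 'wait' is only needed if the script has some other reason to
start it in the background.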

