[GE users] subordinate queues and MPI jobs

Dalibor.Tokic Dalibor.Tokic at avinci.de
Wed May 5 16:09:54 BST 2004


Hi!

You can't suspend an MPI job.
This is a limitation of MPI, not of SGE.
The problem is the timeouts that can arise.

Suspending an MPI job under SGE means killing it.
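
The symptom in the quoted message below (job script and mpirun stopped, worker processes still running) matches a general POSIX behavior: a stop signal sent to one PID does not propagate to that process's children. The following is a minimal Python sketch of that behavior only. It is not SGE's or MPICH's actual suspend mechanism, which this thread does not describe. The names LAUNCHER, WORKER and stop_launcher_only are hypothetical, and a POSIX system is assumed.

```python
import os
import signal
import subprocess
import sys
import time

# Hypothetical stand-in for a compute process: prints its PID,
# then a timestamp every 50 ms.
WORKER = (
    "import os, time\n"
    "print(os.getpid(), flush=True)\n"
    "while True:\n"
    "    print(time.time(), flush=True)\n"
    "    time.sleep(0.05)\n"
)

# Hypothetical stand-in for a launcher such as mpirun: starts one
# worker (which inherits the launcher's stdout), then idles.
LAUNCHER = (
    "import subprocess, sys, time\n"
    "subprocess.Popen([sys.executable, '-c', sys.argv[1]])\n"
    "time.sleep(60)\n"
)


def stop_launcher_only():
    """Stop only the launcher PID; report (launcher_stopped, worker_still_running)."""
    launcher = subprocess.Popen(
        [sys.executable, "-c", LAUNCHER, WORKER],
        stdout=subprocess.PIPE,
        text=True,
    )
    # The first line on the shared pipe is the worker's PID.
    worker_pid = int(launcher.stdout.readline())
    try:
        # Signal a single PID, not the whole process group.
        os.kill(launcher.pid, signal.SIGSTOP)
        # WUNTRACED makes waitpid return once the child has stopped.
        _, status = os.waitpid(launcher.pid, os.WUNTRACED)
        launcher_stopped = os.WIFSTOPPED(status)
        stopped_at = time.time()
        # Skip timestamps written before the stop; a later one shows
        # that the worker kept running while the launcher was stopped.
        worker_still_running = False
        for line in launcher.stdout:
            if float(line) > stopped_at:
                worker_still_running = True
                break
        return launcher_stopped, worker_still_running
    finally:
        # Clean up both processes; SIGKILL also works on a stopped process.
        os.kill(worker_pid, signal.SIGKILL)
        launcher.kill()
        launcher.wait()
        launcher.stdout.close()
```

Calling stop_launcher_only() should show the launcher stopped while the worker keeps producing output. Signals are also local to one host, so stopping processes on the master node cannot by itself reach processes on other nodes.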


Bye

Charly

> -----Original Message-----
> From: Sean Dilda [mailto:agrajag at dragaera.net]
> Sent: Wednesday, May 5, 2004 17:03
> To: users at gridengine.sunsource.net
> Subject: [GE users] subordinate queues and MPI jobs
> 
> 
> Has anyone here played around with subordinate queues and MPI jobs?  I
> just played around with it some, and am somewhat disappointed by the
> results.
> 
> I'm running 5.3p4.  I have tight integration with MPICH setup (I'm using
> an unmodified sshd though).
> 
> I started an MPI job running across several machines, using several
> subordinate queues.  I then launched another job.  It went into a queue
> that should have caused the MPI job to suspend due to subordinate
> queues.  SGE properly listed all of the queues as suspended (all jobs
> had a state of 'S').  However, when I checked the process table on the
> compute nodes, I found a different story.
> 
> On the MASTER node for the mpi job, the job script as well as the mpirun
> processes were stopped.  However, none of the child processes of mpirun
> (the ones actually running my code) were stopped.  And none of the
> processes on other nodes were stopped.
> 
> Is this a known problem?  Is there something I can do to fix this
> behavior?  Am I trying to do something that isn't supported?
> 
> Thanks,
> 
> 
> Sean
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
> 
> 




