[GE users] Long delay when submitting large jobs

John Hearns john.hearns at streamline-computing.com
Mon Feb 7 13:15:57 GMT 2005


On Mon, 2005-02-07 at 10:18 +0100, Reuti wrote:

> 
> - MPICH2 has its own mpiexec (which shares only the name, not the function, 
> of the PBS tight-integration program). A different name would be better for 
> this tight-mpiexec. Also, MPICH2 startup can be compiled in more than one way. 
> I'm not sure whether all of them are supported, as the supplied version is 
> for a beta version of MPICH2 only.
> 
> But please keep the start_proc_args/stop_proc_args anyway, as we need special 
> directories to be created on the slave nodes. Maybe it's personal taste, but I 
> would still prefer setting up all the daemons for LAM/MPICH2 in 
> start_proc_args, so that the end user can just use mpirun/mpich2-mpiexec, since 
> everything is already set up.

You raise a good point - it has probably been discussed before.
How do we handle MPICH2 and MPD daemons in SGE?
If MPD is used to start/stop the worker processes, then what role does
SGE play, and how can SGE do tight integration and accounting, and keep
track of used/unused job slots?
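
To make the slot question concrete, here is a rough sketch of how a
start_proc_args script might read the allocation SGE has granted before
starting any daemons. It assumes the usual pe_hostfile layout of one line
per host (hostname, slot count, queue, processor range); the host names,
the queue name and the function name parse_pe_hostfile are made up for
illustration and are not part of SGE.

```python
def parse_pe_hostfile(text):
    """Return {hostname: slots} from pe_hostfile-style text.

    Assumed layout per line: hostname, slot count, queue, processor range.
    """
    slots = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        host, count = fields[0], int(fields[1])
        # A host may appear more than once (e.g. in several queues), so sum.
        slots[host] = slots.get(host, 0) + count
    return slots


# Hypothetical allocation, as SGE might write it for a 6-slot job.
sample = """node01 2 all.q@node01 UNDEFINED
node02 4 all.q@node02 UNDEFINED
"""

allocation = parse_pe_hostfile(sample)
total = sum(allocation.values())
for host, count in allocation.items():
    print(host, count)
print("total slots:", total)
```

This only shows where the granted slots could be read from. Whether the
processes MPD then starts stay under SGE's control and accounting is
exactly the open question.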

Sorry if there is a clear answer here - just trying to learn.

