[GE users] Long delay when submitting large jobs

Bogdan Costescu Bogdan.Costescu at iwr.uni-heidelberg.de
Tue Feb 15 13:16:22 GMT 2005


On Tue, 15 Feb 2005, Andy Schwierskott wrote:

> There might be a misunderstanding about qmaster involvement

As I wrote, I didn't look at the source for quite some time, so I did 
not want to give precise examples ;-)

But you did...

>     The root of the problem in my opinion is that for every task the whole
>     chain has to be executed: calling "qrsh -inherit", connecting to the
>     execd, the execd starting the shepherd, which in turn starts the rshd
>     (or sshd), which in turn starts the parallel task.

and this is exactly what I would include in the description "a simple
loop over some blocking calls". Apart from the involved procedure, the 
'qrsh -inherit' calls should not happen linearly. With today's setup, 
both MPICH and LAM-MPI jobs are started in this linear fashion, simply 
because SGE does not offer a multi-spawn utility/API.
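
To make the difference concrete, here is a minimal Python sketch. It is
not SGE code: start_task() is a hypothetical stand-in that only sleeps
where the real blocking 'qrsh -inherit' chain would run, and the host
names, delay and worker count are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def start_task(host, delay=0.02):
    """Hypothetical stand-in for one blocking task start-up.

    A real launcher would run the 'qrsh -inherit' chain here; this
    sketch only sleeps so that it can run anywhere.
    """
    time.sleep(delay)
    return host


def start_linear(hosts, delay=0.02):
    """Start tasks one after another, as the MPI start-up does today."""
    return [start_task(host, delay) for host in hosts]


def start_concurrent(hosts, delay=0.02, max_workers=32):
    """Start tasks with up to max_workers blocking calls in flight."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as hosts.
        return list(pool.map(lambda host: start_task(host, delay), hosts))


if __name__ == "__main__":
    hosts = ["node%03d" % i for i in range(20)]
    for launcher in (start_linear, start_concurrent):
        began = time.perf_counter()
        launcher(hosts)
        elapsed = time.perf_counter() - began
        print("%s: %.2f s" % (launcher.__name__, elapsed))
```

The point is only that the blocking calls overlap; whether threads, a
tree of helpers or something inside the execd is the right mechanism is
a separate question.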

This also leads to waste of CPU time, which is contrary to the purpose
of a resource manager IMHO. If starting a process on a node takes 1
second, for 100 nodes it will take 100 seconds - that's 100 seconds
when 100 nodes are doing (almost) nothing; it's only after all
processes have started that the parallel job passes through MPI_Init.
Given that the slave nodes already know from the qmaster that the
processes are allowed to run there, what's the point of contacting the
execds one by one? I certainly understand that there is no infinite
scaling and therefore I don't expect that the job will start in 1
second independent of the number of nodes, but I strongly believe that
something under 10 seconds can be achieved for this particular
example - and that's already one order of magnitude faster.
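
The arithmetic behind this can be written down as a small Python
estimate. It assumes a deliberately crude model that is mine, not
anything measured on SGE: every start takes the same time, and at most
`concurrent` starts overlap (the function and parameter names are
hypothetical).

```python
import math


def startup_seconds(nodes, per_start=1.0, concurrent=1):
    """Estimated wall-clock time until the last process has started.

    Crude model: starts proceed in batches of `concurrent`, and each
    batch takes `per_start` seconds.
    """
    batches = math.ceil(nodes / concurrent)
    return batches * per_start


if __name__ == "__main__":
    # concurrent=1 is the linear start-up described above.
    for concurrent in (1, 16, 100):
        print(concurrent, startup_seconds(100, 1.0, concurrent))
```

With 100 nodes and 1 second per start, the linear case gives the 100
seconds mentioned above, while overlapping 16 starts at a time gives 7
seconds in this model, i.e. below the 10 second mark.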

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De



