[GE users] Startup times and other issues with 6.0u3

Sean Dilda agrajag at dragaera.net
Sat Mar 19 14:19:54 GMT 2005


Ron Chen wrote:
> --- Brian R Smith <brian at cypher.acomp.usf.edu> wrote: 
> 
>>You are absolutely the man.  Setting "control
>>slaves" to false fixed all of my problems.
> 
> 
> No, it is not fixing anything!
> 
> "control slaves" means non-tight integration, so you
> won't get process control/accounting of the slaves MPI
> tasks.
> 
> In SGE 6 update 4, the slow start problem was fixed.
> But the original problem was that starting a 400-node
> parallel job with tight integration takes several tens
> of seconds or something. But for your case it takes 10
> minutes! So there is still something going on with
> your configuration.

I've seen delays on the order of 5 minutes with 30- and 40-CPU jobs that
I believe are related to the bug that's fixed in u4.  I think the people
who only saw 10 or 20 second delays were lucky.

Brian, when you say delay, what do you mean?  Is the job allocated
nodes, but sitting in the 't' state for 10 minutes before it switches to
'r'?  If so, then it does sound like the bug that will be fixed when u4
comes out.  However, Ron is right.  Turning off control slaves doesn't
"fix" it, unless you don't care about tight integration.

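For reference, tight integration is governed by the control_slaves
field of the parallel environment, which you can view with
"qconf -sp <pe_name>".  Below is a rough sketch of what such a
definition might look like.  The PE name "mpi", the slot count and the
allocation rule are made-up values for illustration, not taken from
Brian's setup; the only line that matters here is control_slaves.

```text
# Hypothetical PE definition; every value here is illustrative only
pe_name           mpi
slots             64
user_lists        NONE
xuser_lists       NONE
start_proc_args   /bin/true
stop_proc_args    /bin/true
allocation_rule   $fill_up
control_slaves    TRUE
job_is_first_task FALSE
urgency_slots     min
```

With control_slaves set to TRUE the slave tasks are started under
Grid Engine's control, so you keep process control and accounting for
them.  Setting it to FALSE gives that up, which is Ron's point above.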