[GE users] Startup times and other issues with 6.0u3

Ron Chen ron_chen_123 at yahoo.com
Sat Mar 19 00:52:24 GMT 2005


--- Brian R Smith wrote:
> 1) The time it takes to actually start a process (parallel) seems to
> increase in an almost exponential fashion to the number of nodes being
> requested.

A problem with the slow startup of parallel jobs was fixed
recently. See whether turning off "control slaves" (the
control_slaves attribute) in your PE improves anything. If it
does, wait for SGE 6.0 update 4, which includes the fix.
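
To see how a PE is currently set, something along these lines could
help. It is only a sketch: it assumes the "attribute value" layout that
"qconf -sp <pe_name>" prints, the sample PE (its name, paths and slot
count) is made up for illustration, and parse_pe / uses_control_slaves
are hypothetical helper names, not part of SGE.

```python
# Sketch: inspect a PE definition for the control_slaves setting.
# SAMPLE_PE is an illustrative assumption, not taken from a real cluster.
SAMPLE_PE = """\
pe_name           mpich
slots             42
start_proc_args   /path/to/startmpi.sh -catch_rsh $pe_hostfile
stop_proc_args    /path/to/stopmpi.sh
allocation_rule   $round_robin
control_slaves    TRUE
job_is_first_task FALSE
"""


def parse_pe(text):
    """Turn 'attribute value' lines into a dict (the value may contain spaces)."""
    settings = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


def uses_control_slaves(settings):
    """True if the PE has control_slaves switched on."""
    return settings.get("control_slaves", "FALSE").upper() == "TRUE"


pe = parse_pe(SAMPLE_PE)
print(pe["pe_name"], "control_slaves on:", uses_control_slaves(pe))
```

The setting itself would be changed with "qconf -mp <pe_name>".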

However, 10 minutes is really long - I believe
something else is going on with your machine or your
setup or OS!
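
To pin down whether the startup time really grows exponentially with
the number of slots, one option is to time a trivial parallel job at a
few sizes and see which curve fits better. A rough sketch follows; the
numbers are placeholders for illustration only (not measurements), and
growth_shape is a hypothetical helper name.

```python
import math


def r_squared(xs, ys):
    """Coefficient of determination for a least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def growth_shape(slots, seconds):
    """Return 'exponential' or 'power-law', whichever straight-line fit is better.

    Exponential growth is linear in (slots, log seconds);
    power-law growth is linear in (log slots, log seconds).
    """
    log_t = [math.log(t) for t in seconds]
    log_n = [math.log(n) for n in slots]
    exp_fit = r_squared(slots, log_t)
    pow_fit = r_squared(log_n, log_t)
    return "exponential" if exp_fit > pow_fit else "power-law"


# PLACEHOLDER numbers for illustration only; substitute real timings.
slots = [2, 4, 8, 16, 32]
seconds = [3.0, 6.0, 12.0, 24.0, 48.0]  # time doubles when slots double
print(growth_shape(slots, seconds))
```

With only a handful of points this is a hint rather than proof, but
it helps separate "slow per node" from something that blows up as the
job gets larger.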


> I've noticed that SGE does a lot of communications prior to
> execution, checking loads, getting stats, etc. but that still doesn't
> explain why it is taking nearly 10 minutes to begin executing a 42
> processor job on a completely un-utilized cluster.  Any pointers on this
> would be great.
> 
> 2) It seems that under tight integration with MPICH, I am getting some
> strange inconsistencies with some of my codes, specifically MM5.
> Running the mpirun command directly from the shell, with no 'rsh' call
> interceptions, seems to work perfectly.  However, once this code is
> executed from the SGE environment, there are some serious communications
> issues (message passing wise) that slow down these codes horribly.  This
> is an intermittent problem that happens to about 30% of MM5 jobs.  I was
> considering creating another Parallel Environment that was not tightly
> integrated and was wondering if anyone has experienced this before I
> charge ahead, possibly in the wrong direction.

Is reprioritization on?

Also, more info about the hardware/software used would
be helpful.

 -Ron


> 
> I happen to be running CentOS 4 (RHEL rebuild) with SELINUX disabled and
> kerberos-enabled rsh removed from the path (we use straight rsh on the
> nodes).  If anyone knows of any issues with this configuration, let me know.
> 
> Thanks
> 
> Brian Smith
> 


		

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



