[GE users] Startup times and other issues with 6.0u3

Brian R Smith brian at cypher.acomp.usf.edu
Mon Mar 21 16:08:44 GMT 2005


Stephan,

I just downgraded to SGE 6.0u1 to see if u3 was the problem.  Since u1
didn't give me any trouble on our other clusters, I figured it would do
for now.  It looks like u1 doesn't have the -dump option for qping.

************

[root@mimir ~]# qping -dump mimir 5000 qmaster 1
usage: qping [-i <interval>] [-info] [-f] [-noalias] <host> <port> <name> <id>
   -i       : set ping interval time
   -info    : show full status information and exit
   -f       : show full status information on each ping interval
   -noalias : ignore $SGE_ROOT/SGE_CELL/common/host_aliases file
   host     : host name of running component
   port     : port number of running component
   name     : name of running component (e.g.: "qmaster" or "execd")
   id       : id of running component (e.g.: 1 for daemons)

example:
qping -info clustermaster 5000 qmaster 1

************
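
In the meantime, a rough fallback with u1 seems to be polling the full
status at a fixed interval using the -i and -f flags listed above (same
qmaster host and port as in my session):

************

# u1 fallback for the missing -dump: print the full status
# every 5 seconds.  "mimir" and 5000 are our qmaster host and port.
qping -i 5 -f mimir 5000 qmaster 1

************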

We're running a bunch of netperf tests.  I believe this problem may not
be an SGE issue but rather a problem with some component of our
hardware; more specifically, I suspect the switch more than anything.
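
For reference, this is the sort of check we're running between pairs of
nodes (the node names below are placeholders, not our real hosts):

************

# On the receiving node, start the netperf server:
node02# netserver

# From the sending node, run 30-second TCP throughput and
# request/response (latency) tests against it:
node01# netperf -H node02 -t TCP_STREAM -l 30
node01# netperf -H node02 -t TCP_RR -l 30

************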

I'll keep you guys posted on what happens.

Thanks for all the help.

Brian

On Mon, 2005-03-21 at 11:25 +0100, Stephan Grell - Sun Germany - SSG -
Software Engineer wrote:
> Brian,
> 
> Sorry for the late reply. I am a bit shocked to read about a 7-10 minute
> startup time. The u4 fix might help, but I did not see delays longer than
> 1 minute with much bigger jobs.
> 
> You could use qping -dump to monitor the traffic between the qmaster and
> the execd. It will give you time stamps showing which client sent what,
> and when.
> 
> Is it possible to post the output for an empty cluster with a starting MPI
> job?
> 
> Stephan
> 
> Brian R Smith wrote:
> 
> >Reuti,
> >
> >Right, 't' only tells me which nodes have been allocated to run the job.
> >The job does not start until 'r'.  That makes perfect sense.
> >However, I can attest to the 7-10 minute wait times.  When
> >tight integration is turned off, processes start up within a couple of
> >seconds (plus the time it takes for the scheduler to "make its rounds").
> >
> >Brian
> >
> >Reuti wrote:
> >
> >>Hi Brian,
> >>
> >>the status 't' is *not* a real-time display of whether the job is
> >>generating any CPU load. But I must admit that I saw only a delay of
> >>about 1-2 minutes before it changed to 'r'. Maybe it's related to the
> >>PE startup delay in u3.
> >>
> >>When the job is started, it may already be working although the status
> >>is 't'. It is more informative to look at the CPU usage on the node
> >>with "top" or "ps -e f -o pid,time,command".
> >>
> >>CU - Reuti
> >>
> >>Quoting Brian R Smith <brian at cypher.acomp.usf.edu>:
> >>
> >>>Sean,
> >>>
> >>>That is exactly what happens: allocation occurs and the job waits in 't'
> >>>state for 7-10 minutes.  I've re-enabled "control slaves" because I
> >>>figured I could live with this problem till u4 comes out (not that many
> >>>people run 42-node, cluster-spanning jobs).  My big concern right now is
> >>>with running MM5 under SGE, as there seem to be some problems with
> >>>message passing.
> >>>
> >>>Brian
> >>>
> >>>Sean Dilda wrote:
> >>>
> >>>>Ron Chen wrote:
> >>>>
> >>>>>--- Brian R Smith <brian at cypher.acomp.usf.edu> wrote:
> >>>>>
> >>>>>>You are absolutely the man.  Setting "control
> >>>>>>slaves" to false fixed all of my problems.
> >>>>>
> >>>>>No, it is not fixing anything!
> >>>>>
> >>>>>"control slaves" means non-tight integration, so you
> >>>>>won't get process control/accounting of the slaves MPI
> >>>>>tasks.
> >>>>>
> >>>>>In SGE 6 update 4, the slow start problem was fixed.
> >>>>>But the original problem was that starting a 400-node
> >>>>>parallel job with tight integration takes several tens
> >>>>>seconds or something. But for your case it takes 10
> >>>>>minutes! So there is still something going on with
> >>>>>your configuration.
> >>>>
> >>>>I've seen delays on the order of 5 minutes with 30- and 40-CPU jobs
> >>>>that I believe are related to the bug that's fixed in u4.  I think the
> >>>>people who only saw 10- or 20-second delays were lucky.
> >>>>
> >>>>Brian, when you say delay, what do you mean?  Is the job allocated
> >>>>nodes, but sitting in 't' state for 10 minutes before it switches to
> >>>>'r'?  If so, then it does sound like the bug that will be fixed when
> >>>>u4 comes out.  However, Ron is right: turning off control slaves
> >>>>doesn't "fix" it, unless you don't care about tight integration.
> >>>>
-- 
_______________________________________
|  Brian R Smith                      |
|  Systems Administrator              |
|  Research Computing Core Facility   |
|  University of South Florida        |
|  Phone: 1(813)974-1467              |
|  4202 E Fowler Ave, LIB 613         |
_______________________________________

