[GE users] scheduler bottleneck?

jobarjo jobarjo78 at yahoo.fr
Wed Mar 31 09:10:12 BST 2004


I have set

schedd_params       FLUSH_SUBMIT_SEC=0,FLUSH_FINISH_SEC=0

With that I get about 20 to 40 jobs/s.
The problem is that there is still a 2-second latency between jobs.


Bryan Bayerdorffer wrote:

> I don't know if this is related to my earlier unresolved problem 
> (http://gridengine.sunsource.net/servlets/BrowseList?listName=users&by=thread&from=1703). 
>
>
> Right now we have a situation in which there are about 3000 pending 
> jobs being dispatched to ~50 exec hosts (1 slot each).  The majority 
> of these jobs have *extremely* short runtimes---just a few seconds.  
> The result is that many hosts are idle for a long time (a minute or 
> so) waiting for new jobs to be dispatched.  Users are complaining 
> because the total throughput for this job mix is a lot lower than it 
> was with LSF.  I'm wondering if the SGEEE scheduler is a bottleneck 
> here.  I have the schedule interval set to 10 seconds.  I enabled 
> profiling, and it seems that each scheduling run takes about 45 
> seconds.  This is on a 450MHz Ultra 60 with local /var/spool/sge, the 
> same host that used to run the LSF master.
>
> Anything I can tune to improve the performance for short jobs?  I've 
> thought of packaging several small jobs as one, but that would require 
> big changes in the way batch submission is scripted, and it's also 
> somewhat difficult to predict the runtime.
>
> What's "generate and send orders?"
>
> Tue Mar 30 17:21:19 2004|schedd|hai7|I|PROF: SGEEE job ticket calculation: init: 0.320 s, pass 0: 0.180 s, pass 1: 0.000, pass2: 0.000, calc: 0.350 s
> Tue Mar 30 17:21:19 2004|schedd|hai7|I|PROF: SGEEE job ticket calculation: init: 0.010 s, pass 0: 0.010 s, pass 1: 0.000, pass2: 0.000, calc: 0.000 s
> Tue Mar 30 17:21:19 2004|schedd|hai7|I|PROF: SGEEE update orders: job orders: 0.590 s, update orders: 0.030 s
> Tue Mar 30 17:21:19 2004|schedd|hai7|I|PROF: SGEEE pending job ticket calculation took 1.500 s
> Tue Mar 30 17:21:19 2004|schedd|hai7|I|PROF: SGEEE active job ticket calculation took 0.020 s
> Tue Mar 30 17:21:19 2004|schedd|hai7|I|PROF: SGEEE job sorting took 0.160 s
> Tue Mar 30 17:21:28 2004|schedd|hai7|I|PROF: SGEEE job dispatching took 8.430 s
> Tue Mar 30 17:21:28 2004|schedd|hai7|I|PROF: scheduled in 10.600 (u 10.400 + s 0.000 = 10.400): 8 fast, 0 complex, 2817 orders, 80 H, 267 Q, 621 QA, 0 J(qw), 53 J(r), 0 J(s), 0 J(h), 0 J(e), 8 J(x), 2812 J(all) 4 C, 1 ACL, 1 PE, 1 CONF, 116 U, 1 D, 0 PRJ, 1 ST, 0 CKPT, 0 RU
> Tue Mar 30 17:22:00 2004|schedd|hai7|I|PROF: generate and send orders took: 32.020 s
> Tue Mar 30 17:22:01 2004|schedd|hai7|I|PROF: schedd run took: 44.570 s (copying the lists took: 1.400 s)
>
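
To see where the time in the quoted profile goes, the "took" figures can be pulled out and compared with the total run time. This is only a minimal sketch: the regular expression and the function name `phase_times` are my own, and it assumes each PROF entry sits on a single line (the mail wrapping has been undone in the sample). It is not a Grid Engine tool.

```python
import re

# Matches entries such as "PROF: SGEEE job sorting took 0.160 s" and
# "PROF: generate and send orders took: 32.020 s".
# The pattern is an assumption based only on the lines quoted in this thread.
PROF_RE = re.compile(r"PROF: (?:SGEEE )?(.+?) took:? ([\d.]+) s")


def phase_times(lines):
    """Return {phase name: seconds} for every 'PROF: ... took N s' line."""
    times = {}
    for line in lines:
        m = PROF_RE.search(line)
        if m:
            times[m.group(1)] = float(m.group(2))
    return times


# PROF lines copied from the quoted message, one entry per line.
log = [
    "Tue Mar 30 17:21:19 2004|schedd|hai7|I|PROF: SGEEE pending job ticket calculation took 1.500 s",
    "Tue Mar 30 17:21:19 2004|schedd|hai7|I|PROF: SGEEE active job ticket calculation took 0.020 s",
    "Tue Mar 30 17:21:19 2004|schedd|hai7|I|PROF: SGEEE job sorting took 0.160 s",
    "Tue Mar 30 17:21:28 2004|schedd|hai7|I|PROF: SGEEE job dispatching took 8.430 s",
    "Tue Mar 30 17:22:00 2004|schedd|hai7|I|PROF: generate and send orders took: 32.020 s",
    "Tue Mar 30 17:22:01 2004|schedd|hai7|I|PROF: schedd run took: 44.570 s (copying the lists took: 1.400 s)",
]

times = phase_times(log)
total = times["schedd run"]

# Print each phase with its share of the whole scheduler run.
for phase, secs in times.items():
    print(f"{phase:32s} {secs:7.3f} s  {100 * secs / total:5.1f}%")
```

On the figures quoted, "generate and send orders" (32.020 s) is roughly 72% of the 44.570 s run and job dispatching (8.430 s) roughly 19%, so those two phases account for most of each scheduling run.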

