Opened 14 years ago
Last modified 10 years ago
#418 new enhancement
IZ2226: Testsuite needs relevant performance test for scheduler dispatch times with resource quotas
Reported by: | andreas | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | sge | Version: | 6.1beta2 |
Severity: | Keywords: | testsuite | |
Cc: |
Description
[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=2226]
Issue #: 2226
Platform: All
Reporter: andreas (andreas)
Component: gridengine
OS: All
Subcomponent: testsuite
Version: 6.1beta2
CC: None defined
Status: NEW
Priority: P3
Resolution:
Issue type: ENHANCEMENT
Target milestone: ---
Assigned to: joga (joga)
QA Contact: joga
URL:
* Summary: Testsuite needs relevant performance test for scheduler dispatch times with resource quotas
Status whiteboard:
Attachments:
Issue 2226 blocks:
Votes for issue 2226:
Opened: Thu Mar 29 08:50:00 -0700 2007
------------------------

DESCRIPTION:

Configure 100 queues Q001-Q100. The queues should have no load thresholds, have 4 slots per node, and be available for @allhosts. Use at least 4 execution hosts so as to get halfway realistic performance behavior. Enable scheduler profiling by setting PROFILE=true in the sched_conf(5) params. Configure five projects Project1-Project5. Configure 10 INT consumable resources F001-F010, each with a global capacity of 100. Configure a resource quota that limits use of F001-F010 to 1 per project:

    limit projects {*} to F001=1,F002=1,F003=1,F004=1,F005=1,F006=1,F007=1,F008=1,F009=1,F010=1

Before job submission, disable all queues using

    qmod -d "*"

and remove the sge_schedd messages file. Then submit 1000 sequential jobs: for each of the five projects and for each of the ten requests -l F001=1 to -l F010=1, submit a series of 20 identical jobs (5*10*20 = 1000). Jobs can be normal sleeper jobs that remain pending for 5 minutes or even more.
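The quota rule and the 5*10*20 submission matrix described above can be generated rather than typed by hand. The following is only a sketch: the function names `gen_limit_rule` and `gen_submit_cmds` are hypothetical, and the default job command `-b y sleep 300` is a placeholder assumption standing in for whatever sleeper job the testsuite actually uses. The generator only prints `qsub` command lines, so it can be inspected without a running cluster.

```shell
#!/bin/sh
# Print the resource quota limit line for F001-F010, one unit per project.
gen_limit_rule() {
    list=""
    for f in 1 2 3 4 5 6 7 8 9 10; do
        # Append "F0xx=1", comma-separated after the first entry.
        list="${list:+$list,}$(printf 'F%03d=1' "$f")"
    done
    echo "limit projects {*} to $list"
}

# Print (not execute) the 1000 qsub command lines:
# 5 projects x 10 resources x 20 identical jobs.
# $1 is the job command; the default is an assumed placeholder.
gen_submit_cmds() {
    job_cmd=${1:-"-b y sleep 300"}
    for p in 1 2 3 4 5; do
        for f in 1 2 3 4 5 6 7 8 9 10; do
            res=$(printf 'F%03d' "$f")
            n=1
            while [ "$n" -le 20 ]; do
                echo "qsub -P Project$p -l $res=1 $job_cmd"
                n=$((n + 1))
            done
        done
    done
}
```

To actually submit, the output of `gen_submit_cmds` could be piped to `sh` once the queues have been disabled; printing first makes it easy to check the job count with `wc -l`.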
When all jobs are submitted, enable all queues using

    qmod -e "*"

and record the first 'n' schedd profiling messages that contain the job dispatching time, using

    # grep "job dispatching took" $SGE_ROOT/default/spool/qmaster/schedd/messages | head -7

Before the fix the following numbers were typical:

    03/28/2007 17:49:59|schedd|es-ergb01-01|P|PROF: job dispatching took 3.840 s (1000 fast, 0 comp, 0 pe, 0 res)
    03/28/2007 17:50:05|schedd|es-ergb01-01|P|PROF: job dispatching took 2.180 s (950 fast, 0 comp, 0 pe, 0 res)
    03/28/2007 17:50:10|schedd|es-ergb01-01|P|PROF: job dispatching took 2.160 s (950 fast, 0 comp, 0 pe, 0 res)
    03/28/2007 17:50:15|schedd|es-ergb01-01|P|PROF: job dispatching took 2.170 s (950 fast, 0 comp, 0 pe, 0 res)
    03/28/2007 17:50:20|schedd|es-ergb01-01|P|PROF: job dispatching took 2.130 s (950 fast, 0 comp, 0 pe, 0 res)
    03/28/2007 17:50:26|schedd|es-ergb01-01|P|PROF: job dispatching took 2.170 s (950 fast, 0 comp, 0 pe, 0 res)
    03/28/2007 17:50:31|schedd|es-ergb01-01|P|PROF: job dispatching took 2.170 s (950 fast, 0 comp, 0 pe, 0 res)

After the fix these numbers are typical:

    03/29/2007 16:24:10|schedd|es-ergb01-01|P|PROF: job dispatching took 1.580 s (1000 fast, 0 comp, 0 pe, 0 res)
    03/29/2007 16:24:12|schedd|es-ergb01-01|P|PROF: job dispatching took 0.020 s (950 fast, 0 comp, 0 pe, 0 res)
    03/29/2007 16:24:17|schedd|es-ergb01-01|P|PROF: job dispatching took 0.020 s (950 fast, 0 comp, 0 pe, 0 res)
    03/29/2007 16:24:22|schedd|es-ergb01-01|P|PROF: job dispatching took 0.030 s (950 fast, 0 comp, 0 pe, 0 res)
    03/29/2007 16:24:27|schedd|es-ergb01-01|P|PROF: job dispatching took 0.030 s (950 fast, 0 comp, 0 pe, 0 res)
    03/29/2007 16:24:32|schedd|es-ergb01-01|P|PROF: job dispatching took 0.030 s (950 fast, 0 comp, 0 pe, 0 res)
    03/29/2007 16:24:37|schedd|es-ergb01-01|P|PROF: job dispatching took 0.030 s (950 fast, 0 comp, 0 pe, 0 res)

------- Additional comments from andreas Thu Mar 29 09:53:19 -0700 2007 -------

A worthwhile variation of this test is to
(a) send all jobs only into a single queue Q001,
(b) increase the slot amount of Q001 to 25 so that 50 jobs can still be dispatched at a time, and
(c) use a limitation rule that applies only to Q001:

    limit queues Q001 projects {*} to F001=1,F002=1,F003=1,F004=1,F005=1,F006=1,F007=1,F008=1,F009=1,F010=1

For sending jobs into queue Q001, various possibilities exist:
(1) request "-q Q001"
(2) attach a user-defined string complex attribute 'type' to all queues with type=<qname> and have jobs request "-l type=Q001"
(3) attach a user-defined int complex attribute 'number' to all queues with number='queue-number' and have jobs request "-l number=001"
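For an automated test, the "job dispatching took" lines have to be turned into numbers that can be compared against a limit. The sketch below shows one way to do that with `sed` and `awk`. The function names are hypothetical, skipping the first cycle (the one reporting 1000 fast jobs) is a choice made here to look at the repeated 950-job cycles only, and any pass/fail limit passed to `mean_below` is an assumption of the test author, not a value taken from this issue.

```shell
#!/bin/sh
# Read schedd messages on stdin and print the dispatch time in seconds
# from every "job dispatching took" profiling line.
dispatch_times() {
    sed -n 's/.*job dispatching took \([0-9.]*\) s.*/\1/p'
}

# Mean dispatch time over all recorded cycles except the first one.
# Prints nothing if fewer than two cycles were recorded.
steady_state_mean() {
    dispatch_times |
        awk 'NR > 1 { sum += $1; n++ } END { if (n) printf "%.3f\n", sum / n }'
}

# Succeed only if the steady-state mean is at most $1 seconds.
# Fails when no mean could be computed.
mean_below() {
    steady_state_mean |
        awk -v limit="$1" 'BEGIN { rc = 1 } { rc = ($1 <= limit) ? 0 : 1 } END { exit rc }'
}
```

A possible use, with a limit chosen purely for illustration, would be `grep "job dispatching took" $SGE_ROOT/default/spool/qmaster/schedd/messages | head -7 | mean_below 0.1`.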