[GE users] Fair share config, fill-up hosts and max user slots

Jean-Paul Minet minet@cism.ucl.ac.be
Tue Jan 3 09:47:04 GMT 2006



Stephan,

Thanks for your help...

Here is the sharetree config

-----------
id=0
name=Root
type=0
shares=1
childnodes=1
id=1
name=default
type=0
shares=10000
childnodes=NONE
------------
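For reference, a share tree like the one above can be dumped to a file, edited, and re-applied without going through qmon; a sketch of that round trip (the file path is just an example):

```
# dump the current share tree to a file
qconf -sstree > /tmp/sharetree.txt
# ... edit shares / nodes in the file ...
# replace the tree from the edited file
qconf -Mstree /tmp/sharetree.txt
```

With only the "default" leaf present, every user who submits a job is treated as an equal child of that node, so all users get the same share.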

the sched config:

-----------------
algorithm                         default
schedule_interval                 0:02:00
maxujobs                          8
queue_sort_method                 load
job_load_adjustments              np_load_avg=0.50
load_adjustment_decay_time        0:7:30
load_formula                      slots
schedd_job_info                   true
flush_submit_sec                  0
flush_finish_sec                  0
params                            profile=1
reprioritize_interval             0:0:0
halftime                          336
usage_weight_list                 cpu=0.848000,mem=0.152000,io=0.000000
compensation_factor               5.000000
weight_user                       0.000000
weight_project                    1.000000
weight_department                 0.000000
weight_job                        0.000000
weight_tickets_functional         0
weight_tickets_share              1000000
share_override_tickets            TRUE
share_functional_shares           TRUE
max_functional_jobs_to_schedule   200
report_pjob_tickets               TRUE
max_pending_tasks_per_job         50
halflife_decay_list               none
policy_hierarchy                  S
weight_ticket                     1.000000
weight_waiting_time               1.000000
weight_deadline                   3600000.000000
weight_urgency                    0.000000
weight_priority                   0.000000
max_reservation                   0
default_duration                  0:10:0
-----------------
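One detail worth noting in the config above: halftime 336 means accumulated usage is discounted with a half-life of 336 hours (two weeks). A minimal sketch of that decay (my own illustration, not SGE source code):

```python
def decayed_usage(usage: float, hours_elapsed: float, halftime: float = 336.0) -> float:
    """Discount recorded usage by the configured half-life (in hours)."""
    return usage * 0.5 ** (hours_elapsed / halftime)

# After 336 hours only half of a job's recorded usage still counts
# against the user's share-tree priority; after 672 hours, a quarter.
```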

I have attached the "qstat -ext" output example as a text file for readability 
(note that some MPI jobs don't seem to report CPU time and memory used 
correctly, even though I have used tight integration as described in the 
manual/HowTo).
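For what it's worth, the slave-task accounting that tight integration provides usually depends on the PE being configured roughly like this (a sketch; the pe name, slot count and script paths are examples, not the actual config):

```
pe_name            mpi
slots              128
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  FALSE
start_proc_args    /usr/sge/mpi/startmpi.sh -catch_rsh $pe_hostfile
stop_proc_args     /usr/sge/mpi/stopmpi.sh
```

control_slaves TRUE together with the -catch_rsh wrapper is what lets sge_execd start and account the remote ranks; without it, CPU and memory usage of the slave tasks is typically lost.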

Thanks again for your help

Jean-Paul


Stephan Grell - Sun Germany - SSG - Software Engineer wrote:
> 
> Jean-Paul Minet wrote On 12/23/05 11:39,:
> 
> 
>>Hi,
>>
>>Our bi-proc cluster is used for sequential, OpenMP and MPI jobs.  We wish to:
>>
>>1) use fair-share scheduling with equal shares for all users
>>
>>I have disabled priority and urgency scheduling and set policy_hierarchy to S:
>>
>>lemaitre ~ # qconf -ssconf
>>algorithm                         default
>>...
>>halftime                          336
>>usage_weight_list                 cpu=0.848000,mem=0.152000,io=0.000000
>>...
>>weight_tickets_functional         0
>>weight_tickets_share              10000
>>...
>>policy_hierarchy                  S
>>weight_ticket                     1.000000
>>...
>>weight_urgency                    0.000000
>>weight_priority                   0.000000
>>
>>Under the share tree policy, I have only defined a default leaf under which all 
>>users appear, but "Actual resource share" and "Targeted resource share" remain 0 
>>for all users, as if actual usage was not taken into account?  This is confirmed 
>>by jobs being dispatched more like in FIFO order than following past usage. 
>>What's wrong?
>>
> 
> Hmm... that sounds like a bug, but I am pretty sure it is not. Could
> you post the entire scheduler configuration, the share tree
> configuration and the qstat -ext output?
> 
> Thanks.
> 
> 
>>2) limit the total number of CPUs/slots used by any user at any time: a 
>>max-jobs-per-user limit (maxujobs) doesn't help, since a single MPI job can use 
>>many slots and therefore isn't comparable to a sequential job.  How can we 
>>implement this?
>>
> 
> You can limit the number of slots a PE can utilize in the PE
> configuration. However, you cannot limit the number of slots a user
> can utilize.
> 
> Cheers,
> Stephan
> 
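A side note on this limitation: later Grid Engine releases (6.1 and newer) added resource quota sets (qconf -arqs), which can cap slots per user directly; a sketch, with the limit value hypothetical:

```
{
   name         max_slots_per_user
   description  "cap the slots any single user can occupy at once"
   enabled      TRUE
   limit        users {*} to slots=16
}
```

The {*} syntax applies the limit to each user individually rather than to all users combined.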
> 
>>3) fill up hosts with sequential jobs so as to leave as many empty nodes as 
>>possible for OpenMP and MPI jobs.  I have read Stephan G.'s Web Log: am I correct 
>>in assuming that I have to define complex_values slots=2 for each of the bi-proc 
>>hosts (we don't want more jobs than CPUs) and that, thereafter, the scheduler 
>>will select the hosts with the least available slots (setting of course 
>>queue_sort_method=load and load_formula=slots)?
>>
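The per-host setup asked about in (3) would look roughly like this (a sketch; the host name is taken from the cluster above, and slots=2 assumes two CPUs per node):

```
# qconf -me lmexec-101
hostname        lmexec-101
complex_values  slots=2
```

With load_formula set to slots, hosts with fewer remaining slots report a lower value of the formula, so with queue_sort_method=load they sort first and sequential jobs fill partially used nodes before touching empty ones.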
>>Thanks for any help
>>
>>Jean-Paul
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>For additional commands, e-mail: users-help at gridengine.sunsource.net
>>
>> 
>>
> 
> 
> 

-- 
Jean-Paul Minet
Gestionnaire CISM - Institut de Calcul Intensif et de Stockage de Masse
Université Catholique de Louvain
Tel: (32) (0)10.47.35.67 - Fax: (32) (0)10.47.34.52


    [ Part 2: "Attached Text" ]

job-ID  prior   ntckts  name       user         project          department state cpu        mem     io      tckts ovrts otckt ftckt stckt share queue                          slots ja-task-ID
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   1145 0.50000 0.50000 defkpt13   sdubois      grppcpm          grppcpm    r     0:00:00:00 0.00361 0.00000     0     0     0     0     0 0.00  all.q@lmexec-101                      6
   1187 0.50000 0.50000 VX4SYST    bricteux     grpterm          grpterm    r     0:00:00:19 0.10928 0.00000     0     0     0     0     0 0.00  all.q@lmexec-104                      1
   1176 0.50000 0.50000 pdir1      sdubois      grppcpm          grppcpm    r     0:00:00:00 0.00285 0.00000     0     0     0     0     0 0.00  all.q@lmexec-105                      6
   1142 0.50000 0.50000 defkpt10   sdubois      grppcpm          grppcpm    r     0:00:00:01 0.00762 0.00000     0     0     0     0     0 0.00  all.q@lmexec-106                      6
   1139 0.50000 0.50000 axiaingaus leyssens     grpchim          grpchim    r     2:20:35:22 167440.42888 0.00000     0     0     0     0     0 0.00  all.q@lmexec-110                      4
   1153 0.50000 0.50000 lemaitre   detraux      grppcpm          grppcpm    r     1:01:00:29 98233.55565 0.00000     0     0     0     0     0 0.00  all.q@lmexec-112                      1
   1154 0.50000 0.50000 lemaitre   detraux      grppcpm          grppcpm    r     1:01:00:00 91404.93447 0.00000     0     0     0     0     0 0.00  all.q@lmexec-112                      1
   1146 0.50000 0.50000 defkpt14   sdubois      grppcpm          grppcpm    r     0:00:00:00 0.00361 0.00000     0     0     0     0     0 0.00  all.q@lmexec-117                      6
   1144 0.50000 0.50000 defkpt12   sdubois      grppcpm          grppcpm    r     0:00:00:01 0.00654 0.00000     0     0     0     0     0 0.00  all.q@lmexec-119                      6
   1143 0.50000 0.50000 defkpt11   sdubois      grppcpm          grppcpm    r     0:00:00:01 0.00722 0.00000     0     0     0     0     0 0.00  all.q@lmexec-64                       6
   1149 0.50000 0.50000 C0         detraux      grppcpm          grppcpm    r     1:01:26:25 96938.44470 0.00000     0     0     0     0     0 0.00  all.q@lmexec-72                       1
   1150 0.50000 0.50000 C1         detraux      grppcpm          grppcpm    r     1:01:25:26 100048.60434 0.00000     0     0     0     0     0 0.00  all.q@lmexec-73                       1
   1156 0.50000 0.50000 CONF1.1    detraux      grppcpm          grppcpm    r     0:17:48:09 47115.94280 0.00000     0     0     0     0     0 0.00  all.q@lmexec-79                       1
   1179 0.50000 0.50000 equaoutgau leyssens     grpchim          grpchim    r     0:01:21:36 2905.26267 0.00000     0     0     0     0     0 0.00  all.q@lmexec-79                       4
   1147 0.50000 0.50000 equaoutgau leyssens     grpchim          grpchim    r     0:22:06:50 51106.26925 0.00000     0     0     0     0     0 0.00  all.q@lmexec-92                       4
   1155 0.50000 0.50000 lemaitre   detraux      grppcpm          grppcpm    r     1:01:03:29 91895.95229 0.00000     0     0     0     0     0 0.00  all.q@lmexec-93                       1
   1141 0.50000 0.50000 defkpt09   sdubois      grppcpm          grppcpm    r     0:00:00:01 0.00769 0.00000     0     0     0     0     0 0.00  all.q@lmexec-95                        6
   1151 0.50000 0.50000 C0         detraux      grppcpm          grppcpm    r     1:01:19:35 94207.34235 0.00000     0     0     0     0     0 0.00  all.q@lmexec-96                        1
   1152 0.50000 0.50000 C1         detraux      grppcpm          grppcpm    r     1:01:17:53 87349.30588 0.00000     0     0     0     0     0 0.00  all.q@lmexec-99                        1
   1131 0.50000 0.50000 DAM        marchand     grpgce           grpgce     qw                                   0     0     0     0     0 0.00                                     2
   1157 0.00000 0.00000 CONF2      detraux      grppcpm          grppcpm    qw                                   0     0     0     0     0 0.00                                     1
   1158 0.00000 0.00000 CONF2.1    detraux      grppcpm          grppcpm    qw                                   0     0     0     0     0 0.00                                     1
   1159 0.00000 0.00000 CONF2.2    detraux      grppcpm          grppcpm    qw                                   0     0     0     0     0 0.00                                     1
   1160 0.00000 0.00000 CONF3      detraux      grppcpm          grppcpm    qw                                   0     0     0     0     0 0.00                                     1
   1161 0.00000 0.00000 INT.0      detraux      grppcpm          grppcpm    qw                                   0     0     0     0     0 0.00                                     1
   1162 0.00000 0.00000 lemaitre   detraux      grppcpm          grppcpm    qw                                   0     0     0     0     0 0.00                                     1
   1163 0.00000 0.00000 lemaitre   detraux      grppcpm          grppcpm    qw                                   0     0     0     0     0 0.00                                     1
   1164 0.00000 0.00000 R7         detraux      grppcpm          grppcpm    qw                                   0     0     0     0     0 0.00                                     1
   1165 0.00000 0.00000 R8         detraux      grppcpm          grppcpm    qw                                   0     0     0     0     0 0.00                                     1
   1166 0.00000 0.00000 CDS.LDA    detraux      grppcpm          grppcpm    qw                                   0     0     0     0     0 0.00                                     1






