[GE users] Scheduler tuning

Margaret Doll Margaret_Doll at brown.edu
Thu Nov 20 22:20:50 GMT 2008


I found that if I had my parallel environment's allocation rule set to
"$round_robin", I got into the situation you describe.

I switched the PE to "$fill_up", and now all of the slots on the first
compute node are used before requests are made to the next compute node.
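
For anyone following along: the allocation rule lives in the parallel
environment definition and is edited with qconf. A minimal sketch, assuming
the PE is named "openmpi" as in the configuration quoted below:

    # Opens the PE definition in $EDITOR; change
    #     allocation_rule    $round_robin
    # to
    #     allocation_rule    $fill_up
    qconf -mp openmpi

    # Confirm the change without opening an editor:
    qconf -sp openmpi | grep allocation_rule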


On Nov 19, 2008, at 12:09 PM, Bob Healey wrote:

> Thank you to everyone who responded overnight.  I've taken the
> suggestions from the three emails I saw today and will be trying them
> out; if that doesn't change anything, I'll post again.  For the person
> who asked, the only resource being requested is slots.  I still haven't
> gotten the end users to put in h_rt limits, which I need before I can
> get backfilling working.  But that's a people problem, not a tech issue.
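>
> (For illustration, a submission with a hard runtime limit looks like the
> following; the script name and the limit value are hypothetical:
>
>     qsub -pe openmpi 8 -l h_rt=24:00:00 run_case.sh
>
> With h_rt set on every job, the scheduler knows how long each one may
> run, which is what backfilling around resource reservations requires.)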
>
> Bob Healey
> Systems Administrator
> Molecularium Project
> Department of Physics, Applied Physics, and Astronomy
> healer at rpi.edu
>
> ==============Original message text===============
> On Wed, 19 Nov 2008 5:11:41 EST andreas wrote:
>
> Hi Robert,
>
> I don't have the full picture of your setup, but load adjustment has a
> say in parallel scheduler allocation: every slot the scheduler places
> temporarily inflates the host's reported np_load_avg (here by 0.50 per
> slot, decaying over 7:30), so a host that has just received tasks looks
> busy and the remaining slots spill over to other nodes. Try replacing
>
>> job_load_adjustments              np_load_avg=0.50
>> load_adjustment_decay_time        0:7:30
>
> with
>
>> job_load_adjustments              NONE
>> load_adjustment_decay_time        0:0:0
>
> It is possible that scheduling will then work as you expect.
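>
> A quick way to apply and verify the change, assuming qconf is on your
> path:
>
>     qconf -msconf    # opens the scheduler configuration in $EDITOR
>
>     # afterwards, confirm both values took effect:
>     qconf -ssconf | grep load_adjustment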
>
> Regards,
> Andreas
>
> On Wed, 19 Nov 2008, Robert Healey wrote:
>
>> I'm already using that flag, but it doesn't seem to help much.  I also
>> use slots as the scheduling criterion instead of load.
>>
>> Bob Healey
>>
>>
>> qstat -t:
>>  12645 0.51180 submit-run leyva        r     11/18/2008 16:02:12
>> terra@compute-8-9.local        MASTER                        r  00:00:02 0.15357 0.00000
>> terra@compute-8-9.local        SLAVE            1.compute-8-9 r  1:17:38:46 51351.78424 0.00000
>> terra@compute-8-9.local        SLAVE
>> terra@compute-8-9.local        SLAVE
>> terra@compute-8-9.local        SLAVE
>>  12645 0.51180 submit-run leyva        r     11/18/2008 16:02:12
>> terra@compute-8-10.local       SLAVE
>> terra@compute-8-10.local       SLAVE            1.compute-8-10 r  1:17:41:06 51175.77672 0.00000
>> terra@compute-8-10.local       SLAVE
>> terra@compute-8-10.local       SLAVE
>>
>> pe_name            openmpi
>> slots              1310
>> user_lists         NONE
>> xuser_lists        NONE
>> start_proc_args    /bin/true
>> stop_proc_args     /bin/true
>> allocation_rule    $fill_up
>> control_slaves     TRUE
>> job_is_first_task  FALSE
>> urgency_slots      min
>> accounting_summary TRUE
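>>
>> (A side note on allocation_rule, per sge_pe(5): besides $fill_up and
>> $round_robin, the rule may also be a fixed integer N, meaning exactly N
>> slots per host, or $pe_slots, which forces every slot of a job onto a
>> single host. A sketch of a PE entry that can never split a job across
>> nodes:
>>
>>     allocation_rule    $pe_slots
>>
>> The caveat: a job requesting more slots than one host offers, such as
>> the 208-slot jobs mentioned below, would then never be scheduled, so a
>> rule like this belongs in a separate PE.)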
>>
>> [root@terra ~]# qconf -msconf
>>
>> algorithm                         default
>> schedule_interval                 0:0:15
>> maxujobs                          0
>> queue_sort_method                 seqno
>> job_load_adjustments              np_load_avg=0.50
>> load_adjustment_decay_time        0:7:30
>> load_formula                      slots
>> schedd_job_info                   true
>> flush_submit_sec                  0
>> flush_finish_sec                  0
>> params                            none
>> reprioritize_interval             0:0:0
>> halftime                          168
>> usage_weight_list                 cpu=1.000000,mem=0.000000,io=0.000000
>> compensation_factor               5.000000
>> weight_user                       0.250000
>> weight_project                    0.250000
>> weight_department                 0.250000
>> weight_job                        0.250000
>> weight_tickets_functional         0
>> weight_tickets_share              0
>> share_override_tickets            TRUE
>> share_functional_shares           TRUE
>> max_functional_jobs_to_schedule   200
>> report_pjob_tickets               TRUE
>> max_pending_tasks_per_job         50
>> halflife_decay_list               none
>> policy_hierarchy                  OFS
>> weight_ticket                     0.010000
>> weight_waiting_time               0.000000
>> weight_deadline                   3600000.000000
>> weight_urgency                    0.100000
>> weight_priority                   1.000000
>> max_reservation                   1024
>> default_duration                  96:00:00
>>
>> rayson wrote:
>>> I think you can play with the "allocation_rule" in your PE setting,
>>> esp. the "$fill_up" flag:
>>>
>>> http://gridengine.sunsource.net/nonav/source/browse/~checkout~/gridengine/doc/htmlman/htmlman5/sge_pe.html
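>>>
>>> To see what a PE currently does, assuming qconf is available:
>>>
>>>     qconf -spl           # list all parallel environments
>>>     qconf -sp openmpi    # show one definition, incl. allocation_rule
>>>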
>>> Rayson
>>>
>>>
>>>
>>> On 11/18/08, Robert Healey <healer at rpi.edu> wrote:
>>>> Hello.
>>>>
>>>> I'm currently running Grid Engine across a 1032-processor/129-node
>>>> cluster.  Most of my submitted jobs are parallel MPI jobs, with 8-208
>>>> slots requested per job.  I've been finding that even with 3-4 idle
>>>> nodes, an 8-slot job will be split among 2-3 nodes, when the ideal in
>>>> my circumstances is to run all 8 slots on a single 8-core node.  I've
>>>> defined all the nodes as having 8 slots, and am looking for things in
>>>> the scheduler config to tweak to better schedule the CPU time.
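>>>>
>>>> (For reference, a submission in this scenario presumably looks
>>>> something like the following; the script name is hypothetical:
>>>>
>>>>     qsub -pe openmpi 8 mpi_job.sh
>>>>
>>>> The question is why these 8 slots end up spread across several hosts
>>>> rather than filling one idle node.)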
>>>>
>>>> Thank you.
>>>> --
>>>> Bob Healey
>>>> Systems Administrator
>>>> Physics Department, RPI
>>>> healer at rpi.edu
>>>>
>>>
>>
>> -- 
>> Bob Healey
>> Systems Administrator
>> Physics Department, RPI
>> healer at rpi.edu
>>
>
> http://gridengine.info/
> ===========End of original message text===========
>



