Re: [GE users] PE config: allocation_rule

owissdorf oliver.wissdorf at boehringer-ingelheim.com
Thu Apr 15 09:01:21 BST 2010



Hello Ansgar,

there is a how-to on FILL_UP job distribution at:

http://blogs.sun.com/sgrell/entry/grid_engine_scheduler_hacks_least


With this setup, job distribution is no longer load-based; instead, a fill_up
mechanism uses the least used hosts first.

You have to create a load value. Edit the scheduler configuration with
qconf -msconf and set:


algorithm default 
schedule_interval 0:2:0 
maxujobs 0 
queue_sort_method load 
job_load_adjustments NONE 
load_adjustment_decay_time 0:0:0 
load_formula slots 
schedd_job_info true 
flush_submit_sec 1 
flush_finish_sec 1
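
To check afterwards that the three settings that matter here were really saved, something like the following might help. It is only a sketch: check_sched_conf is a made-up helper name, and it assumes that qconf -ssconf prints one "key value" pair per line as in the listing above.

```shell
#!/bin/sh
# Sketch: compare the live scheduler configuration with the values above.
# check_sched_conf is a hypothetical helper, not part of Grid Engine.
check_sched_conf() {
    # qconf -ssconf shows the current scheduler configuration
    conf=$(qconf -ssconf) || return 1
    for want in "queue_sort_method load" \
                "load_formula slots" \
                "job_load_adjustments NONE"; do
        key=${want% *}   # text before the space
        val=${want#* }   # text after the space
        got=$(printf '%s\n' "$conf" | awk -v k="$key" '$1 == k { print $2 }')
        if [ "$got" != "$val" ]; then
            echo "mismatch: $key is '$got', expected '$val'"
            return 1
        fi
    done
    echo "scheduler configuration matches"
}
```

Call check_sched_conf on a host where qconf is available; it stops at the first setting that differs.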

Then set the complex value on every node:


qconf -mattr exechost complex_values slots=8 $host

In my example I have 8 cores per node. 
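
To avoid typing the command once per node, it can be wrapped in a loop over the output of qconf -sel (the list of execution hosts). A minimal sketch, assuming every host really has the same number of cores; SLOTS and set_slots_everywhere are illustrative names, not Grid Engine settings:

```shell
#!/bin/sh
# Sketch: set the slots complex on every execution host.
# Assumption: all hosts have the same core count (8 in this example).
SLOTS="${SLOTS:-8}"

set_slots_everywhere() {
    # qconf -sel prints one execution host name per line
    for host in $(qconf -sel); do
        qconf -mattr exechost complex_values "slots=$SLOTS" "$host"
    done
}
```

Running set_slots_everywhere applies the value to all hosts; on a mixed cluster the value would have to be chosen per host instead.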

I hope this helps.

Regards,

Oliver


-----Original Message-----
From: aeszter [mailto:Ansgar.Esztermann at mpi-bpc.mpg.de] 
Sent: Wednesday, 14 April 2010 10:30
To: users at gridengine.sunsource.net
Subject: [GE users] PE config: allocation_rule

Hello everyone,


I am looking for advice regarding our PEs' allocation_rules. We would like
our jobs to be compact, i.e. use as few nodes as possible. On a previous
system running torque, our users would call qsub with something like -l
nodes=4:ppn=8, so a 32-CPU job would be started on exactly 4 nodes. If such an
assignment was not available, the job would remain idle.
This has the disadvantage that users need to know the number of CPU cores per
node, but otherwise works fine.

Initially, I thought that $fill_up would be the way to go. However, on a
cluster heavily loaded with differently-sized jobs, we observed that jobs
tend to get "broader": after a few days, 32 cores might be allocated as
8+6+4+4+3+3+2+1+1 or so.
So we've tried allocation_rule 8 instead. This does work fine for jobs
requiring n*8 slots, but smaller ones (say, 4 slots) will not start at all.
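
The behaviour with a fixed allocation_rule comes down to simple arithmetic: as far as the sge_pe man page describes it, a job is only scheduled if the requested slot count divides by the rule without remainder. A small sketch of that check; ALLOC and fits_allocation_rule are illustrative names only:

```shell
#!/bin/sh
# Sketch: which slot requests can be satisfied by an integer allocation_rule.
# ALLOC mirrors "allocation_rule 8" from this thread.
ALLOC=8

fits_allocation_rule() {
    # $1 = number of slots requested via qsub -pe <pe_name> <slots>
    [ $(( $1 % ALLOC )) -eq 0 ]
}

for n in 4 8 12 32; do
    if fits_allocation_rule "$n"; then
        echo "$n slots: $(( n / ALLOC )) host(s) with $ALLOC slots each"
    else
        echo "$n slots: not a multiple of $ALLOC, job stays pending"
    fi
done
```

This is why the 32-slot job lands on exactly 4 hosts while a 4-slot job never starts in that PE.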

Any thoughts?

BTW, is there any way to get a more detailed explanation for qalter -w [pv]'s
response "cannot run in PE xxx because it only offers 0 slots"?


Thanks a lot,

A.
-- 
Ansgar Esztermann
DV-Systemadministration
Max-Planck-Institut für biophysikalische Chemie, Abteilung 105

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=253354

To unsubscribe from this discussion, e-mail:
[users-unsubscribe at gridengine.sunsource.net].
