[GE users] PE config: allocation_rule

reuti reuti at staff.uni-marburg.de
Wed Apr 14 10:54:32 BST 2010


On 14.04.2010 at 10:29, aeszter wrote:

> I am looking for advice regarding our PEs' allocation_rules. We would like our jobs to be compact, i.e. use as few nodes as possible. On a previous system running Torque, our users would call qsub with something like -l nodes=4:ppn=8, so a 32-CPU job would be started on exactly 4 nodes. If such an assignment was not available, the job would remain idle.
> This has the disadvantage that users need to know the number of CPU cores per node, but otherwise works fine.


> Initially, I thought that $fill_up would be the way to go. However, on a cluster heavily loaded with differently-sized jobs, we observed that jobs tend to get "broader": after a few days, 32 cores might be allocated as 8+6+4+4+3+3+2+1+1 or so.
> So we've tried allocation_rule 8 instead. This does work fine for jobs requiring n*8 slots, but smaller ones (say, 4 slots) will not start at all.


> Any thoughts?

You can define a second PE with allocation_rule 4 and give the two PEs matching names, e.g. mpi8 and mpi4. You can then request mpi* as the PE (quote it to prevent expansion by the shell), and the scheduler will use either of them. Unfortunately, PEs cannot be ordered by a sequence number or the like; the scheduler will just try both and pick one of them.
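
To illustrate which PE a wildcard request can end up in, here is a minimal Python sketch. It assumes only the rule that a fixed integer allocation_rule requires the requested slot count to be a multiple of that integer. The function name candidate_pes is made up for illustration, and the sketch ignores how many slots are actually free on the hosts.

```python
# PE names from the suggestion above, mapped to their fixed allocation_rule.
PES = {"mpi8": 8, "mpi4": 4}

def candidate_pes(slots, pes=PES):
    """Return {pe_name: hosts_used} for every PE whose fixed
    allocation_rule divides the requested slot count evenly."""
    return {name: slots // rule
            for name, rule in pes.items()
            if slots % rule == 0}

# Print the PEs that could host a job of each size.
for slots in (4, 12, 32):
    print(slots, candidate_pes(slots))
```

Under these assumptions a 4-slot or 12-slot job can only match mpi4, while a 32-slot job matches both PEs and may therefore be placed on 8 nodes via mpi4 instead of 4 nodes via mpi8. Users who need the compact layout for n*8 slots would have to request mpi8 explicitly rather than mpi*.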

> BTW, is there any way to get a more detailed explanation for qalter -w [pv]'s response "cannot run in PE xxx because it only offers 0 slots"?

AFAIK no. It's sometimes not obvious whether it is because of used-up slots, memory requests, or the like.

-- Reuti

