[GE users] Questions on functional policy, share tree policy, and priority queue
Dan.Templeton at Sun.COM
Tue May 27 19:34:27 BST 2008
I've added comments inline in the original email...
>>>> I'm new to Sun Grid Engine. I have it set up on a small Rocks cluster
>>>> running SGE 6.0 for the three owners of the company I work for,
>>>> but want a
>>>> more fair scheduling policy than the default FIFO policy. I want
>>>> to ensure
>>>> that all three owners get approximately equal access to the
>>>> cluster. All
>>>> jobs are single batch jobs or array batch jobs, in an embarrassingly
>>>> parallel situation. Each of the compute nodes has two quad-core CPUs.
>>>> In particular, I want to ensure the following:
>>>> 1. If there is no contention for CPUs, a user should get access to
>>>> as many
>>>> CPUs as they need (equal to either the number of jobs they have
>>>> run or the
>>>> number of CPUs, whichever is less).
>>>> 2. If two or more users need to compete for CPUs, they should each
>>>> get a
>>>> "fair share" of CPU time.
>>>> 3. In case of urgent need, we should be able to run jobs that will
>>>> suspend any currently running jobs.
>>>> It seems to me that I want to set up either a functional policy or
>>>> a share
>>>> tree policy. In addition, I want a queue that supersedes the
>>>> default all.q.
>>>> I have read through the documentation, and it looks like this is
>>>> what I
>>>> should set up:
>>>> * Add high-priority queue: qconf -aq priority.q
>>>> * Add all.q to priority.q subordinate list: qconf -mattr queue
>>>> subordinate_list all.q priority.q
>>>> * Set priority of priority.q to highest: qconf -mattr queue
>>>> priority -20
>>>> * In conf, set auto_user_fshare to 100
>>>> * Check that fshare is 100 for all users listed in qconf -suserl
>>>> * In sconf, set halftime to 48, compensation_factor to 5
>>>> * Add stree: id=0, name=default, type=0, shares=100, childnodes=NONE
>>>> * For functional policy: in sconf, set weight_tickets_functional
>>>> to 10000,
>>>> weight_tickets_share to 0
>>>> * For share tree policy: in sconf, set weight_tickets_functional to 0,
>>>> weight_tickets_share to 10000
You never want to have both a share tree policy and a functional ticket
policy active at the same time. It would technically work, but the
complexity would make it very hard to manage. You'd never know exactly
what was causing the scheduling order. In your case, you just want to
use the functional policy.
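For reference, a minimal functional-policy setup might look like the
following (a sketch against SGE 6.0 defaults; the values are the ones
proposed above, not the only reasonable choices):

    # Give all scheduling tickets to the functional policy
    qconf -msconf    # in the editor, set:
                     #   weight_tickets_functional  10000
                     #   weight_tickets_share       0
    # Give every automatically created user an equal functional share
    qconf -mconf     # in the editor, set:
                     #   auto_user_fshare  100

With equal fshare values, pending jobs from competing users are
interleaved so that each user gets roughly the same number of slots.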
>>>> Some specific questions I have on these settings are:
>>>> 1. Is there a possibility that a user may not be able to use all
>>>> CPUs on the cluster if there are enough jobs to use all CPUs?
Reuti misunderstood the question. The answer is: no. Neither the
functional nor the share tree policy will block access to slots if there
is no contention for them; both policies only affect the order in which
pending jobs are scheduled.
>>>> 2. Will the functional and share tree policy settings not
>>>> interfere with
>>>> each other as long as either weight_tickets_functional or
>>>> weight_tickets_share are 0?
>>> You could even set "policy_hierarchy" in the scheduler config to use
>>> only one of the two and leave the other out.
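To illustrate Reuti's point: the policy_hierarchy setting in the
scheduler configuration is a string of up to three letters drawn from O
(override), F (functional), and S (share tree). To consider only the
functional policy, you could do something like:

    qconf -msconf    # in the editor, change:
                     #   policy_hierarchy  OFS
                     # to:
                     #   policy_hierarchy  F

This makes it explicit that the share tree is out of the picture, rather
than relying on its ticket weight being zero.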
>>>> 3. Will submitting jobs to priority.q always immediately cause at
>>>> least some
>>>> currently running jobs to be suspended? Will it cause all
>>>> currently running
>>> Well, in your setup the all.q instance on a node will be suspended
>>> when *all* slots of priority.q are filled on a particular machine.
>>> You could setup: subordinate_list all.q=1
>>> and as soon as one slot is used in priority.q the complete queue
>>> instance on this machine is suspended. Possibly blocking 7 other
>>> jobs. It's not implemented for now to suspend slots instead of
>>> complete queue instances. What I would suggest: fill the cluster
>>> from the one side with all.q, and from the other with priority.q
>>> to minimize the effect by using sequence numbers and setting up
>>> the scheduler to sort queues by seqno instead of load.
To clarify what Reuti means, the scheduler has two modes of scheduling.
It can either first pick a host based on load and then a queue on that
host based on sequence number (the default), or it can first pick a
queue based on sequence number and then a host from the queue's host
list based on load. If you change the scheduler to pick the queue first
(set queue_sort_method to seqno in the scheduler config), you can
control the order that the hosts are selected by setting the queue
instances' seq_no values. (The lowest sequence numbers will be filled
first.) You want to be careful, though, not to give every host's queue
instances a different sequence number, or it will fill up machines
instead of round-robining the jobs around the cluster.
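As a sketch, that setup might look like this (the host name in brackets
is hypothetical; the bracketed form is the per-queue-instance override
syntax from queue_conf):

    # Schedule by queue sequence number rather than by host load
    qconf -msconf    # in the editor, set:
                     #   queue_sort_method  seqno
    # all.q is considered before priority.q everywhere...
    qconf -mattr queue seq_no 10 all.q
    qconf -mattr queue seq_no 20 priority.q
    # ...and a single queue instance can be overridden if needed,
    # e.g. to make priority.q fill from the other end of the cluster:
    qconf -mattr queue seq_no "20,[node4=15]" priority.q

Per the warning above, keep per-host overrides to the minimum needed;
giving every host a distinct number defeats round-robin placement.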
>>>> jobs to be suspended if a user submits enough jobs?
>>>> 4. Does setting the -20 priority of priority.q have an effect?
>>> Don't do this! Nice values below zero are reserved for system
>>> tasks! For user jobs only use 0 (high) to 19 (low).
The reason Reuti says not to do that is because a job that runs at -20
nice value can seriously disrupt the functioning of the host on which it
runs, including the execution daemon, which runs at -10. If the
priority queue is used only for special purposes, a negative value might
be OK. If it's going to do any substantial amount of work, cutting it
off at 0 is a good idea.
In any case, setting the queue priority has no effect on how the
scheduler orders the jobs in the pending list. It only affects how the
operating system treats the job once it's scheduled.
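If you do want priority.q jobs to run at a better nice value without
risking system processes, something like this (a sketch) keeps it within
the user range Reuti describes:

    # 0 is the highest nice value that's safe for user jobs
    # (19 is the lowest; negative values are reserved for the system)
    qconf -mattr queue priority 0 priority.q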
>>>> 5. Are there any settings I have missed?
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net