[GE users] Functional Policy & Job Sharing in a User Department

Andreas.Haas at Sun.COM
Fri Jun 23 11:14:41 BST 2006

On Thu, 22 Jun 2006, Steve Pittard wrote:

> Hello,
> According to the SGE documentation, the functional policy does not remember
> past usage. However the manual indicates that the Functional setup does apply
> some preference:
> "The functional policy setup ensures that a defined share is guaranteed to each
> user, project, or department at any time. Jobs of users, projects, or departments
> that have used fewer resources than anticipated are preferred when the system
> dispatches jobs to idle resources."
> (See http://docs.sun.com/app/docs/doc/817-5677/6ml49n2bt?q=functional+policy&a=view )
> Question #1 - By what method is the above "preference" determined/computed for
> dispatch?

It works based on tickets. For example:

* configure a total number of 1,000,000 functional tickets
   in 'weight_tickets_functional' of sched_conf(5)
* specify fshare of 25 in project(5) for 'projectA'
* specify fshare of 75 in project(5) for 'projectB'
* use 'true' for share_functional_shares in sched_conf(5)

If you then submit two series of jobs with "-P projectA" and
"-P projectB" respectively, the scheduler aims for a 25:75 ratio of jobs
between the two projects.
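
As a rough illustration of the arithmetic behind that ratio, here is a minimal sketch. It assumes a plain proportional split of the ticket pool by fshare; the function name is hypothetical, and this is not the actual computation in the sources, which is considerably more involved.

```python
def split_functional_tickets(total_tickets, fshares):
    """Split a ticket pool across projects in proportion to their fshare.

    Simplified model for illustration only; not the real scheduler code.
    """
    total_share = sum(fshares.values())
    return {name: total_tickets * share / total_share
            for name, share in fshares.items()}


# Values taken from the example configuration above.
tickets = split_functional_tickets(1_000_000, {"projectA": 25, "projectB": 75})
for project, amount in tickets.items():
    print(project, amount)
```

Under this assumption projectA receives a quarter of the pool and projectB three quarters, which is where the 25:75 target comes from.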

If you're asking for details of ticket computation I must refer
you to libs/sched/sgeee.c in our sources.

> Question #2 - How long is it in effect? That is, how long before the scheduler
> determines that those users who have previously "used fewer resources" are now
> caught up?
> The above statement from the manual seems to suggest that the functional policy
> does have some notion of who has been using the queue, since it does, at least
> for some passage of time, favor newcomers to the queue. Observation supports this.

With pure functional scheduling (weight_tickets_share = 0) the scheduler
does not look at past resource usage. Decision making is based on current
resource utilization only. A job that was started one hour ago and is
still running does play its role in the equation, but only its current
resource allocation is considered.
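
To make "current utilization only" concrete, here is a toy sketch. It assumes a simple model in which each user's entitled fraction (from shares) is compared with the fraction of running tasks the user holds right now; the function name and the numbers are hypothetical, and the real ticket calculation differs.

```python
def most_underserved(shares, running_now):
    """Return the user whose current usage is furthest below the entitled share.

    Only what is running right now is considered; no usage history is kept.
    Toy model for illustration, not the actual scheduler logic.
    """
    total_share = sum(shares.values())
    total_running = sum(running_now.values()) or 1  # avoid division by zero

    def deficit(user):
        entitled = shares[user] / total_share
        actual = running_now.get(user, 0) / total_running
        return entitled - actual

    return max(shares, key=deficit)


# Hypothetical snapshot: equal shares, user1 currently holds most slots.
shares = {"user1": 10, "user2": 10, "user3": 10}
running_now = {"user1": 8, "user2": 2, "user3": 0}
print(most_underserved(shares, running_now))
```

In this model a newcomer with nothing running is preferred only because of the current snapshot. Once the running tasks even out, the preference disappears, with no memory of who ran more in the past.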

> Question #3 - Is there a way to impose a round robin effect wherein array subtasks
> get pulled, respectively, from user1, then user2, then user3 (they are all in the
> same dept), and then repeat? The goal is to show users that with the functional
> policy one user within a department is not getting a better share than another.
> Let me be more specific. I've configured 3 departments,
> each with 333 shares out of a possible 1,000. All users have 10 shares each.
> Now with respect to a single department:
> What I see is that when, for example, user1 from DeptA submits an array job to a
> fast.q, his jobs are dutifully dispatched and begin to process. Well, that's good.
> So he runs for quite a while and has jobs pending. Then along comes user2 from
> DeptA, and his array submissions jump to the top of the pending list and some of
> his jobs start making it onto the queue. He runs for a while. Then user3 from
> deptA comes along, and his array jobs rise to the top and some of his jobs start
> making it onto the queue.
> Okay, that's fine. Now what I see is user1's priority drop in the queue below
> user2 and user3, ostensibly because user2 and user3 haven't been using as many
> resources lately as user1. Though at some point it would seem that user1's jobs
> would become more or less equal in priority to user2's and user3's jobs.
> Is this correct? If so, how would one influence this equalization process?

Well, you describe correct behaviour here.

> They are all in the same department, of course, so they are in effect sharing
> the 33% of the cluster resources (based on what I specified in the functional
> policy setup). What I want to be able to do is to explain to users of deptA
> the underlying mechanism that makes this happen. I explain to them that they
> are sharing a share of the cluster resources, but what they want to see is
> something like a round robin approach wherein the scheduler takes one array
> subtask from user1, then user2, and user3, and then repeats. Note: I'm not saying
> this is desirable, but they have questions about why their jobs appear to be lower
> in computed priority than other users in their department. Perhaps if they saw a
> typical round robin dispatch amongst their jobs they would be happier.
> After staring at qstat -ext output it does appear that over time the shares
> within the department do get met, but when you are sitting down with a user
> showing them how his jobs (and those of others) are moving about and getting
> serviced, it isn't so easy to explain. I've pointed out that the scheduler is
> working to make their share allocations match, but a reading of the functional
> policy seems to suggest that allocations are more or less immediate, with perhaps
> the exception of the initial "preference" that the functional policy gives to
> those who have used fewer resources than anticipated.

I agree that 100% round robin behaviour would possibly be easier to communicate.
Let me try to explain it: the functional scheduler is not immediate, in the
sense that assignments made within a scheduling interval do not cause the
pending job list to be re-sorted immediately, as one might expect. The
reason is that redoing the functional ticket calculation after each
assignment within a scheduling run, and sorting the pending job list anew
each time, would cost quite a lot in terms of scheduler CPU. Imagine cases
where you have thousands of pending jobs.
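
The effect can be sketched with a toy model. It assumes users with equal shares are ranked by how many tasks they currently have running, and it compares ranking once per scheduling run with re-ranking after every assignment. The function, the ranking rule and the numbers are all hypothetical; the real scheduler is more elaborate.

```python
def dispatch(pending, running, free_slots, resort_each_time):
    """Return the order in which users receive free slots in one scheduling run.

    pending / running map user -> number of pending / running tasks.
    Toy model: equal shares, fewest running tasks ranks first.
    """
    pending = dict(pending)
    running = dict(running)
    order = []

    def ranked():
        # Users that still have pending tasks, least-served first;
        # the user name breaks ties deterministically.
        return sorted((u for u in pending if pending[u] > 0),
                      key=lambda u: (running[u], u))

    queue = ranked()  # ranking computed once at the start of the run
    for _ in range(free_slots):
        if resort_each_time:
            queue = ranked()  # costly: one full re-rank per assignment
        else:
            queue = [u for u in queue if pending[u] > 0]
        if not queue:
            break
        user = queue[0]
        pending[user] -= 1
        running[user] += 1
        order.append(user)
    return order


pending = {"user1": 5, "user2": 5, "user3": 5}
running = {"user1": 3, "user2": 0, "user3": 0}
print(dispatch(pending, running, 4, resort_each_time=False))
print(dispatch(pending, running, 4, resort_each_time=True))
```

In this model, ranking once per run lets the top-ranked user take every free slot of that run, while re-ranking after each assignment alternates between the least-served users and looks like round robin. The price is one re-rank per assignment, which is the cost described above.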

