[GE users] Does nice always work when determining which waiting job to assign to a node first?

Reuti reuti at staff.uni-marburg.de
Tue Apr 12 21:43:16 BST 2005



Jim,

I remember the subordination from the "Running 1, 2, 3, or 4 jobs..." thread. 
But this now looks like a different setup. 

Quoting "Marconnet, James E Mr /Computer Sciences Corporation" 
<james.marconnet at smdc.army.mil>:

> Reuti:
> 
> I have two technical groups, each with what we call a primary queue and a
> secondary queue. Call the queues G1p, G1s, G2p, G2s. The nodes are split
> evenly between the two groups. G1p contains the same nodes as G2s, except
> with nice=0 or 19 respectively. And so forth. The idea was to allow both

When one queue is always blocked via the subordination, there is no 
difference whether a single job on a machine runs with nice=0 or nice=19 - it 
will get all of the CPU time.

> groups to use as many available nodes as possible, without wasting half the
> nodes when no one in the other group needs to run anything. 
> 
> To prevent running 2 or more jobs on nodes simultaneously, G1p is
> subordinate to G2s, and so forth, per your suggestion in another thread.
> 
> We thought that setting nice would affect the scheduling of waiting jobs to
> nodes. Turns out it does not. 

Using the nice values could achieve the requested effect in combination with 
the seq_no:

- I assume you have already set up two user lists, so that each group can only 
run in the queues they should use.

- queue G1p:

hostlist @partA
priority 0
seq_no 50

- queue G1s:

hostlist @partB
priority 19
seq_no 100

- queue G2p:

hostlist @partB
priority 0
seq_no 50

- queue G2s:

hostlist @partA
priority 19
seq_no 100
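Pulled together, the relevant part of one such cluster queue definition might 
look like the following (a sketch only; the access list name `g1users` is an 
assumption, and the exact attribute layout depends on your SGE 6.0 setup):

```
qname                 G1p
hostlist              @partA
seq_no                50
priority              0
user_lists            g1users
```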


Setting the sort order to seqno will still use load balancing between queue 
instances with the same seq_no.

- Jobs submitted by G1 will first fill the machines in queue G1p; if they 
submit more, then G1s will be used.

- When G2 now submits some jobs, they will go first to the @partB nodes, 
selecting the one with the least load.

- If G2 now gets some nodes where a G1s job is already running, the G2 jobs 
will get most of the CPU time due to the different nice values for the two 
queues on the same machines. Of course, it may be a point of discussion whether 
your users will always grant this to the other group. (The only options I see: 
suspend the G1s job [which you don't like], or wait until the G1s job has 
finished, i.e. drain the G1s slot *.)

- Each group can request their primary queue in qsub, if they don't want to run 
in the background.
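A submission might then look like this (the job script name is hypothetical):

```
qsub -q G1p job.sh
qsub -q G1p,G1s job.sh    # allow either queue of group 1
```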

Will this come close to the request of your groups? - Reuti


*) This is what you currently want to achieve with the blocking via the 
subordination, I think, but depending on the policy, the scheduler may grant 
the freed slot to a member of the wrong group after a job finishes. Something 
like "never use G1s if G2 jobs are waiting" - independent of any other policy - 
would help. I'm thinking of a load_sensor which will block G1s if the count of 
pending G2 jobs is > 0. Then of course you wouldn't need any priority/nice 
values at all.
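The load sensor idea above could be sketched roughly as follows. This is a 
non-authoritative sketch: the load value name `g2_pending`, the group-2 user 
names, and the qstat column parsing are all assumptions; a real sensor just has 
to follow the SGE load-sensor protocol (read a line from stdin per cycle, 
report host:name:value lines between "begin" and "end", quit on "quit"):

```python
import subprocess
import sys

def count_pending_g2(qstat_output: str) -> int:
    """Count pending ("qw") jobs owned by group-2 users in `qstat -u '*'` output."""
    g2_users = {"alice", "bob"}  # hypothetical members of group 2
    count = 0
    for line in qstat_output.splitlines():
        fields = line.split()
        # data rows look like: job-ID prior name user state ...
        if len(fields) >= 5 and fields[0].isdigit():
            user, state = fields[3], fields[4]
            if user in g2_users and state == "qw":
                count += 1
    return count

def main() -> None:
    # SGE load-sensor protocol: one report per line read from stdin.
    while True:
        line = sys.stdin.readline()
        if not line or line.strip() == "quit":
            break
        out = subprocess.run(["qstat", "-u", "*"],
                             capture_output=True, text=True).stdout
        print("begin")
        # a load_threshold on g2_pending in queue G1s would then close
        # G1s whenever G2 jobs are waiting
        print(f"global:g2_pending:{count_pending_g2(out)}")
        print("end")
        sys.stdout.flush()

# sge_execd would start this script via the load_sensor parameter
# of the host or global configuration, which then calls main().
```

The queue G1s would get a load_threshold like `g2_pending=1`, so it stops 
accepting jobs as soon as the sensor reports a pending G2 job.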

> 
> So if G1 submits a gazillion jobs to both his group's primary and secondary
> queues, then all the cluster nodes get used very efficiently, but nobody in
> G2 gets to start even one job till all the G1 jobs begin and then at least
> one job finishes.
> 
> Yes, we have load thresholds set (since there is some interactive testing
> and since some nodes are dual-processor, hyperthreaded - another story!),
> but we are not currently suspending jobs. Some users have short jobs, and
> some have much longer jobs. So suspending jobs seemed iffy.
> 
> Probably clear as mud, but hope it helps someone understand and suggest an
> approach.
> 
> Jim
> 
> -----Original Message-----
> From: Reuti [mailto:reuti at staff.uni-marburg.de] 
> Sent: Tuesday, April 12, 2005 1:36 PM
> To: users at gridengine.sunsource.net
> Subject: RE: [GE users] Does nice always work when determining which waiting
> job to assign to a node first?
> 
> Hi Jim,
> 
> I'm still not sure about your intended setup. You have two cluster queues -
> one with nice=0, the other with nice=19 and any load_thresholds and
> subordination?
> 
> One user will submit qsub -q nice0que, the other qsub -q nice19que?
> 
> Quoting "Marconnet, James E Mr /Computer Sciences Corporation" 
> <james.marconnet at smdc.army.mil>:
> 
> > Bummer, we thought the nice value would affect the order in which 
> > waiting jobs were assigned to the nodes. Apparently not so.
> > 
> > I searched the Admin Manual on seq_no, and I did not see where that 
> > could be used unless we wanted to give up sorting by load level to 
> > balance out the load on the nodes instead of filling up the first node 
> > completely, then the next one fully, etc. And it's not at all clear 
> > how this would be used anyway. Anyone able to clarify it?
> > 
> > Reading from the Admin manual: 
> > Without any administrator influence, the order is first-in-first-out 
> > (FIFO).
> > 
> > The administrator has the following means to control the job order:
> > - Ticket-based job priority. ....
> > - Urgency-based job priority. ....
> > - POSIX priority. ....
> > 
> > Is there an easy way to tie one of these methods to the queue which was
> > specified? I don't want the user to have to specify additional options
> > (that I have to explain and police) other than the queue if it can be
> > helped.
> > 
> > And we'd prefer not to suspend jobs, but to let the running jobs
> > complete before starting new jobs. Suspending jobs would wreak havoc
> > on our completion predictions.
> 
> But you mentioned subordinated queues - they will be suspended then.
> 
> CU - Reuti
> 
> > 
> > Perhaps I just want too much!?
> > 
> > Thanks,
> > Jim
> > 
> > -----Original Message-----
> > From: Reuti [mailto:reuti at staff.uni-marburg.de]
> > Sent: Tuesday, April 12, 2005 10:40 AM
> > To: users at gridengine.sunsource.net
> > Subject: Re: [GE users] Does nice always work when determining which 
> > waiting job to assign to a node first?
> > 
> > Hi,
> > 
> > nice is not used for scheduling, but you can use a seq_no for the two
> > queue types to fill the nice=0 queue first. But suspending a nice=19
> > queue - hmm, then this queue could also just have a nice of 0, as it's
> > suspended anyway if the nice=0 queue is filled (if I got you
> > correctly).
> > 
> > CU - Reuti
> > 
> > 
> > Marconnet, James E Mr /Computer Sciences Corporation wrote:
> > > Using 6.0u3. Had some reports yesterday that some waiting jobs from
> > > a queue with nice=0 were still waiting after some waiting jobs from a
> > > different queue with nice=19 started running on nodes previously
> > > running jobs with nice=19. The "wronged" users figuratively went on
> > > the warpath soon afterwards.
> > > 
> > > We are using queue subordination to prevent too many jobs from
> > > different queues from running on the same nodes at the same time, but
> > > that works on a node-by-node basis, and it seemed to be working OK.
> > > 
> > > Anything in particular I should know about this or look for,
> > > settings-wise?
> > > 
> > > Thanks!
> > > Jim Marconnet
> > > 
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > For additional commands, e-mail: users-help at gridengine.sunsource.net
> > 
> > 
> 
> 
> 
> 
> 







More information about the gridengine-users mailing list