[GE users] how to throttle jobs into a queue

david zanella zanella at mayo.edu
Fri Aug 24 20:37:38 BST 2007


I'm using 6.1beta (I think). I picked it up just days before the official 
release, so it may actually be the official 6.1 version; at least that's what 
it says in the messages file:
 
07/31/2007 09:38:28|schedd|hsrnfs-101|I|starting up N1GE 6.1 (sol-sparc64)

At any rate, I did some more googling and found this:

http://bioinformatics.org/pipermail/bioclusters/2004-December/002146.html

That gave me the idea to totally redline np_load_avg. Via qconf -msconf:

job_load_adjustments              np_load_avg=100
load_adjustment_decay_time        00:15:00
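
As a sanity check, qconf -ssconf should echo the same two lines back, e.g.:

qconf -ssconf | egrep 'job_load_adjustments|load_adjustment_decay_time'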

It works!

qstat -j <jobnum> says:

queue instance "cc32@crush.mayo.edu" dropped because it is overloaded:
np_load_avg=2.031982 (= 0.125732 + 100 * 0.610000 with nproc=1) >= 1.75

So, the documentation isn't all that clear. I figured that it "bumped"
the np_load_avg by the job_load_adjustments amount. Instead, it feeds
that value into the np_load_avg equation and THEN divides by the number
of CPUs.
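
Working backwards from that qstat line, the arithmetic only comes out
right if the adjustment gets divided by the 32 hardware threads on a
T2000 rather than the nproc=1 the message prints, so my guess at the
effective formula is roughly:

adjusted np_load_avg = raw np_load_avg + (job_load_adjustments * decay factor) / ncpu
                     = 0.125732 + (100 * 0.610000) / 32
                     = 2.031982

with 0.610000 presumably being the fraction of the 15-minute decay
window still in effect (that last part is a guess on my part).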

Based on your note below, it's very likely that the calculation was off
in 6.1 and fixed in u2. I'll get u2 downloaded and see if there is any
difference in the calculation.

I don't care either way...I've got the cluster to throttle incoming
jobs, so I'm happy.


> I installed a 6.1u2 cluster, and load adjustments appear to work again.  
> I also went through the internal issue tracker, and I can't find any 
> mention of this issue, so it must have been silently (or accidentally) 
> fixed with u1.  (u2 is a *very* minor release).  A clean install of 6.1 
> has the problem, and a clean install of 6.1u2 does not.
> 
> Daniel
> 
> Daniel Templeton wrote:
> > David,
> >
> > Are you using 6.1?  I just tried the same thing with my 6.1 cluster, 
> > and it also had no effect.  I tried the same thing with my 6.0u10 
> > cluster and it worked.  I'm now downloading the latest 6.1u2 binaries 
> > to try it there as well.  I don't see an issue listed for the problem, 
> > but it may have been fixed in an update release nonetheless.
> >
> > Daniel
> >
> > david zanella wrote:
> >> I agree that this will probably work, but it isn't exactly what I'm
> >> looking for.
> >>
> >> In my case, the users are submitting several thousand jobs at a time.
> >> They cannot predict (or don't want to take the time to) how much
> >> memory a job will use. If they flag each job as using 2G of memory,
> >> then the consumable resource will run out at 15 or 16 jobs. With my
> >> current load thresholds I'm getting 22-27 jobs on each server, so I
> >> would lose a lot of throughput by doing this.
> >> Using qconf -msconf and changing job_load_adjustments from
> >> np_load_avg=0.5 to np_load_avg=2.0 with a load_adjustment_decay_time of
> >> 15 minutes *SHOULD* do it (man sched_conf)... but it doesn't seem to be
> >> having any effect. That is, upon each job submission it should
> >> artificially raise the np_load_avg to 2.0 (the alarm is set at 1.75) and
> >> then decay that adjustment back down over 15 minutes. That should give
> >> the job enough time to ramp up, start using memory, and trip my memory
> >> and swap triggers.
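> >>
> >> In scheduler-config terms, that would be roughly:
> >>
> >> job_load_adjustments              np_load_avg=2.0
> >> load_adjustment_decay_time        00:15:00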
> >>
> >>
> >>
> >> ------------- Begin Forwarded Message -------------
> >>
> >> From: "Kogan, Felix" <Felix-Kogan at deshaw.com>
> >> To: <users at gridengine.sunsource.net>
> >> Subject: RE: [GE users] how to throttle jobs into a queue
> >>
> >> I've had the same problem and came up with the following solution (still
> >> in testing phase):
> >>
> >> o Make mem_free a requestable and consumable attribute
> >>
> >>     $ qconf -sc
> >>     #name            shortcut  type    relop requestable consumable default  urgency
> >>     #--------------------------------------------------------------------------------
> >>     ...
> >>     mem_free         mf        MEMORY  <=    YES         YES        0        0
> >>     ...
> >>
> >> o Set the resource value to the real amount of RAM for each node
> >>
> >>     qconf -mattr exechost complex_values mem_free=32G hostname.foo.bar.com
> >>
> >> Once this is done, users can use "-l mem_free=2G" to really reserve 2GB
> >> of RAM. The mem_free reading of the host where the job executes will
> >> show 2GB less mem_free. If the job in fact consumes 2.5GB, mem_free
> >> will reflect that. I.e., SGE uses the smaller of the two values: the one
> >> calculated from internal accounting and the one reported by the load
> >> sensor. This works for all other standard or custom requestable and
> >> consumable attributes, as long as a custom load sensor is set up for
> >> them (e.g. you can set this up for /var/tmp space).
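> >>
> >> For example, a job that needs 2GB would be submitted with something
> >> like (the script name is just a placeholder):
> >>
> >>     qsub -l mem_free=2G myjob.sh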
> >>
> >>
> >> Hope that helps.
> >>
> >> -- 
> >> Felix Kogan
> >>
> >> -----Original Message-----
> >> From: david zanella [mailto:zanella at mayo.edu]
> >> Sent: Friday, August 24, 2007 11:46 AM
> >> To: users at gridengine.sunsource.net
> >> Subject: [GE users] how to throttle jobs into a queue
> >>
> >>
> >> I have a group of users that are submitting jobs to my grid.  The jobs
> >> do some sort of pedigree/chromosome calculations. It is impossible for
> >> the users to predict or control the amount of memory each job will use.
> >> Consequently, some jobs will start out small, grow to about 2G in size,
> >> and run for weeks, while other jobs can be as small as a few hundred
> >> meg and finish up in an hour.
> >>
> >> I have set up load thresholds that will suspend job submission if the
> >> available mem_free < 2G or swap_used > 6G.  For the most part, this
> >> works well.  I have 7 T2000s as execute hosts.
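> >>
> >> (In queue-config terms, that is roughly a load_thresholds line like
> >>
> >>     load_thresholds    mem_free=2G,swap_used=6G
> >>
> >> set via qconf -mq on each queue.)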
> >>
> >> Here's the problem:
> >>
> >> My T2000s have 32G of memory and I have 30 slots on each. With the
> >> load thresholds in place, say a server is only running 20 jobs. A job
> >> completes and the server is now below its load threshold. The qmaster
> >> sees this and immediately shoves 11 jobs at the server.  Pretty soon
> >> the jobs grow, I run out of memory and swap, and jobs start crashing.
> >>
> >> What I need is some way to throttle the acceptance rate to the server:
> >> tell it to accept one job, then re-evaluate in, say, 15 or 30 minutes,
> >> and if the load thresholds give a green light, accept another job.
> >>
> >> I've looked at sched_conf, and it appears to have what I need.
> >> I've made various adjustments to job_load_adjustments and
> >> load_adjustment_decay_time, but they have not had any effect.
> >>
> >> Am I missing something? Is there a better way to accomplish what I'm
> >> trying to do?
> >>
> >>
> >>
> >> ------------- End Forwarded Message -------------
> >>
> >
> >
> 




