[GE users] how to throttle jobs into a queue

Andreas.Haas at Sun.COM
Fri Aug 24 18:39:06 BST 2007


On Fri, 24 Aug 2007, david zanella wrote:

>
> I agree that this will probably work, but it isn't exactly what I'm looking
> for.
>
> In my case, the users are submitting several thousand jobs at a time. They
> cannot predict (or don't want to take the time to) how much memory a job will
> use. If they flag each job as using 2G of memory, then the consumable resource
> will run out at 15 or 16 jobs. Using my current load thresholds, I'm getting
> 22-27 jobs on each server. I lose a lot of throughput if I do this.

Couldn't you use a statistical average, either for memory consumption
or for the number of jobs that can run concurrently? Memory would
still be over- or underutilized in individual cases, but on average
you would come close to ideal memory utilization.
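
To make the arithmetic concrete, here is a rough sketch in Python. The
sample values are made up, and the 32G capacity is only inferred from
your remark that a 2G request runs out at 15 or 16 jobs; neither number
comes from a real measurement.

```python
import statistics

def jobs_per_host(capacity_gb, request_gb):
    """Number of jobs a consumable of capacity_gb admits at request_gb each."""
    return int(capacity_gb // request_gb)

# Hypothetical peak memory usage (GB) of earlier jobs, for illustration only.
samples = [0.8, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 2.0]

# Assumed per-host capacity of the memory consumable (not stated in the thread).
capacity = 32.0

# Request the observed mean instead of the 2G worst case.
mean_request = statistics.mean(samples)

print(jobs_per_host(capacity, 2.0))           # worst-case request
print(jobs_per_host(capacity, mean_request))  # average-based request
```

With these invented numbers the average-based request admits 24 jobs
per host instead of 16, which is in the range you get from your load
thresholds today.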

If that doesn't work, another (dirty) approach would be to hold back
the start of these memory-hungry applications whenever

    $SGE_ROOT/utilbin/<os-arch>/loadcheck

reports that memory on the local execution node is running low. With
such a mechanism memory could still be overcommitted in some cases,
but the race condition would certainly be less severe.
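
A wrapper along these lines might look like the following Python
sketch. It assumes loadcheck prints one "name value" pair per line with
a line such as "mem_free 1407.000000M"; please check that against the
output on your own hosts. The function names, the threshold and the
polling values are all hypothetical, and the path to loadcheck has to
be supplied by the caller.

```python
import re
import subprocess
import time

# Binary multipliers for the upper-case suffixes assumed in the output.
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}

def run_loadcheck(binary):
    """Run loadcheck (path supplied by the caller) and return its stdout."""
    result = subprocess.run([binary], capture_output=True, text=True, check=True)
    return result.stdout

def parse_loadcheck(text):
    """Turn 'name value' lines into a dict of strings."""
    values = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            values[parts[0]] = parts[1].strip()
    return values

def to_bytes(value):
    """Convert a value such as '1407.000000M' to a number of bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMG]?)", value.strip())
    if match is None:
        raise ValueError(f"unrecognised memory value: {value!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]

def memory_is_tight(text, min_free_bytes):
    """True if the reported mem_free is below the chosen threshold."""
    return to_bytes(parse_loadcheck(text)["mem_free"]) < min_free_bytes

def wait_for_memory(read_text, min_free_bytes, poll_seconds=30,
                    max_polls=20, sleep=time.sleep):
    """Poll until enough memory is free; return False if it never is."""
    for _ in range(max_polls):
        if not memory_is_tight(read_text(), min_free_bytes):
            return True
        sleep(poll_seconds)
    return False

# Made-up sample output, used only to exercise the parser.
SAMPLE = "arch            lx24-amd64\nnum_proc        4\nmem_free        1407.000000M\n"
print(memory_is_tight(SAMPLE, 2 * 1024**3))
```

In a job wrapper this could be called as
wait_for_memory(lambda: run_loadcheck(path), 2 * 1024**3) before the
application itself is started.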

Note that, as a variation, instead of holding back the job start you
could trigger rescheduling by letting the jobs (or the prolog script)
exit with 99 (see FORBID_RESCHEDULE in sge_conf(5)). Continuous job
rescheduling can be prevented by sleeping before the exit 99.
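
The exit-99 variant could be sketched like this in Python. The
memory_is_tight argument is a hypothetical stand-in for whatever check
you use (for instance one based on loadcheck), and the 60 second
back-off is an arbitrary choice.

```python
import time

RESCHEDULE = 99  # exit status that asks Grid Engine to reschedule the job

def prolog_status(memory_is_tight, backoff_seconds=60, sleep=time.sleep):
    """Return the exit status a prolog could use.

    memory_is_tight is a hypothetical zero-argument callable that returns
    True while the execution node is short on memory.
    """
    if not memory_is_tight():
        return 0  # enough memory: let the job start
    sleep(backoff_seconds)  # sleep first so rescheduling does not spin
    return RESCHEDULE
```

In a real prolog the returned value would be handed to sys.exit().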

Regards,
Andreas
