[GE users] resource quota question
tina.friedrich at diamond.ac.uk
Wed May 19 14:52:14 BST 2010
>> Not sure - what are the exact implications? What it says on the box,
>> exclusive access to that queue? So it would suspend all jobs in
>> subordinate queues and give that qsub exclusive queue access? In which
>> case I don't think so. It is not supposed to use all nodes in the queue,
>> just a subset. (I don't want to have a queue for every problem, really;
>> I'm trying to avoid that.)
> Aha, what about this: remove the subordination (hence fill both queues;
> maybe an adjustment to any total slot count is also necessary).
> If I understood you correctly, the endless application won't generate load all the time.
> If it starts to generate load, you could use a suspend_threshold (for the user with
> the endless job) to suspend his job when its load plus that of the normal queue
> exceeds a limit. If he is alone on the machine, his job will continue.
> I think you already have two queues (a normal one and one for the special user) anyway.
Interesting suggestion. Not sure it works, but I would need to try. The
special user's application - the one that just keeps running - does
create load all the time (and always uses all available slots, I
believe), so at the moment no further jobs would be scheduled onto them
anyway (but that could be changed).
What we have is four queues, actually. Bottom, Low, Medium, High. Bottom
subordinate to everything, Low to Medium and High, Medium to High (you
get the picture). Standard requests submit to Medium (which has a time
limit). Low is without a time limit. Bottom is the one for the special
user (I call him my background noise). High can only be used by our data acquisition
software, as this must take precedence in whatever situation. The main
requirement on the cluster is that whenever data is taken and some
special data reduction software is run, this must be run instantly
('real time' data processing), across as much of the cluster as
possible. So High goes and suspends everything else (pretty quickly).
That's also what needed the 'exclusive' most.
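To make the hierarchy concrete, here is a rough sketch of it as a toy model in Python. It is not how SGE evaluates subordination internally (it ignores slot counts and any per-queue threshold), and the function names are made up for illustration; it only encodes the "who suspends whom" relation described above.

```python
# Toy model of the four-queue subordination described above.
# Each queue maps to the set of queues it suspends while it has jobs
# running on a host. Slot thresholds are deliberately ignored.
SUBORDINATES = {
    "high": {"medium", "low", "bottom"},
    "medium": {"low", "bottom"},
    "low": {"bottom"},
    "bottom": set(),
}


def suspended_queues(busy_queues):
    """Return the queues that end up suspended on a host where
    the given queues currently have jobs."""
    suspended = set()
    for queue in busy_queues:
        suspended |= SUBORDINATES[queue]
    return suspended


def running_queues(busy_queues):
    """Return the queues whose jobs actually keep running on that host."""
    return set(busy_queues) - suspended_queues(busy_queues)


if __name__ == "__main__":
    # A Low job lands on a host that runs the background (Bottom) job.
    print(sorted(running_queues({"low", "bottom"})))             # ['low']
    # Data acquisition arrives in High: everything else stops.
    print(sorted(running_queues({"high", "medium", "bottom"})))  # ['high']
```

The first case is the Low/Bottom sharing problem: as long as Bottom is subordinate to Low, the two never run side by side on the same host.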
People are quite happy with that, it seems really. I was just (at the
moment) trying to solve a problem of sharing between low and bottom
queue, so trying to make a user of low 'share' resources with bottom -
so not actually what we should be doing. I like the idea of getting rid
of bottom and putting the 'continuous jobs' user back into low, with
appropriate thresholds set. That would work just fine, I guess. Thanks
for setting my head right!
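As a note to myself, the threshold idea boils down to something like the sketch below. This is only a toy model of the decision, not SGE's actual logic: the limit of 1.5 is a made-up value, the function name is hypothetical, and the strict "greater than" comparison is an assumption. np_load_avg is the per-CPU normalised load value that SGE reports.

```python
def threshold_exceeded(load_values, suspend_thresholds):
    """Return True if any reported load value is above its configured limit.

    load_values:        load name -> current value on the host
    suspend_thresholds: load name -> limit (hypothetical values)
    """
    return any(
        load_values.get(name, 0.0) > limit
        for name, limit in suspend_thresholds.items()
    )


# Hypothetical limit: suspend once the per-CPU load average goes above 1.5.
thresholds = {"np_load_avg": 1.5}

# Background job alone on the host: load stays below the limit, it keeps running.
alone = {"np_load_avg": 0.9}
# Background job plus normal jobs: combined load crosses the limit.
shared = {"np_load_avg": 1.8}

if __name__ == "__main__":
    print(threshold_exceeded(alone, thresholds))   # False
    print(threshold_exceeded(shared, thresholds))  # True
```

Since the background job creates load all the time, the limit would have to sit above what that job produces on its own, otherwise it would suspend itself even on an otherwise empty host.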
Btw (although off-topic here) - I want to set up a test cluster so I can
test even scheduler changes more freely in the future. Does that really
require a second SGE installation, or is a second cell sufficient (i.e.
is any configuration above cell level)?
Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
Diamond House, Harwell Science and Innovation Campus - 01235 77 8442