[GE users] resource quota question
reuti at staff.uni-marburg.de
Wed May 19 15:17:38 BST 2010
On 19.05.2010 at 15:52, rumpelkeks wrote:
>>> Not sure - what are the exact implications? What it says on the box,
>>> exclusive access to that queue? So it would suspend all jobs in
>>> subordinate queues and give that qsub exclusive queue access? In which
>>> case I don't think so. It is not supposed to use all nodes in the queue,
>>> just a subset. (I don't want to have a queue for every problem, really;
>>> I'm trying to avoid that.)
>> Aha, what about this: remove the subordination (and hence fill both queues;
>> an adjustment to any total slot count may also be necessary).
>> If I understood you correctly, the endless application won't generate load all the time.
>> If it starts to generate load, you could set `suspend_thresholds` (on the queue of the user with
>> the endless job) so that his job is suspended when its load plus that of the normal queue
>> exceeds a limit. If he is alone on the machine, his job will continue.
>> I think you already have two queues (a normal one and one for the special user) anyway.
> Interesting suggestion. Not sure it works, but would need to try. The
> special user's application - the one that just keeps running - does
> create load all the time (and always uses all available slots, I
> believe), so at the moment no further jobs would be scheduled onto them
> anyway (but that could be changed).
> What we have is four queues, actually. Bottom, Low, Medium, High. Bottom
> subordinate to everything, Low to Medium and High, Medium to High (you
> get the picture). Standard requests submit to Medium (which has a time limit).
> Low is without a time limit. Bottom is the one for the special user (I call
> him my background noise). High can only be used by our data acquisition
> software, as this must take precedence in whatever situation. The main
> requirement on the cluster is that whenever data is taken and some
> special data reduction software is run, this must be run instantly
> ('real time' data processing), across as much of the cluster as
> possible. So High goes and suspends everything else (pretty quickly).
> That's also what needed the 'exclusive' most.
> People are quite happy with that, it seems really. I was just (at the
> moment) trying to solve a problem of sharing between low and bottom
> queue, so trying to make a user of low 'share' resources with bottom -
> so not actually what we should be doing. I like the idea of getting rid
> of bottom and putting the 'continuous jobs' user back into low, with
> appropriate thresholds set. That would work just fine, I guess. Thanks
> for setting my head right!
> Btw (although off-topic here) - I want to set up a test cluster so I can
> test even scheduler changes more freely in the future. Does that really
> require a second SGE installation, or is a second cell sufficient (i.e.
> is any configuration above cell level)?
A new cell is sufficient. The cells share one SGE installation, but each cell knows nothing about the others and uses different ports for communication. Hence you will need to source the correct `settings.sh` from the cell you want to use for your SGE commands.
I even have two older machines set aside just for a mini cluster to test things.
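As a rough sketch of switching between cells in one shell session: the helper below sources the `settings.sh` of a named cell. The function name `use_cell` is made up for illustration, and the path layout `$SGE_ROOT/<cell>/common/settings.sh` is an assumption based on a standard installation; adjust it if your site differs.

```shell
# Hypothetical helper (not part of SGE): point the current shell at one cell.
# Assumes SGE_ROOT is the shared installation directory and that each cell
# keeps its settings file at $SGE_ROOT/<cell>/common/settings.sh.
use_cell() {
    uc_cell="$1"
    if [ -z "${SGE_ROOT:-}" ]; then
        echo "SGE_ROOT is not set" >&2
        return 1
    fi
    uc_settings="$SGE_ROOT/$uc_cell/common/settings.sh"
    if [ ! -r "$uc_settings" ]; then
        echo "no readable settings file for cell '$uc_cell': $uc_settings" >&2
        return 1
    fi
    # Source (not execute) the file so its exports land in this shell.
    . "$uc_settings"
}
```

With a test cell created next to the production one (the cell name `test` is only an example), `use_cell test` followed by the usual `qconf` or `qstat` commands would then talk to the test cell's qmaster, and `use_cell default` would switch back.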
> Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
> Diamond House, Harwell Science and Innovation Campus - 01235 77 8442
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].