[GE users] resource quota question

reuti reuti at staff.uni-marburg.de
Wed May 19 15:17:38 BST 2010


On 19.05.2010 at 15:52, rumpelkeks wrote:

> Hi,
> <snip>
>>> Not sure - what are the exact implications? What it says on the box, 
>>> exclusive access to that queue? So it would suspend all jobs in 
>>> subordinate queues and give that qsub exclusive queue access? In which 
>>> case I don't think so. It is not supposed to use all nodes in the queue, 
>>> just a subset. (I don't want to have a queue for every problem, really; 
>>> I'm trying to avoid that.)
>> Aha, what about this: remove the subordination, so that both queues fill up 
>> (an adjustment to any total slot count may also be necessary).
>> If I understood you correctly, the endless application won't generate load all the time. 
>> If it starts to generate load, you could use a suspend_threshold (for the user with 
>> the endless job) so that his job gets suspended when its load plus that of the normal queue 
>> exceeds a limit. If he is alone on the machine, his job will continue.
>> I think you already have two queues (a normal one and one for the special user) anyway.
> </snip>
> Interesting suggestion. Not sure it works; I would need to try. The 
> special user's application - the one that just keeps running - does 
> create load all the time (and always uses all available slots, I 
> believe, so at the moment no further jobs would be scheduled onto them 
> anyway, but that could be changed).
> What we have is four queues, actually. Bottom, Low, Medium, High. Bottom 
> subordinate to everything, Low to Medium and High, Medium to High (you 
> get the picture). Standard request submits to medium (got a time limit). 
> Low is without a time limit. Bottom the one for the special user (I call 
> him my background noise). High can only be used by our data acquisition 
> software, as this must take precedence in whatever situation. The main 
> requirement on the cluster is that whenever data is taken and some 
> special data reduction software is run, this must be run instantly 
> ('real time' data processing), across as much of the cluster as 
> possible. So High goes and suspends everything else (pretty quickly). 
> That's also what needed the 'exclusive' most.
> People are quite happy with that, it seems really. I was just (at the 
> moment) trying to solve a problem of sharing between low and bottom 
> queue, so trying to make a user of low 'share' resources with bottom - 
> so not actually what we should be doing. I like the idea of getting rid 
> of bottom and putting the 'continuous jobs' user back into low, with 
> appropriate thresholds set. That would work just fine, I guess. Thanks 
> for setting my head right!
> Btw (although off-topic here) - I want to set up a test cluster so I can 
> test even scheduler changes more freely in the future. Does that really 
> require a second SGE installation, or is a second cell sufficient (i.e. 
> is any configuration above cell level)?

A new cell is sufficient. Both cells share one SGE installation, but they don't know anything about each other and use different ports for communication. Hence you will need to source the correct `settings.sh` from the cell you want to use for your SGE commands.
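
As a rough sketch of switching between cells from one shell: the settings file of a cell normally lives at `$SGE_ROOT/<cell>/common/settings.sh`. The helper names `sge_settings_path` and `use_cell`, the fallback root `/opt/sge`, and the cell name `test` are only assumptions for illustration; adjust them to your installation.

```shell
# Sketch only: the fallback SGE_ROOT and the cell names are assumptions.

# Build the path to a cell's settings file:
#   <sge_root>/<cell>/common/settings.sh
sge_settings_path() {
    printf '%s/%s/common/settings.sh\n' "$1" "$2"
}

# Source the settings of the given cell into the current shell,
# so that subsequent SGE commands talk to that cell's qmaster.
use_cell() {
    settings=$(sge_settings_path "${SGE_ROOT:-/opt/sge}" "$1")
    if [ -r "$settings" ]; then
        . "$settings"
    else
        echo "no readable settings file at $settings" >&2
        return 1
    fi
}
```

With this, something like `use_cell test && qstat -f` would query the test cell, and `use_cell default` would switch back to the production one.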

I even have two older machines set aside just for a mini cluster to test things.

-- Reuti

> -- 
> Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
> Diamond House, Harwell Science and Innovation Campus - 01235 77 8442
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=257861
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

