[GE users] resource quota question

reuti reuti at staff.uni-marburg.de
Wed May 19 14:28:36 BST 2010


On 19.05.2010, at 12:07, rumpelkeks wrote:

> Hi,
> 
> reuti wrote:
>> On 19.05.2010, at 10:37, rumpelkeks wrote:
>> 
>>>>> <snip>
>>>>> the queues.
>>>> but they still request a PE for their jobs? Why do they assume that the nodes are blocked? They are just being used.
>>> I know (and they know); it's just that, to them, they are blocked. Yes, 
>>> there's a PE "smp" that they request (with as many slots as the node has 
>>> CPUs). (That is also, btw, how we do 'exclusive node access' - because the 
>>> exclusive complex doesn't suspend subordinate queues, and we need that.)
>> 
>> it's possible to attach the exclusive complex to queues instead of the exechost, i.e. exclusive queue access. Does it help for your setup?
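
For illustration, that would look roughly like this (the queue name is made
up, and please check complex(5) for the exact columns on your version - this
is the usual boolean consumable with the EXCL relop):

    # qconf -mc  -- define the "exclusive" complex, if it isn't there already
    #name       shortcut  type  relop  requestable  consumable  default  urgency
    exclusive   excl      BOOL  EXCL   YES          YES         0        1000

    # qconf -mq special.q  -- attach it to the queue instead of the exec host
    complex_values        exclusive=true

    # a job then gets the queue instance to itself
    qsub -l exclusive=true ...
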
> 
> Not sure - what are the exact implications? What it says on the box, 
> exclusive access to that queue? So it would suspend all jobs in 
> subordinate queues and give that qsub exclusive queue access? In which 
> case I don't think it would help: it is not supposed to use all the nodes 
> in the queue, just a subset. (I don't want to have a queue for every 
> problem, really; I'm trying to avoid that.)

Aha, what about this: remove the subordination and hence fill both queues (an adjustment of the total slot count may also be necessary). If I understood you correctly, the endless application won't be generating load the whole time. If it does start to generate load, you could use a suspend_threshold on his queue to suspend it when his load plus that of the normal queue exceeds a limit. If he is alone on the machine, his job will keep running.
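
A minimal sketch of what I mean, with made-up numbers (qconf -mq on his
queue, see queue_conf(5)):

    suspend_thresholds    np_load_avg=1.10
    nsuspend              1
    suspend_interval      00:05:00

With something like this his job should get suspended once the normalized
load on the host stays above the threshold (i.e. the batch jobs are busy
there as well), and it keeps running as long as he is alone on the machine.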

I think you already have two queues (a normal one and one for the special user) anyway.
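
And since the subject was the quota: the slot limit for the batch user would
stay in place, of course - presumably something along these lines (rule name,
user name and the number are placeholders):

    # qconf -arqs batch_cap
    {
       name         batch_cap
       description  "keep the batch jobs from filling the whole cluster"
       enabled      TRUE
       limit        users {batch_user} to slots=64
    }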

-- Reuti


> Tina
> 
>> -- Reuti
>> 
>> 
>>>>> So, I assume I could get around this by setting the scheduler policy to 
>>>>> "fill up", but I am not sure that we really want this (across the whole 
>>>>> cluster, that is).
>>>> 
>>>> 
>>>> With "fill up" you mean this:
>>>> 
>>>> http://blogs.sun.com/sgrell/entry/grid_engine_scheduler_hacks_least ?
>>> Yes.
>>> 
>>> Maybe if I describe what the problem is it helps.
>>> 
>>> What I have is (for this particular problem) two users. One's running a 
>>> whole bunch of standalone single-CPU 'batch' jobs. The other has some 
>>> software that requires threading (can't do MPI) - and his jobs run 
>>> continuously. Meaning it's not that he's got loads of them, it's that 
>>> every single one just never stops. Because they never stop, he's got his 
>>> own queue that is subordinate to all the others (otherwise no one else 
>>> would ever get to run anything).
>>> 
>>> So, basically, if the guy with the batch jobs comes in with a bunch 
>>> (they run for a couple of days each), the other guy's jobs stop producing 
>>> data. And after a couple of days, he starts complaining. So the two of 
>>> them have asked if I can 'do some magic' so the batch jobs don't take up 
>>> the whole of the cluster... and I thought, well, a quota'd be easy. 
>>> Which it was. Only it doesn't help, because the batch jobs still use all 
>>> the nodes. So I'm trying to find a way around this. (Which cannot involve 
>>> changing the scheduler or global config, really. The earliest I could do 
>>> that is in mid-June.)
>>> 
>>> Tina
>>> 
>>> -- 
>>> Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
>>> Diamond House, Harwell Science and Innovation Campus - 01235 77 8442
>>> 
>> 
> 
> 
> -- 
> Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
> Diamond House, Harwell Science and Innovation Campus - 01235 77 8442
> 
