[GE users] Memory quotas

reuti reuti at staff.uni-marburg.de
Tue Feb 10 12:11:00 GMT 2009

Am 10.02.2009 um 12:09 schrieb davidecittaro:

> On Feb 10, 2009, at 11:58 AM, reuti wrote:
>> the difference is that h_vmem is enforced, while virtual_free is
>> just guidance for SGE, and works only as long as the users are fair
>> and know what their jobs consume. With h_vmem some jobs might crash
>> when they need just one byte more, but over time these requests can
>> be adjusted.


> Ok, I see... and should it be set as a complex value in the
> execution host configuration?

this can be set either per exechost as you suggest, or in another RQS:

limit hosts {*} to h_vmem=64GB
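
A complete rule set along these lines (the name and description here
are illustrative, following the same `qconf -srqs` format shown below)
might look like:

```
{
   name         HostMemoryLimit
   description  Cap requested h_vmem at 64 GB per execution host
   enabled      TRUE
   limit        hosts {*} to h_vmem=64G
}
```

Note that such a rule only caps what can be requested per host; as
discussed further down, h_vmem is enforced against the per-job request.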

>>> $ qconf -srqs MemoryQuota
>>> {
>>>    name         MemoryQuota
>>>    description  Memory quota for users. Nobody can use more than 30
>>> Gb RAM
>>>    enabled      TRUE
>>>    limit        users {*}  to virtual_free=30G
>>> }
>> But this is per user, not per host or job (i.e. slot). You have one
>> big SMP machine or a cluster of nodes?
> Yes, this is the case. There are only a couple of users (including  
> me) that really use a lot of memory. We have 64 Gb RAM nodes (and  
> 24 processors each)....
> Talking about this, I've noticed that if a user starts a qlogin  
> session, qquota says he is using 5 Gb RAM (I'm using a  
> qlogin_wrapper to sshd). I wonder why this is happening...

Is there any default defined in `qconf -sc`?
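
If virtual_free has a non-zero default in the complex configuration,
every job, including a qlogin session, is charged that amount even
without an explicit -l request. A hypothetical line from `qconf -sc`
that would produce exactly the observed 5 Gb accounting:

```
#name          shortcut  type    relop requestable consumable default  urgency
virtual_free   vf        MEMORY  <=    YES         YES        5G       0
```

Setting the default back to 0 (or requesting virtual_free explicitly
in the qlogin wrapper) would make qquota reflect actual requests only.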

>>> This, at least, allows me to keep jobs in qw state until the  
>>> quota is
>>> exceeded. Good. Also, since virtual_free is a load sensor it is
>>> reported to the quota even if it is not requested. Plus, if a user
>>> specifies -l virtual_free=X, his remaining quota is lowered by X.
>>> This seems to be a fair solution but I have some issues I suspect  
>>> are
>>> not easy to solve:
>>> - How can I handle users that run jobs that exceed quota while they
>>> are running? I mean, if an user submits a job that at a certain  
>>> point
>>> allocates for 50 Gb, it drains lot of the memory available
>> Use h_vmem and these jobs will be killed. And you could specify
>> h_vmem as FORCED in the consumable configuration and/or set a high
>> default value (which users could lower). If you implement this, it
>> could be useful to enable reservation in the scheduler and request
>> reservation with "-R y" in your qsub request.
> Just a doubt: suppose I set h_vmem to 40 Gb. Will a job only be  
> killed if it goes over 40 Gb, or will every job be killed once a  
> total of 40 Gb is reached?

h_vmem limited in an RQS won't be enforced. It has to be requested  
per job (or the default from `qconf -mc` applies), and it is this  
individual per-job resource request that gets enforced.
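
In other words, the per-job request is the enforced ceiling (the job
script name here is illustrative):

```
# Request 4 Gb of virtual memory for this job; SGE kills the job if
# its own usage exceeds 4 Gb, regardless of any RQS totals.
# -R y requests reservation, so a large memory request is not
# starved indefinitely by a stream of smaller jobs.
qsub -l h_vmem=4G -R y myjob.sh
```

Other running jobs are unaffected when one job hits its own limit.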

-- Reuti

>>> - I cannot set a suspend threshold for memory, as the memory
>>> referenced by a process can't be lowered while it is running (can
>>> it?)
>> Correct. Also, suspended jobs still occupy the resources they were
>> granted.
> Yes, of course, but at least the load decreases :-) (I have suspend  
> thresholds only for load sensors)
> thanks again
> d
> Davide Cittaro
> davide.cittaro at ifom-ieo-campus.it

