[GE users] Memory quotas

reuti reuti at staff.uni-marburg.de
Wed Feb 11 13:03:58 GMT 2009


Am 10.02.2009 um 12:09 schrieb davidecittaro:

> Hi Reuti
>
> On Feb 10, 2009, at 11:58 AM, reuti wrote:
>>
>>
>> the difference is that h_vmem is enforced, while virtual_free is
>> just guidance for SGE, which works as long as the users are fair
>> and know what their jobs consume. With h_vmem some jobs might crash
>> when they need just one byte more, but over time these requests can
>> be adjusted.
>>
>
> Ok, I see... and should it be set as a complex value in the  
> execution host configuration?
>
>>>
>>> $ qconf -srqs MemoryQuota
>>> {
>>>    name         MemoryQuota
>>>    description  Memory quota for users. Nobody can use more than 30
>>> Gb RAM
>>>    enabled      TRUE
>>>    limit        users {*}  to virtual_free=30G
>>> }
>>
>> But this is per user, not per host or job (i.e. slot). Do you have
>> one big SMP machine or a cluster of nodes?
>>
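For reference, a resource quota rule can also be scoped per host rather than only per user. A sketch with illustrative values, assuming the same MemoryQuota set as above:

   limit        users {*} hosts {*} to virtual_free=30G

With {*} on hosts, the limit applies to each user on each host separately, instead of 30G in total across the whole cluster.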
>
> Yes, this is the case. There are only a couple of users (including  
> me) who really use a lot of memory. We have 64 GB RAM nodes (with  
> 24 processors each)....
> Talking about this, I've noticed that if a user runs qlogin,  
> qquota says that he is using 5 GB RAM (I'm using qlogin_wrapper to  
> invoke sshd). I wonder why this is happening...

Just out of curiosity: did you solve this 5 GB issue for qlogin? For me,  
the default from the complex definition is also applied to this command.
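For background, the value charged to a job that requests nothing comes from the "default" column of the complex definition (editable via qconf -mc). A sketch with illustrative values:

   #name         shortcut  type    relop  requestable  consumable  default  urgency
   virtual_free  vf        MEMORY  <=     YES          YES         5G       0

With such a definition, any job, including an interactive qlogin session, that does not specify -l virtual_free explicitly is debited the 5G default against the quota, which would explain the observation above.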

-- Reuti

>
>
>>> This, at least, allows me to keep jobs in the qw state while the
>>> quota would be exceeded. Good. Also, since virtual_free is a load
>>> sensor, it is reported to the quota even if it is not requested.
>>> Plus, if a user specifies -l virtual_free=X, his remaining quota is
>>> lowered by X. This seems to be a fair solution, but I have some
>>> issues I suspect are not easy to solve:
>>> - How can I handle users whose jobs exceed the quota while they
>>> are running? I mean, if a user submits a job that at a certain
>>> point allocates 50 GB, it drains a lot of the available memory
>>
>> Use h_vmem and these jobs will be killed. And you could specify
>> h_vmem as FORCED in the consumable configuration and/or set a high
>> default value (which users could lower). If you implement this, it
>> could be useful to enable reservation in the scheduler and request
>> a reservation with "-R y" in your qsub request.
>>
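A sketch of that setup, with illustrative values: make h_vmem a consumable in the complex definition (qconf -mc), set the per-host capacity in the execution host configuration (qconf -me), and submit with reservation:

   #name    shortcut  type    relop  requestable  consumable  default  urgency
   h_vmem   h_vmem    MEMORY  <=     FORCED       YES         4G       0

   # per-host capacity, in the execution host configuration:
   complex_values  h_vmem=64G

   # submission with an explicit hard limit and a reservation:
   qsub -R y -l h_vmem=40G job.sh

Note that with requestable set to FORCED every job must request h_vmem explicitly; alternatively, leave it at YES and rely on the default value being charged to jobs that request nothing.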
>
> Just a doubt: suppose I set h_vmem to 40 GB. Will only that job be  
> killed if it goes over 40 GB, or is every job killed once the 40 GB  
> limit is reached?
>
>
>>> - I cannot set a suspend threshold for memory, as the memory
>>> referenced by a process can't be lowered while it is running (can
>>> it?)
>>
>> Correct. Also, suspended jobs still occupy the resources they were
>> granted.
>
> Yes, of course, but at least the load decreases :-) (I have suspend  
> thresholds only for load sensors)
>
> thanks again
>
> d
>
> Davide Cittaro
> davide.cittaro at ifom-ieo-campus.it
>
>
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=103420
