[GE users] Is there a way to reserve the memory for the process inste

Reuti reuti at staff.uni-marburg.de
Thu Apr 10 13:13:57 BST 2008


Hi,

On 10.04.2008 at 13:33, Pacey, Mike wrote:
> Missed this the first time round - I'm the author of the web page you're
> citing, and I implemented mem_tokens to work around the same problems as
> you're seeing. In my case, I have users who submit large job sets with
> large memory requirements.
>
> The mem_token complex is defined like this:
>
> #name      shortcut   type    relop  requestable  consumable  default  urgency
> mem_token  mem_token  MEMORY  <=     YES          YES         0        0
>
> A number of memory tokens equivalent to an exec host's physical memory
> is then added to each exec host entry:
>
> complex_values        slots=4,mem_token=8G
>
> The system isn't perfect: it relies on users diligently submitting valid
> memory resource requests, and as users aren't required to use mem_token,
> large memory jobs need to be submitted with both mem_token and mem_free
> requests. Ideally, I'd like some way to set each host's job limit to a
> default value which can then be overridden from qsub, and to have the
> job abort if this limit is breached. It would seriously cut down on the
> number of memory oversubscription problems I see on our cluster.

Then use a) from my reply to the original question. When h_vmem is
requested, a job that exceeds it will be killed. You can set a default
value in the complex configuration (and make h_vmem consumable there),
and a maximum that a user can request per job in the queue definition.
The value set in the exechost definition is the memory installed in the
machine, so that SGE's computation of the remaining memory is reliable.
I.e., you have to set it in three locations: 1) complex definition: the
default per job, 2) queue definition: the maximum per job, and
3) exechost definition: the total on a node.
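
As a rough sketch of those three locations: the values 1G and 4G, the
queue name all.q and the host name node01 are only placeholders, while
slots=4 and 8G are taken from the quoted example above. Adjust all of
them to your own cluster.

```
# 1) qconf -mc   complex definition: make h_vmem consumable and
#                give it a default per job (1G is an assumed value)
#name    shortcut  type    relop  requestable  consumable  default  urgency
h_vmem   h_vmem    MEMORY  <=     YES          YES         1G       0

# 2) qconf -mq all.q   queue definition: maximum per job
#                      (4G is an assumed value)
h_vmem                4G

# 3) qconf -me node01  exechost definition: memory installed in the node
complex_values        slots=4,h_vmem=8G
```

A job that needs more than the default would then request it
explicitly, e.g. `qsub -l h_vmem=2G job.sh` (job.sh being a
hypothetical script name).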

-- Reuti


>
> Regards,
> Mike.
>
>> -----Original Message-----
>> From: Mulley, Nikhil [mailto:Nikhil.Mulley at deshaw.com]
>> Sent: 09 April 2008 16:56
>> To: users at gridengine.sunsource.net
>> Subject: RE: [GE users] Is there a way to reserve the memory for the
>> process inste
>>
>>>> We actually were so far suggesting to use 'mem_free' resource attribute
>>>> that would ensure that the machines has enough memory before it gets
>>>> launched, but might not immediately be taking the mentioned memory.
>>
>> Please read it as: so far we have been suggesting that users rely on the
>> 'mem_free' attribute, which assures them that their job is launched on a
>> machine which meets the mem_free value. The caveat is that the launched
>> process might not take all of the requested memory immediately.
>>
>> -----Original Message-----
>> From: Mulley, Nikhil
>> Sent: Wednesday, April 09, 2008 9:21 PM
>> To: users at gridengine.sunsource.net
>> Subject: [GE users] Is there a way to reserve the memory for the process
>> inste
>>
>> SGE is v6.0.u11. One of our users asked us recently whether memory can
>> be reserved for a process run via SGE, so that the memory, even if not
>> yet used by the process, is allocated to it from SGE's point of view.
>> This made us think of advance reservations, but given the version we
>> are running we were not sure whether anything like that could be done.
>> So far we have been suggesting the 'mem_free' resource attribute, which
>> ensures that the machine has enough memory before the job gets
>> launched, but the job might not immediately take the requested memory.
>> Sometimes another process that also satisfies the mem_free check gets
>> scheduled on the same machine and immediately takes up the memory it
>> requested.
>> So, is there a way to reserve the memory, instead of just checking for
>> free memory?
>>
>> He has even come up with this link, where a 'mem_token' resource
>> attribute is discussed, but I am not sure at the moment how to go
>> about it and what it takes to get this complex added.
>>
>> http://www.lancs.ac.uk/iss/hpc/advanced_jobs.html
>>
>> Thanks,
>> Nikhil
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>
>

