[GE users] h_vmem as consumable and requesting too little memory

seb stark at tuebingen.mpg.de
Thu Oct 28 13:48:30 BST 2010


On 27.10.2010 at 00:54, reuti wrote:

> On 26.10.2010 at 14:42, seb <stark at tuebingen.mpg.de> wrote:
> 
>> We use h_vmem as a consumable with a 1G default, and we set an
>> upper h_vmem limit per host. Usually this makes SGE behave the way
>> we want. However, there is a problem with a job like the following:
>> 
>> echo : | qsub -l h_vmem=1 -r y
>> 
>> What this job does is:
>> 
>> 1) get scheduled quickly because of low memory request
>> 
>> 2) set h_vmem limit to 1 byte (or is it 1k?)
> 
> Yep, the limit will be set to one byte.
> 
> 
>> 3) produce error message "can't set additional group id (uid=0,  
>> euid=0): Cannot allocate memory" (I think because it cannot even  
>> start sge_shepherd)
>> 
>> 4) make queue go into error state
>> 
>> 5) get rescheduled
>> 
>> Very quickly all queues go into error state.
>> 
>> Any idea how to prevent this from happening?
> 
> Does it also happen when you don't specify "-r y"?

Good point, but it also happens when I specify "-r n".
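
For reference, our setup is roughly the following (the host name and
the 32G per-host value are illustrative, not our real numbers):

# excerpt of "qconf -sc":
#name    shortcut  type    relop  requestable  consumable  default  urgency
h_vmem   h_vmem    MEMORY  <=     YES          YES         1G       0

# excerpt of "qconf -se node01":
complex_values        h_vmem=32G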

>> We use SGE 6.2 beta2 and are willing to upgrade if that solves our
>> problem, although I couldn't find anything related to this in the
>> changelog.
>> 
>> Is it possible to require the user to request a _minimum_ of a  
>> resource?
> 
> There is an RFE (request for enhancement) to have it as an option in
> an RQS (resource quota set) definition. For now the solution would be
> to check the requested limit in a JSV (job submission verification)
> script and adjust it if necessary.


It sounds like JSV could help, but it seems I will need to upgrade
first, since our 6.2 beta2 predates JSV support...
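
In case it is useful to anyone else, here is an untested sketch of a
JSV script along those lines, following the pattern of the jsv.sh
example that ships with 6.2u2 and later. The 100M minimum is just an
example value:

#!/bin/sh
# Untested sketch: enforce a minimum hard h_vmem request in a JSV
# script. Assumes SGE 6.2u2 or later, where jsv_include.sh ships in
# $SGE_ROOT/util/resources/jsv. The 100M minimum is just an example,
# and fractional requests such as 1.5G are not handled.

jsv_on_start()
{
   return
}

jsv_on_verify()
{
   if [ "`jsv_sub_is_param l_hard h_vmem`" = "true" ]; then
      mem=`jsv_sub_get_param l_hard h_vmem`
      # Convert the request to bytes; the lowercase (1000-based)
      # suffixes are treated like the 1024-based ones for simplicity.
      case "$mem" in
         *[kK]) bytes=`expr "${mem%?}" \* 1024` ;;
         *[mM]) bytes=`expr "${mem%?}" \* 1048576` ;;
         *[gG]) bytes=`expr "${mem%?}" \* 1073741824` ;;
         *)     bytes="$mem" ;;
      esac
      if [ "$bytes" -lt 104857600 ]; then
         # Raise too-small requests instead of rejecting the job.
         jsv_sub_add_param l_hard h_vmem 100M
         jsv_correct "h_vmem raised to the 100M minimum"
         return
      fi
   fi
   jsv_accept "OK"
}

. ${SGE_ROOT}/util/resources/jsv/jsv_include.sh
jsv_main

One could also jsv_reject such jobs with a hint to the user instead of
silently correcting the request.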

Thank you very much!


Sebastian

-- 
http://www.kyb.tuebingen.mpg.de/~stark
Max Planck Institute for Biological Cybernetics
