[GE users] threaded jobs (no PE) and consumable memory

reuti reuti at staff.uni-marburg.de
Thu Jan 14 14:33:39 GMT 2010


On 14.01.2010 at 15:25, txema_heredia wrote:

> Thanks for your answers
>
>> Hi,
>>
>> you might try to set an additional "-l h_stack=20M" (or whatever size
>> you need), as some threaded applications allocate the whole stack space
>> (which in older versions of SGE is set equal to the h_vmem value - in
>> newer versions it is left as "unlimited", therefore avoiding this
>> problem).
>>
>> Hope it helps,
>> Sabine
>
> I have tried it alone (-l h_stack=4G), and the program runs OK. The
> problem is that this attribute is not restrictive at all: I also tried
> -l h_stack=50M, but the job was still able to use 4G without receiving
> any termination signal (unlike with -l h_vmem=50M, which aborts the job
> when the limit is exceeded).
>
> If I combine -l h_stack and -l h_vmem, h_vmem takes precedence and the
> job is killed as usual (the malloc error and segmentation violation).

Did you also try it with different settings, i.e. -l h_stack=50M,h_vmem=4G ... ?

-- Reuti
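
For reference, the combined request being suggested here would be
submitted roughly like this (only a sketch; blast_job.sh is a
placeholder name and the sizes are just the values discussed above):

    qsub -l h_vmem=4G,h_stack=50M blast_job.sh

Following Sabine's explanation above, the idea is that h_vmem still caps
the job's total virtual memory at 4G while h_stack keeps the per-thread
stack reservation small, so thread creation does not try to grab a stack
as large as the whole vmem limit.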


>> Hi,
>>
>> On 13.01.2010 at 18:31, txema_heredia wrote:
>>
>>> Hi all,
>>>
>>> I've found a problem in SGE 6.1u4 regarding threaded jobs (without
>>> using a PE) when submitted to a host with an h_vmem request:
>>>
>>> I want to run "blastall -a 8" in my cluster (the -a option lets the
>>> process use N threads for its analysis; it doesn't require a
>>> parallel environment, it just uses libpthread.so.0).
>>
>> a PE does not provide any parallelization functionality by itself. It
>> just tells SGE that this is a parallel job and makes whatever
>> preparations the parallel library used by your job needs. If you
>> submit a serial job and then use threads for the parallel tasks, SGE
>> will overload a node by putting too many jobs on it.
>>
>> You will need a PE, often called 'smp'; keep the default settings
>> when you define it and then attach it to a queue. Note that this will
>> also multiply the resource request by the number of slots, so it may
>> be necessary to submit with a lower value, as the request is meant as
>> consumption per task.
>>
>> --Reuti
>>
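
For reference, the PE setup described above might look roughly like
this (only a sketch, assuming more or less default settings; the queue
name all.q and the script name blast_job.sh are placeholders):

    # create the PE with "qconf -ap smp"; the important setting for
    # threaded jobs is allocation_rule, which keeps all slots on one host
    pe_name            smp
    slots              999
    allocation_rule    $pe_slots
    # (remaining fields left at their defaults)

    # attach it to a queue by adding "smp" to the queue's pe_list
    qconf -mq all.q

    # request 8 slots; h_vmem is then counted per slot,
    # so this grants 8 x 1G in total on the node
    qsub -pe smp 8 -l h_vmem=1G blast_job.sh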
>
> I created the smp PE as you said, and it works well by itself, but
> when combined with -l h_vmem the problem remains as before: the failed
> malloc is still there, but its size is multiplied by the number given
> to -pe:
>
>
> -pe smp 4
> -l h_vmem=1G
> 4 threads
>
> mmap(NULL, 4294971392, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x40, -1, 0) = -1 ENOMEM (Cannot allocate memory)
>
> 4294971392 = 4 * 1G
>
> ---------------------------------------------------------------------------------------------
>
> -pe smp 8
> -l h_vmem=1G
> 8 threads
>
> mmap(NULL, 8589938688, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x40, -1, 0) = -1 ENOMEM (Cannot allocate memory)
>
> 8589938688 = 8 * 1G
>
> ---------------------------------------------------------------------------------------------
>
> -pe smp 8
> -l h_vmem=5G
> 8 threads
>
> mmap(NULL, 42949677056, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x40, -1, 0) = -1 ENOMEM (Cannot allocate memory)
>
> 42949677056 = 8 * 5G
>
>
> So the problem is that the program tries to allocate the maximum
> available memory before creating its threads whenever a memory request
> (h_vmem, s_vmem) is given.
>
> I have tried to replicate this behaviour using ulimit -v, but then the
> program worked correctly. What does h_vmem really do to the job, or
> even to Linux?
>
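
If I read queue_conf(5) correctly, s_vmem/h_vmem are imposed on the
job's processes via setrlimit(2) as the virtual-memory (address-space)
limit, and with a PE the per-slot value is multiplied by the number of
granted slots. One way to see what a job actually gets is to print the
limits from inside a small test job (a sketch; limits.sh is a
hypothetical script name and the requested values are only examples):

    #!/bin/sh
    # limits.sh - print the limits the execution daemon imposed on the job
    ulimit -v    # virtual memory / address-space limit in KB (from h_vmem)
    ulimit -s    # stack size limit in KB (from h_stack)

and submit it with, say, "qsub -pe smp 4 -l h_vmem=1G,h_stack=50M
limits.sh". If the stack explanation above is right, reproducing the
failure outside of SGE would probably require lowering the stack limit
as well (ulimit -s together with ulimit -v), not ulimit -v alone.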

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=238772

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


