[GE users] Grid Engine vs. Java Jobs

Iwona Sakrejda isakrejda at lbl.gov
Wed Nov 30 23:46:54 GMT 2005


I sent the test code in a separate e-mail.

I thought the problem was that an application is killed by the SGE v_mem
memory limit, because SGE overcounts memory consumption by threads.

Could you explain why I should use a parallel environment for a threaded
application that performs reasonably well on one node? The only problem
I encounter is that SGE kills it, because SGE thinks the application is using
more memory than it in fact does.
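
To make the suspected overcounting concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the numbers are made up, the `Proc` record and its `group` field are hypothetical (the kernel in question may not expose thread-group membership at all), and it is a guess that SGE sums per-entry sizes this way.

```python
from collections import namedtuple

# Hypothetical per-process record. "group" stands in for whatever would
# identify a thread group; this is an assumption, not a real kernel field.
Proc = namedtuple("Proc", ["pid", "group", "vsize_mb"])


def naive_total(procs):
    """Sum the reported size of every entry (the guessed SGE behaviour)."""
    return sum(p.vsize_mb for p in procs)


def grouped_total(procs):
    """Count each thread group once, using the largest size reported in it."""
    per_group = {}
    for p in procs:
        per_group[p.group] = max(per_group.get(p.group, 0), p.vsize_mb)
    return sum(per_group.values())


# Made-up numbers: one application whose 4 threads each show up as a separate
# entry reporting the same 300 MB, plus an unrelated 50 MB helper process.
procs = [Proc(100 + i, "app", 300) for i in range(4)] + [Proc(200, "helper", 50)]

print(naive_total(procs))    # 1250
print(grouped_total(procs))  # 350
```

With a 1 GB (1024 MB) limit, the naive sum would get the job killed, while the per-group count stays well below the limit.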

Thanks a lot,

Iwona

Reuti wrote:
> Hi,
> 
> On 30.11.2005 at 23:29, Iwona Sakrejda wrote:
> 
>> Hi,
>>
>> I read diligently through this thread and I wanted to add that we have
>> kernel version 2.4.21-27.0.2 and a system version similar to the one
>> reported in this thread (SL 3), SGE 6.0u4, and we observe memory
>> reporting problems with all threaded applications.
>>
>> One of my colleagues wrote a simple program to test it and here are
>> his conclusions:
>> "I believe that processes are what is schedulable under linux and
>> each thread is reported as a separate process under this linux
>> version. However, linux is also reporting the total memory usage
>> across all threads for each thread. That is, after each new thread,
>> the memory usage reported for each thread increases by the user's
>> stack size ulimit. So, summing across all threads, which is what I'm
>> guessing SGE is doing, overcounts. The right thing to do is to
>> determine which processes are actually part of a thread group, and
>> count memory once. I don't know whether or not this info is
>> available from the kernel you're running but I'd think it's worth a
>> query to the SGE people in any case to see what, if anything, they can
>> do about it."
> 
> 
> I upgraded all our machines to SuSE 9.3, and tested with a small C
> program using threads - and there is only one entry in the list of
> processes. Can you please attach/paste the small program you used? Then
> I could check this on 9.3.
> 
>> So could anything be done about this problem? I did not see any
>> resolution of the e-mail discussion...
>>
>> For now I advise users having threaded applications to ask for huge
>> amounts of memory, if they know what they are doing, to override my
>> 1GB default, but that's awkward and dangerous...
> 
> 
> Shouldn't threaded applications be submitted to a PE anyway, so that SGE
> will multiply the values on its own? I was more concerned about the
> opposite effect (getting too high limits for threaded applications):
> 
> http://gridengine.sunsource.net/issues/show_bug.cgi?id=1254
> 
> -- Reuti
> 
>>
>> Thank you...
>>
>> Iwona
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>> For additional commands, e-mail: users-help at gridengine.sunsource.net





