[GE users] Grid Engine vs. Java Jobs

Reuti reuti at staff.uni-marburg.de
Thu Dec 1 00:09:50 GMT 2005


On 01.12.2005 at 00:46, Iwona Sakrejda wrote:

> I sent the test code in a separate e-mail.
>
> I thought the problem was that an application is killed by the SGE
> v_mem memory limit, because SGE overcounts memory consumption by
> threads.
>
> Could you explain why I should use a parallel environment for a
> threaded application that performs reasonably well on one node? The
> only problem I encounter is that SGE kills it, because SGE thinks the
> application is using more memory than it in fact does.

Most likely you use threads to run a parallel job on an SMP machine.
If SGE treats it as a serial job, it will count only 1 used slot,
although you might use 4 CPUs out of 32 in a node. So you need some
other way to account for it.

If you use a PE, the count of used slots will be correct: you
request 4 CPUs, and SGE is aware of it.

Details on how to set it up can be found here (although the topic is
different):

http://gridengine.info/articles/2005/09/22/allowing-user-jobs-to-take-over-entire-nodes
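
To make the slot accounting concrete, here is a minimal Python sketch. It assumes, as mentioned in the quoted reply further down, that SGE multiplies a per-slot memory limit by the number of slots granted to a PE job. The function name and the numbers (the 1 GB default and the 4 CPUs from this thread) are purely illustrative and are not part of SGE.

```python
def effective_memory_limit(per_slot_limit_mb: int, slots: int) -> int:
    """Return the job-wide limit under the assumed per-slot multiplication."""
    if slots < 1:
        raise ValueError("a job needs at least one slot")
    return per_slot_limit_mb * slots


DEFAULT_LIMIT_MB = 1024  # the 1 GB default mentioned in this thread

# Submitted as a serial job: one slot, so the limit stays at the default.
serial_limit = effective_memory_limit(DEFAULT_LIMIT_MB, slots=1)

# Submitted to a PE with 4 slots: the limit scales with the slot count.
pe_limit = effective_memory_limit(DEFAULT_LIMIT_MB, slots=4)

print(f"serial job: {serial_limit} MB, 4-slot PE job: {pe_limit} MB")
```

Under that assumption, the threaded job gets four times the headroom without anyone having to request a huge limit by hand.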

Cheers - Reuti

PS: In a way this is the clean approach: it is a parallel application,
so it should request a PE.
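
The overcounting described in the quoted report below can be sketched in Python as follows. This is only an illustration with made-up numbers: the `ProcEntry` record and its fields are hypothetical, and the code does not show how SGE actually collects its usage data.

```python
from collections import namedtuple

# Hypothetical process-table entry: every thread shows up as its own
# "process" and reports the memory of the whole thread group.
ProcEntry = namedtuple("ProcEntry", ["pid", "thread_group", "vsize_mb"])


def naive_total(entries):
    """Sum memory over all entries, counting shared memory once per thread."""
    return sum(e.vsize_mb for e in entries)


def grouped_total(entries):
    """Count memory once per thread group (largest reported value wins)."""
    per_group = {}
    for e in entries:
        current = per_group.get(e.thread_group, 0)
        per_group[e.thread_group] = max(current, e.vsize_mb)
    return sum(per_group.values())


# One application with 4 threads, each reporting the same 300 MB.
entries = [
    ProcEntry(pid=100, thread_group=100, vsize_mb=300),
    ProcEntry(pid=101, thread_group=100, vsize_mb=300),
    ProcEntry(pid=102, thread_group=100, vsize_mb=300),
    ProcEntry(pid=103, thread_group=100, vsize_mb=300),
]

print("naive sum:  ", naive_total(entries), "MB")
print("grouped sum:", grouped_total(entries), "MB")
```

With these numbers the naive sum is four times the grouped one, which is the kind of gap that would push a job over a 1 GB limit even though it really uses 300 MB.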


> Thanks a lot,
>
> Iwona
>
> Reuti wrote:
>> Hi,
>> On 30.11.2005 at 23:29, Iwona Sakrejda wrote:
>>> Hi,
>>>
>>> I read diligently through this thread and I wanted to add that we
>>> have kernel version 2.4.21-27.0.2 and a system version similar to
>>> the one reported in this thread (SL 3), with SGE 6.0u4, and we
>>> observe memory reporting problems with all threaded applications.
>>>
>>> One of my colleagues wrote a simple program to test it and here
>>> are his conclusions:
>>> "I believe that processes are what is schedulable under Linux and
>>> each thread is reported as a separate process under this Linux
>>> version. However, Linux is also reporting the total memory usage
>>> across all threads for each thread. That is, after each new
>>> thread, the memory usage reported for each thread increases by the
>>> user's stack size ulimit. So, summing across all threads, which is
>>> what I'm guessing SGE is doing, overcounts. The right thing to do
>>> is to determine which processes are actually part of a thread
>>> group, and count memory once. I don't know whether or not this
>>> info is available from the kernel you're running but I'd think
>>> it's worth a query to the SGE people in any case to see what, if
>>> anything, they can do about it."
>> I upgraded all our machines to SuSE 9.3, and tested with a small C
>> program using threads, and there is only one entry in the list of
>> processes. Can you please attach/paste the small program you used?
>> Then I could check this on 9.3.
>>> So could anything be done about this problem? I did not see any
>>> resolution of the e-mail discussion...
>>>
>>> For now I advise users with threaded applications to ask for huge
>>> amounts of memory, if they know what they are doing, to override
>>> my 1GB default, but that's awkward and dangerous...
>> Shouldn't threaded applications be submitted to a PE anyway, so
>> that SGE will multiply the values on its own? I was more concerned
>> about the opposite effect (getting too high limits for threaded
>> applications):
>> http://gridengine.sunsource.net/issues/show_bug.cgi?id=1254
>> -- Reuti
>>>
>>> Thank you...
>>>
>>> Iwona
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>> For additional commands, e-mail: users-help at gridengine.sunsource.net





