[GE users] Large memory consumption of qrsh using builtin method

jlopez jlopez at cesga.es
Wed Feb 11 11:27:15 GMT 2009


Hi,

We have found that qrsh processes using the builtin method each use 
more than 500MB of virtual memory. This means that memory consumption on 
the master node increases rapidly as the number of slaves grows.

Here is an example:
18481 aurelio   15   0  519m 4128 3440 S    0  0.0   0:00.02 qrsh
18482 aurelio   15   0  519m 4128 3440 S    0  0.0   0:00.01 qrsh
18475 aurelio   15   0  519m 3968 3296 S    0  0.0   0:00.01 qrsh
18476 aurelio   15   0  519m 3968 3296 S    0  0.0   0:00.02 qrsh
18477 aurelio   15   0  519m 3968 3296 S    0  0.0   0:00.01 qrsh
18478 aurelio   15   0  519m 3968 3296 S    0  0.0   0:00.01 qrsh
18479 aurelio   15   0  519m 3968 3296 S    0  0.0   0:00.01 qrsh
18480 aurelio   15   0  519m 3968 3296 S    0  0.0   0:00.01 qrsh
18483 aurelio   15   0  519m 3968 3296 S    0  0.0   0:00.02 qrsh
18484 aurelio   15   0  519m 3968 3296 S    0  0.0   0:00.01 qrsh
18485 aurelio   15   0  519m 3968 3296 S    0  0.0   0:00.01 qrsh
18486 aurelio   15   0  519m 3968 3296 S    0  0.0   0:00.01 qrsh
18487 aurelio   15   0  519m 3968 3296 S    0  0.0   0:00.02 qrsh
18488 aurelio   15   0  519m 3968 3296 S    0  0.0   0:00.01 qrsh
18489 aurelio   15   0  519m 3968 3296 S    0  0.0   0:00.00 qrsh

And here is the same job resubmitted, but using ssh to launch the processes:
19560 aurelio   16   0 12240 5152 3920 S    0  0.0   0:00.02 ssh
19561 aurelio   16   0 12240 5152 3920 S    0  0.0   0:00.02 ssh
19562 aurelio   16   0 12240 5152 3920 S    0  0.0   0:00.03 ssh
19563 aurelio   16   0 12240 5152 3920 S    0  0.0   0:00.03 ssh
19564 aurelio   16   0 12240 5152 3920 S    0  0.0   0:00.02 ssh
19565 aurelio   16   0 12240 5152 3920 S    0  0.0   0:00.02 ssh
19566 aurelio   16   0 12240 5152 3920 S    0  0.0   0:00.02 ssh
19567 aurelio   16   0 12240 5152 3920 S    0  0.0   0:00.02 ssh
19568 aurelio   16   0 12240 5152 3920 S    0  0.0   0:00.02 ssh
19569 aurelio   16   0 12240 5152 3920 S    0  0.0   0:00.01 ssh
19570 aurelio   16   0 12240 5152 3920 S    0  0.0   0:00.02 ssh
19571 aurelio   16   0 12240 5152 3920 S    0  0.0   0:00.03 ssh
19572 aurelio   15   0 12240 5152 3920 S    0  0.0   0:00.04 ssh
19573 aurelio   16   0 12240 5152 3920 S    0  0.0   0:00.03 ssh
19574 aurelio   16   0 12240 5152 3920 S    0  0.0   0:00.03 ssh

As can be seen, in the first case the virtual memory consumed by the 
job increases by more than 7GB.
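
For reference, the arithmetic behind that figure can be sketched in a few 
lines of Python. This is only a rough check, assuming that the fifth 
column of the listings above is top's VIRT column, that plain numbers 
there are KiB and an "m" suffix means MiB, and that each listing shows 15 
processes. The helper names are made up for this illustration.

```python
def parse_virt_kib(field):
    """Convert a top VIRT field such as '519m' or '12240' to KiB.

    Assumption: plain numbers are KiB and the suffixes k/m/g scale by 1024.
    """
    units = {"k": 1, "m": 1024, "g": 1024 ** 2}
    suffix = field[-1].lower()
    if suffix in units:
        return float(field[:-1]) * units[suffix]
    return float(field)


def total_virt_gib(virt_field, process_count):
    """Total virtual memory in GiB for identical processes."""
    return parse_virt_kib(virt_field) * process_count / 1024 ** 2


# Values taken from the two listings above (15 processes each).
qrsh_gib = total_virt_gib("519m", 15)
ssh_gib = total_virt_gib("12240", 15)

print(f"qrsh: {qrsh_gib:.2f} GiB, ssh: {ssh_gib:.2f} GiB, "
      f"difference: {qrsh_gib - ssh_gib:.2f} GiB")
# prints "qrsh: 7.60 GiB, ssh: 0.18 GiB, difference: 7.43 GiB"
```

Note that this only adds up virtual address space; the resident size (RES) 
of the qrsh processes in the listing is about 4MB each.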

I don't know if this could be a problem in the version of qrsh 
distributed with GE6.2u1 or if it is related to the use of the builtin 
communication. In GE6.1 we do not see this huge memory consumption of qrsh.

Cheers,
Javier
