[GE users] picking the rank 0 MPI host in a tightly integrated openmpi setup

craffi dag at sonsorol.org
Thu Aug 6 19:41:22 BST 2009


I think I've traced my "can't scale beyond 700 CPUs" problem to an
out-of-memory condition on the MPI rank 0 node.

Basically, the first node selected for the 700-way MPI job (the one
that marshals everything) seems to end up in an OOM situation above a
certain scale.

I've got a mixture of 32GB and 128GB hosts. Is there a way to submit a
tightly integrated openmpi job such that the rank 0 node lands on a
large-memory host while all the other ranks end up on any old type of
machine?

I can't just put in a hard request for 128GB because I've only got 64
slots that meet that criterion. All I want is for the first MPI host to
be one of the 128GB boxes; the remaining hosts can be the 32GB systems.
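
Something along these lines is roughly what I'm hoping exists. This is
just a sketch of the submission I have in mind, assuming a hypothetical
bigmem.q queue defined only on the 128GB hosts, an all.q that spans
every node, an "orte" parallel environment, and a job script called
myjob.sh; I haven't verified that qsub's -masterq option actually
controls where Open MPI puts rank 0 under tight integration, which may
be the implementation-specific part:

    # bigmem.q, all.q, orte and myjob.sh are placeholders for whatever
    # queues/PE/script are actually configured here
    qsub -pe orte 700 -masterq bigmem.q -q all.q,bigmem.q myjob.sh

The idea being that -masterq would pin the master task (where mpirun
runs) to a big-memory box, while -q leaves the remaining slots free to
land anywhere.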

I thought I saw a reference to this in the past, but I think it may be
specific to the MPI implementation.

-Chris
