[GE users] nodes overloaded: processes placed on already full nodes

reuti reuti at staff.uni-marburg.de
Wed Dec 15 14:16:49 GMT 2010


Hi,

On 15.12.2010 at 14:58, steve_s wrote:

> We've been using SGE for a while now and are quite happy with it.
> 
> However, lately we have observed the following. We have a bunch of 8-core
> nodes connected by InfiniBand and run MPI jobs across nodes. We found
> that processes often get placed on full nodes which already have 8 MPI
> processes running. This leaves us with many oversubscribed nodes (load 16
> instead of 8). This happens although there are many empty nodes
> left in the queue. It is almost as if the slots already taken on one
> node are ignored by SGE.

How many slots are defined in the queue definition, and how many queues have you defined?
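
The question matters because, in SGE, the slots value is counted per queue instance: if several queues span the same host and each defines 8 slots, up to 8 processes per queue can be placed there unless a host-level limit (for example `complex_values slots=8` in the exec host configuration) caps the total. `qconf -sql` lists the queues and `qconf -sq <queue>` shows each definition. A minimal Python sketch of that check is given here; the queue names, host names, core counts, and the `find_oversubscribable_hosts` helper are hypothetical and only illustrate the arithmetic, they are not taken from Steve's cluster.

```python
# Hypothetical illustration: sum the slots that every queue offers on each
# host and report hosts where the total exceeds the number of cores.

def find_oversubscribable_hosts(queue_slots, cores_per_host):
    """Return {host: total_slots} for hosts whose summed slots exceed cores.

    queue_slots:    {queue_name: {host_name: slots_in_that_queue_instance}}
    cores_per_host: {host_name: number_of_cores}
    A host missing from cores_per_host is treated as having 0 cores,
    so it is always reported.
    """
    totals = {}
    for per_host in queue_slots.values():
        for host, slots in per_host.items():
            totals[host] = totals.get(host, 0) + slots
    return {
        host: total
        for host, total in totals.items()
        if total > cores_per_host.get(host, 0)
    }


# Made-up example: two queues overlap on node01, each with 8 slots.
queue_slots = {
    "short.q": {"node01": 8, "node02": 8},
    "long.q": {"node01": 8},
}
cores_per_host = {"node01": 8, "node02": 8}

overbooked = find_oversubscribable_hosts(queue_slots, cores_per_host)
print(overbooked)
```

With these made-up numbers, node01 offers 16 slots on 8 cores, which would match the load of 16 described in the quoted message; whether that is the actual cause depends on the answers to the question above.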

-- Reuti


> This is seen with OpenMPI and Intel MPI and with different applications.
> None of the applications uses threading or anything else that would
> create more processes than requested slots.
> 
> Has anybody made similar observations? We would be thankful for any
> hints on how to debug this.
> 
> best,
> Steve
> 
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=305816
> 
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=305818
