[GE users] nodes overloaded: processes placed on already full nodes

reuti reuti at staff.uni-marburg.de
Wed Dec 15 14:16:49 GMT 2010


On 15.12.2010 at 14:58, steve_s wrote:

> We've been using SGE for a while now and are quite happy with it.
> However, lately we have observed the following. We have a bunch of 8-core
> nodes connected by InfiniBand, running MPI jobs across nodes. We found
> that processes often get placed on full nodes which already have 8 MPI
> processes running. This leaves us with many oversubscribed nodes (load 16
> instead of 8). This happens although there are many empty nodes
> left in the queue. It is almost as if the slots already taken on one
> node are ignored by SGE.

How many slots are defined in the queue definition, and how many queues do you have defined?
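
The reason for asking: if several queues span the same hosts and each one defines its own slot count, the per-host total can exceed the number of cores. Below is a minimal sketch of that arithmetic in Python. It assumes the queue's "slots" value has the form "default,[host=n],..." and that every queue spans every listed host. The queue and host names are made up for illustration, and the function names are hypothetical, not part of SGE.

```python
import re


def parse_slots(value):
    """Parse a 'slots' value such as '8,[node01=4]' into (default, overrides).

    Assumption: the value is a default count, optionally followed by
    comma-separated per-host overrides of the form [host=n].
    """
    default_part, *rest = value.split(",")
    overrides = {}
    for item in rest:
        match = re.fullmatch(r"\[(.+)=(\d+)\]", item.strip())
        if match:
            overrides[match.group(1)] = int(match.group(2))
    return int(default_part), overrides


def slots_per_host(queues, hosts):
    """Sum the slots each host offers across all queues.

    Assumption: every queue in `queues` spans every host in `hosts`.
    """
    totals = {host: 0 for host in hosts}
    for slots_value in queues.values():
        default, overrides = parse_slots(slots_value)
        for host in hosts:
            totals[host] += overrides.get(host, default)
    return totals


# Hypothetical setup: two queues on the same 8-core nodes.
queues = {"all.q": "8", "long.q": "8,[node01=4]"}
totals = slots_per_host(queues, ["node01", "node02"])
for host, total in totals.items():
    print(host, total, "oversubscribed" if total > 8 else "ok")
```

In this made-up setup both nodes offer more than 8 slots in total, so they could end up with more processes than cores even though no single queue is overfilled.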

-- Reuti

> This is seen with OpenMPI and Intel MPI and with different applications.
> No application does threading or anything else that would create more
> processes than requested slots.
> Has anybody made similar observations? We would be thankful for any hints on
> how to debug this.
> best,
> Steve
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=305816
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


