[GE users] process to core distribution
joseph.hargitai at nyu.edu
Fri Oct 17 15:12:53 BST 2008
----- Original Message -----
From: Reuti <reuti at staff.uni-marburg.de>
Date: Friday, October 17, 2008 9:37 am
Subject: Re: [GE users] process to core distribution
> On 17 Oct 2008, at 12:48, Joseph Hargitai wrote:
> > Running parallel jobs:
> > with openmpi the distribution of processes on 16-core nodes is
> > even when using a pe that allows only 8 or 4 cores. Meaning you get
> > 2 processes on each of the cpus.
> Can you explain this in more detail? I don't get it.
We have a set of PEs for Open MPI 1.x as follows: orte-4, orte-8, orte-12 - to allow fewer than all 16 cores to be used on our 16-core nodes (4 quad-core CPUs).
here is orte-8
When you run a job with orte-8 and 16 slots, the 16 processes get distributed evenly across the nodes -
8 and 8. Within each node the processes are also evenly distributed: each of the 4 CPUs gets 2 processes.
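(For context, the per-host cap in such a PE comes from allocation_rule. The following is a hypothetical sketch of what a PE like this typically looks like - not the actual orte-8 definition from our cluster:)

```shell
# Hypothetical PE definition (illustrative only, not the real orte-8).
# "allocation_rule 8" tells SGE to hand out at most 8 slots per host,
# so a 16-slot job spans two 16-core nodes with 8 processes each.
#   qconf -sp orte-8
pe_name            orte-8
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    8
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
```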
Whereas with our mvapich configuration:
start_proc_args /opt/gridengine/mpi/startmpi.sh -catch_rsh $pe_hostfile
an mvapich-8 16 invocation distributes 8 and 8 between the nodes as it should, but
within each node the first two CPUs get all 8 processes and nothing runs on the remaining two CPUs.
We are aware of the option to use numactl. We were just wondering what causes Open MPI to do the mapping differently from MVAPICH.
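(If we do go the numactl route, something like the following wrapper is what we had in mind - a sketch only. It assumes the MPI library exports the node-local rank in an environment variable, here called LOCAL_RANK; the real variable name differs by MPI implementation and version:)

```shell
# Hypothetical per-rank binding wrapper for numactl (sketch).
# LOCAL_RANK is an assumed variable name - substitute whatever your
# MPI implementation actually exports for the node-local rank.
LOCAL_RANK=${LOCAL_RANK:-0}
NCORES=16                       # cores per node on our 4x quad-core boxes
CORE=$(( LOCAL_RANK % NCORES )) # pin rank N to core N on its node
echo "would run: numactl --physcpubind=$CORE $*"
# in the real wrapper: exec numactl --physcpubind=$CORE "$@"
```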
ps - also, you mentioned the possibility of CPUs already being used by other jobs, which is not the case here. What is the easiest SGE way to make nodes job-exclusive? We do not want parallel jobs to mix; they need to own their nodes while running, even when not using all cores.
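(One approach we have seen suggested for this - sketched below with hypothetical names, exact steps depend on the SGE version - is a per-host consumable complex that every parallel job requests:)

```shell
# Sketch: node exclusivity via a consumable complex named "exclusive"
# (the name and values here are illustrative, not from our config).
# 1. Add a consumable to the complex list (qconf -mc), e.g. a line like:
#      exclusive   excl   INT   <=   YES   YES   0   0
# 2. Give each exec host exactly one unit of it (qconf -me <host>):
#      complex_values   exclusive=1
# 3. Have parallel jobs request it at submit time:
qsub -pe orte-8 16 -l exclusive=1 job.sh
# Caveat: this only keeps apart jobs that actually request the complex;
# jobs submitted without -l exclusive=1 can still land on those hosts.
```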
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net