[GE users] SGE/OpenMPI - all MPI tasks run only on a single node

k_clevenger kclevenger at coh.org
Wed Dec 16 23:30:37 GMT 2009


> > # which mpiexec
> > /opt/openmpi-1.3.3/bin/mpiexec
> >
> > # ls -l /opt/openmpi-1.3.3/bin/mpiexec
> > lrwxrwxrwx 1 root root 7 Nov  6 13:57 /opt/openmpi-1.3.3/bin/mpiexec -> orterun
> >
> > # ldd /opt/openmpi-1.3.3/bin/orterun
> >   libopen-rte.so.0 => /opt/openmpi-1.3.3/lib/libopen-rte.so.0 (0x00002aaaaaaad000)
> >   libopen-pal.so.0 => /opt/openmpi-1.3.3/lib/libopen-pal.so.0 (0x00002aaaaacf4000)
> >   libdl.so.2 => /lib64/libdl.so.2 (0x0000003d2ec00000)
> >   libnsl.so.1 => /lib64/libnsl.so.1 (0x0000003d31c00000)
> >   libutil.so.1 => /lib64/libutil.so.1 (0x0000003d3b600000)
> >   libm.so.6 => /lib64/libm.so.6 (0x0000003d2f000000)
> >   libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003d2f400000)
> >   libc.so.6 => /lib64/libc.so.6 (0x0000003d2e800000)
> >   /lib64/ld-linux-x86-64.so.2 (0x0000003d2e400000)
> 
> What is the output when you run this inside a job script (and also an
> ldd hello_c)? Depending on the .bashrc, the paths could be different
> inside a job script.

ldd /opt/openmpi-1.3.3/bin/orterun from within a job
  libopen-rte.so.0 => /opt/openmpi-1.3.3/lib/libopen-rte.so.0 (0x00002aaaaaaad000)
  libopen-pal.so.0 => /opt/openmpi-1.3.3/lib/libopen-pal.so.0 (0x00002aaaaacf4000)
  libdl.so.2 => /lib64/libdl.so.2 (0x0000003de3800000)
  libnsl.so.1 => /lib64/libnsl.so.1 (0x000000330c000000)
  libutil.so.1 => /lib64/libutil.so.1 (0x000000330a800000)
  libm.so.6 => /lib64/libm.so.6 (0x0000003de4800000)
  libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003de3c00000)
  libc.so.6 => /lib64/libc.so.6 (0x0000003de3400000)
  /lib64/ld-linux-x86-64.so.2 (0x0000003de3000000)

ldd ./hello_c from within a job
  libmpi.so.0 => /opt/openmpi-1.3.3/lib/libmpi.so.0 (0x00002aaaaaaad000)
  libopen-rte.so.0 => /opt/openmpi-1.3.3/lib/libopen-rte.so.0 (0x00002aaaaad50000)
  libopen-pal.so.0 => /opt/openmpi-1.3.3/lib/libopen-pal.so.0 (0x00002aaaaaf97000)
  libdl.so.2 => /lib64/libdl.so.2 (0x0000003de3800000)
  libnsl.so.1 => /lib64/libnsl.so.1 (0x000000330c000000)
  libutil.so.1 => /lib64/libutil.so.1 (0x000000330a800000)
  libm.so.6 => /lib64/libm.so.6 (0x0000003de4800000)
  libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003de3c00000)
  libc.so.6 => /lib64/libc.so.6 (0x0000003de3400000)
  /lib64/ld-linux-x86-64.so.2 (0x0000003de3000000)
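
The load addresses in parentheses change from run to run, so the listings above are easier to compare once they are stripped. A minimal sketch, assuming bash and a sed that supports -E; the function names strip_addrs and ldd_paths are made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: remove the trailing load address, e.g.
# " (0x00002aaaaaaad000)", from each line of ldd output read on stdin.
strip_addrs() {
    sed -E 's/[[:space:]]*\(0x[0-9a-fA-F]+\)[[:space:]]*$//'
}

# Hypothetical helper: print the libraries a binary resolves to,
# without the load addresses, so two runs can be diffed.
ldd_paths() {
    ldd "$1" 2>&1 | strip_addrs
}
```

For example, "ldd_paths /opt/openmpi-1.3.3/bin/orterun > login.txt" on the command line and the same call redirected to job.txt inside a job script, followed by "diff login.txt job.txt", should show no difference if both resolve the same libraries.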

> If you want to avoid dynamic binaries: I prefer to compile Open MPI
> with --enable-static --disable-shared
> 
 
The environment should be OK across the cluster: I have a standard .bashrc include that sets the PATH, LD_LIBRARY_PATH, etc. variables for all the apps. The relevant parts of the user environment look like:

LANG=en_US.UTF-8
LD_LIBRARY_PATH=/opt/sge-6_2u4/lib/lx24-amd64:/opt/openmpi-1.3.3/lib:...
MPI_HOME=/opt/openmpi-1.3.3
OPENMPI_HOME=/opt/openmpi-1.3.3
PATH=/opt/sge-6_2u4/bin/lx24-amd64:/opt/openmpi-1.3.3/bin:...
SGE_CELL=default
SGE_CLUSTER_NAME=suncluster
SGE_EXECD_PORT=6445
SGE_QMASTER_PORT=6444
SGE_ROOT=/opt/sge-6_2u4
SHELL=/bin/bash
SHLVL=1
TMPDIR=/tmp
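
To confirm that a job sees the same values on every node, something like the following could be run from within a job script. It is only a sketch: the function name relevant_env is made up, and the variable list just mirrors the excerpt above.

```shell
#!/bin/bash
# Hypothetical helper: print only the variables relevant to SGE and
# Open MPI, sorted by name so output from two nodes can be diffed.
relevant_env() {
    env | grep -E '^(PATH|LD_LIBRARY_PATH|MPI_HOME|OPENMPI_HOME|SGE_[A-Z_]+)=' | sort
}

# Label the output with the node name so results from different
# execution hosts can be told apart.
echo "== $(uname -n) =="
relevant_env
```

Running this once in a login shell and once inside a job on each execution host gives listings that can be compared directly.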

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=233826
