[GE users] qsub not finding lam

Margaret Doll Margaret_Doll at brown.edu
Mon Jun 4 18:58:19 BST 2007


I am having trouble getting qsub to work after rebuilding the mpich compilers.

Created lamhosts in my home directory.

recon -v lamhosts
n-1<4958> ssi:boot:base:linear: booting n0 (ted.chem.brown.edu)
n-1<4958> ssi:boot:base:linear: booting n1 (compute-0-0)
n-1<4958> ssi:boot:base:linear: booting n2 (compute-0-1)
n-1<4958> ssi:boot:base:linear: booting n3 (compute-0-2)
n-1<4958> ssi:boot:base:linear: booting n4 (compute-0-3)
n-1<4958> ssi:boot:base:linear: finished

lamboot -v lamhosts was successful.

tping -c1 N
   1 byte from 4 remote nodes and 1 local node: 0.001 secs

1 message, 1 byte (0.001K), 0.001 secs (1.366K/sec)
roundtrip min/avg/max: 0.001/0.001/0.001


mpif90 mpiprogram.f -o mpiprogram

where mpif90 was built on top of the Portland Group compilers with the
following options:

export CC=pgcc
export CXX=pgCC
export F77=pgf77
export F90=pgf90
./configure -c++=pgCC -cc=pgcc -fc=pgf77 \
  -cflags="-Msignextend -tp px -DUSE_U_INT_FOR_XDR -DHAVE_RPC_RPC_H=1" \
  -opt=-fast -fflages="-tp px" \
  -c++flags="-tp px" -f90flags="-tp px" -f90=pgf90 --enable-f77 --enable-f90modules \
  --with-comm=shared


qsub -q test.q -pe mpich 8 ./mpishell
Your job 265 ("mpishell") has been submitted

where mpishell contains

#!/bin/bash
#$ -cwd
mpirun mpi -c 8 -machinefile /home/mad/machines /home/mad/freeman/mpiprogram


and machines contains

compute-0-3
compute-0-3
compute-0-3
compute-0-3

compute-0-3 is in the test.q queue and has 8 slots (processors).
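
The machines file above lists four entries, while the job requests 8 slots. Whether that mismatch matters depends on the MPI implementation, but it is easy to check from inside the job script. A minimal sketch, assuming bash and awk are available on the compute node; count_machine_entries and the MACHINEFILE variable are hypothetical names used only for illustration, and NSLOTS is the slot count Grid Engine exports to parallel jobs:

```shell
#!/bin/bash
# Count usable entries (non-blank, non-comment lines) in a machine file.
# count_machine_entries is a hypothetical helper, not part of SGE, LAM or MPICH.
count_machine_entries() {
    awk 'NF && $1 !~ /^#/ { n++ } END { print n + 0 }' "$1"
}

# Example check. The default path and the fallback of 8 are taken from the post;
# MACHINEFILE is an assumed override variable, NSLOTS is set by Grid Engine.
machinefile=${MACHINEFILE:-/home/mad/machines}
requested=${NSLOTS:-8}
if [ -r "$machinefile" ]; then
    found=$(count_machine_entries "$machinefile")
    if [ "$found" -lt "$requested" ]; then
        echo "machine file has $found entries, job requests $requested" >&2
    fi
fi
```

Any warning ends up in the job's error file (mpishell.e<jobid>), next to whatever mpirun reports.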

The program does not run when submitted with "qsub":


more mpishell.e266
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host compute-0-3.local.

This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "mpirun" command.

Please run the "lamboot" command the start the LAM/MPI runtime
environment.  See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
-----------------------------------------------------------------------------

lamd is running on the head node and the compute nodes.

Running mpirun directly from the command line works, but I need it to work through qsub.

mpirun -v -np 8 mpiprogram
running /home/mad/freeman/mpiprogram on 8 LINUX ch_p4 processors
Created /home/mad/freeman/PI5031
MY ID is            0 OUT OF            8 BELONGING TO GROUP
MY ID is            6 OUT OF            8 BELONGING TO GROUP
MY ID is            4 OUT OF            8 BELONGING TO GROUP
MY ID is            2 OUT OF            8 BELONGING TO GROUP
MY ID is            5 OUT OF            8 BELONGING TO GROUP
MY ID is            7 OUT OF            8 BELONGING TO GROUP
MY ID is            1 OUT OF            8 BELONGING TO GROUP
MY ID is            3 OUT OF            8 BELONGING TO GROUP
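
The error in the job's .e file is printed by LAM's mpirun, while the interactive run above reports ch_p4, which is an MPICH device. That suggests, though it does not prove, that the batch job and the login shell resolve different mpirun binaries. A minimal sketch for checking this from inside the job script, assuming bash; resolve_cmd is a hypothetical helper name used only for illustration:

```shell
#!/bin/bash
# Report which executable a command name resolves to in the current environment.
# Prints the resolved path, or "not found" if the name is not on PATH.
# resolve_cmd is a hypothetical helper, not part of SGE, LAM or MPICH.
resolve_cmd() {
    local path
    if path=$(command -v "$1" 2>/dev/null); then
        echo "$path"
    else
        echo "not found"
    fi
}

# Placed in the job script before the mpirun line, this records the
# environment the batch job actually sees.
echo "host:   $(uname -n)"
echo "PATH:   $PATH"
echo "mpirun: $(resolve_cmd mpirun)"
```

Running the same three echo lines in the login shell and comparing them with the job's output file (mpishell.o<jobid>) would show whether the two environments agree.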






