[GE users] SGE and MVAPICH weirdness

jkeener jkeener at psc.edu
Tue Feb 17 17:49:20 GMT 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

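Hello, we have a cluster with InfiniBand, and I'm trying to set up SGE
and MPI (MVAPICH) on it.  A simple MPI test fails when submitted with
qsub but works when run interactively under qrsh.  Here is a basic
synopsis of the problem:
jkeener@xxxxx:~$ cat simple_mpi.sh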
#!/bin/sh
#$ -pe mvapich 5
#$ -S /bin/sh

mpirun -np $NSLOTS -hostfile $TMP/machines /bin/date
jkeener@xxxxx:~$ qsub simple_mpi.sh
Your job 191 ("simple_mpi.sh") has been submitted
jkeener@xxxxx:~$ qstat
jkeener@xxxxx:~$ cat simple_mpi.sh.o191
cleanup
jkeener@xxxxx:~$ cat simple_mpi.sh.e191

Child exited abnormally!
Killing remote processes...DONE
jkeener@xxxxx:~$ qrsh -pe mvapich 5 /bin/bash
/yyyyy/packages/mvapich2-1.2p1/bin/mpirun_rsh -ssh -np $NSLOTS -hostfile $TMP/machines /bin/date
Tue Feb 17 12:29:21 EST 2009
Tue Feb 17 12:28:27 EST 2009
Tue Feb 17 12:31:31 EST 2009
Tue Feb 17 12:30:35 EST 2009
Tue Feb 17 12:29:46 EST 2009

We have things set up as described in the MVAPICH integration howto:
http://gridengine.sunsource.net/howto/mvapich/MVAPICH_Integration.html
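
One difference visible in the transcript is that the batch script calls a
bare mpirun, while the interactive test uses the full path to mpirun_rsh
with -ssh. A minimal diagnostic sketch for checking what the batch
environment actually sees, assuming the same mvapich PE and that the PE
start script writes $TMP/machines as in the howto (the function name
report_env is made up for illustration, and this has not been run on the
cluster):

```shell
#!/bin/sh
#$ -pe mvapich 5
#$ -S /bin/sh

# Diagnostic only: print what the job environment sees instead of launching MPI.
# report_env is a hypothetical helper name, not part of SGE or MVAPICH.
report_env() {
    echo "PATH=$PATH"
    # Which executable does a bare "mpirun" resolve to in this environment, if any?
    echo "mpirun resolves to: $(command -v mpirun || echo 'not found')"
    # NSLOTS is set by SGE for parallel jobs; report it if present.
    echo "NSLOTS=${NSLOTS:-unset}"
    # The machines file is assumed to be created by the PE start procedure.
    if [ -r "$TMP/machines" ]; then
        echo "machines file:"
        cat "$TMP/machines"
    else
        echo "machines file missing: $TMP/machines"
    fi
}

report_env
```

Submitting this with qsub and comparing its output file against the same
commands typed inside the qrsh session should show whether the two
environments differ in PATH, launcher, or host list.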

Any help or hints would be greatly appreciated.

Jim
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)

iEYEARECAAYFAkma+KAACgkQV/izUbbBb00p0wCeMLfkZDqpjO12nyVCmO4pVXI/
vVkAn1GS4p0cCTLUJdiDt4QfgC7Nox8q
=Yc8r
-----END PGP SIGNATURE-----

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=108280
