[GE users] Qrsh qlogin with ssh on RedHat Linux

Lewis, Daniel (IS Consultant) DLewis at consultantemail.com
Wed May 14 21:56:12 BST 2008


SGE 6.1u3 Master on Solaris 10 with SunHPC 7.1
SGE 6.1u3 Exec on Linux (RHEL 5.1) with OpenMPI 1.2.5
OpenMPI 1.2.5 was compiled from source with the --with-sge flag, and
ompi_info shows gridengine support.
linuxcluster.q contains only the Linux cluster hosts.
The Linux environment is a Beowulf-style cluster with two interfaces,
one of them on a subnet shared with the SGE master.

In this SGE - OpenMPI mixed environment I'm trying to troubleshoot
process spawning from within an MPI-enabled application. I can exercise
"mpirun" all day without any issues, interactively, but spawning from
within the application results in "A daemon failed to start ..."

I've attempted to use ssh instead of rsh in case that is a factor.
However, the directions for replacing rsh with ssh do not work on Red Hat
Linux. For one thing, the directions specify "sshd -i" so that sshd
will start via xinetd, but on RH sshd is not started via xinetd. In my
attempts to set that up, sge_execd segfaults on the target machine and I
get the indication that something is trying to start another sshd on
port 22. What are the proper values for rlogin_daemon, rlogin_command,
etc., for a Linux config?

"May 14 12:43:50 marl-clus1-cn08 sshd[3121]: error: Bind to port 22 on failed: Address already in use."

Note, ssh by itself works just fine throughout the environment.
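
For reference, the howto's approach amounts to pointing the remote-startup
entries of the cluster configuration (edited with "qconf -mconf") at ssh
and at sshd in inetd mode. The fragment below is only a sketch of what
those entries look like. The paths /usr/bin/ssh and /usr/sbin/sshd are
assumptions for a stock RHEL install, not values confirmed on this
cluster, and the qlogin_* entries are left out.

```
rsh_command      /usr/bin/ssh
rsh_daemon       /usr/sbin/sshd -i
rlogin_command   /usr/bin/ssh
rlogin_daemon    /usr/sbin/sshd -i
```

With -i, sshd serves the connection it is handed on stdin/stdout and does
not listen on port 22 itself.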

Has anyone revised the directions at
http://gridengine.sunsource.net/howto/qrsh_qlogin_ssh.html to work for
RH Linux?

Dan L.
