[GE users] Qrsh qlogin with ssh on RedHat Linux

Reuti reuti at staff.uni-marburg.de
Thu May 15 08:18:29 BST 2008


Hi,

On 14.05.2008 at 22:56, Lewis, Daniel ((IS Consultant)) wrote:
> SGE 6.1u3 Master on Solaris 10 with SunHPC 7.1
> SGE 6.1u3 Exec on Linux (RHEL 5.1) with OpenMPI 1.2.5
> OpenMPI 1.2.5 was compiled from source with the --with-sge flag,  
> ompi-info shows gridengine support.
> linuxcluster.q contains only the Linux cluster hosts
> Linux environment is a Beowulf-style cluster, two interfaces  
> including one on a subnet shared by the SGE master.
>
> In this SGE - OpenMPI mixed environment I'm trying to troubleshoot  
> process spawning from within an MPI-enabled application. I can  
> exercise "mpirun" all day without any issues, interactively, but  
> spawning from within the application results in "A daemon failed to  
> start ?"
>
> I've attempted to use ssh instead of rsh just in case that is a
> factor - however, the directions for replacing rsh with ssh do not
> work on RedHat Linux - for one thing, the directions specify
> "sshd -i" so that sshd will start via xinetd - on RH sshd is not
> started via xinetd. In
>
The -i only changes the behavior of sshd; you don't need xinetd at
all in the cluster. SGE starts one sshd per qrsh command on a random
port, hence there shouldn't be any firewall on the machines. As Open
MPI is running interactively, I assume this is already the case. You
could check this on a node with "ps -e f": the sshd should be a
child of sge_shepherd.
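
A minimal sketch of that check as a script, in case eyeballing the process tree is awkward. It assumes the shepherd shows up in ps with a command name starting with "sge_shepherd" and the per-session daemon as "sshd"; the function name is made up for illustration.

```shell
# Reads "PID PPID COMMAND" lines on stdin and prints the PID of every
# sshd whose parent process is an sge_shepherd.
# (Hypothetical helper; process names are assumptions, see above.)
sshd_under_shepherd() {
    awk '
        { ppid[$1] = $2; comm[$1] = $3 }
        END {
            for (p in comm)
                if (comm[p] == "sshd" && (ppid[p] in comm) &&
                    comm[ppid[p]] ~ /^sge_shepherd/)
                    print p
        }
    '
}
```

On a node with an open qrsh session it could be fed like this: ps -e -o pid=,ppid=,comm= | sshd_under_shepherd. No output would mean no sshd is running under a shepherd at that moment.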

It's possible to run a cluster without rsh or ssh daemons running
all the time, and to force users to go through qlogin/qrsh even for
an interactive login on a node.
> my attempts to set that up, sge_execd segfaults on the target machine
> and I get the indication that something is trying to start another  
> sshd on port 22. What is the proper value for rlogin daemon and  
> rlogin command, etc, for a Linux config?
>
It should use another port. I wonder how it's still trying to use  
port 22.
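
One thing worth ruling out (an assumption on my side, nothing in this thread confirms it): sshd started without -i runs as a standalone daemon and tries to bind its configured port itself, which is 22 by default. The sketch below scans configuration text for daemon entries that point to sshd but lack -i. The function name is hypothetical, and it assumes the usual "parameter value" layout of the cluster configuration.

```shell
# Reads SGE configuration text on stdin and reports rsh/rlogin/qlogin
# daemon entries that call sshd without the -i option.
# Returns non-zero if any such entry is found.
# (Hypothetical helper; assumes one "name value..." pair per line.)
check_sshd_inetd_flag() {
    awk '
        BEGIN { bad = 0 }
        $1 ~ /^(rsh|rlogin|qlogin)_daemon$/ && $2 ~ /sshd$/ {
            ok = 0
            for (i = 3; i <= NF; i++)
                if ($i == "-i") ok = 1
            if (!ok) { print $1 ": sshd configured without -i"; bad = 1 }
        }
        END { exit bad }
    '
}
```

It could be run against the global configuration with: qconf -sconf global | check_sshd_inetd_flag. A host-specific configuration for the node may override the global one, so that would be worth checking the same way.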

-- Reuti

> "May 14 12:43:50 marl-clus1-cn08 sshd[3121]: error: Bind to port 22  
> on 0.0.0.0 failed: Address already in use."
>
> Note, ssh by itself works just fine throughout the environment.
>
> Has anyone revised the directions at
> http://gridengine.sunsource.net/howto/qrsh_qlogin_ssh.html to work
> for RH Linux?
>
> Thanks,
> Dan L.
>

