[GE users] SSH connection refused - intermittent

reuti reuti at staff.uni-marburg.de
Sat Mar 20 22:30:45 GMT 2010


Hi,

On 15.03.2010 at 16:01, giftedplacebo wrote:

> We have been running SGE for several years now, currently running
> 6.0u10. We recently started seeing lots of ssh failures (5-9%) like
> the following:
>
> ssh: connect to host grid057.<mydomain>.com port 45364: Connection refused

Is the execd running as a non-root user on some machines? Or is this
happening on all machines in the cluster, and not only on certain ones?
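
One way to answer the "all machines or only certain ones" question is to tally the failures per target host from the job log files. The following is only a minimal sketch: it assumes each failure appears on a single unwrapped line in the form quoted below, and the host names in `sample` are hypothetical placeholders, not real log data.

```python
import re
from collections import Counter

# Matches the client-side failure line quoted in this thread
# (assumption: the message is not wrapped across lines in the job logs).
REFUSED = re.compile(r"ssh: connect to host (\S+) port (\d+): Connection refused")


def refused_by_host(lines):
    """Count 'Connection refused' failures per target host."""
    counts = Counter()
    for line in lines:
        match = REFUSED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    "ssh: connect to host grid057.example.com port 45364: Connection refused",
    "ssh: connect to host grid012.example.com port 51022: Connection refused",
    "ssh: connect to host grid057.example.com port 40211: Connection refused",
    "job finished normally",
]

for host, count in refused_by_host(sample).most_common():
    print(host, count)
```

If the counts are spread evenly over the hosts, the problem is cluster-wide; if a few hosts dominate, those are the ones to check first.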

-- Reuti


> grid057 appears to accept the connection; this is the corresponding
> /var/log/messages entry:
>
> Mar  9 06:36:15 grid057 sshd[14936]: Accepted publickey for <username> from 172.16.14.157 port 45364 ssh2
>
> (<mydomain> and <username> have been removed for privacy.)
>
> On all grid nodes I have SELinux and iptables disabled.
>
> sshd is running with the following /etc/ssh/sshd_config:
>
> X11Forwarding yes
> PrintMotd no
> MaxStartups 10000:1:10000
> Subsystem       sftp    /usr/libexec/openssh/sftp-server
>
> /etc/ssh/ssh_config on all nodes is:
>
> Host *
>    RhostsRSAAuthentication yes
>    StrictHostKeyChecking no
>    ConnectionAttempts 20
>
> I have also set the following to 3000:
>
> /proc/sys/net/core/netdev_max_backlog
> /proc/sys/net/core/somaxconn
>
> The problem is across all machines, and only affects ~5-9% of ssh  
> connections. I don't see any error messages on the machines, just  
> the ssh failure notice in our job log files. Does anyone have ideas  
> or tips on tuning ssh/sshd? Thanks!
>
> Best regards,
> Aaron
>
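
For reference, the `MaxStartups 10000:1:10000` line quoted above uses the `start:rate:full` form described in the sshd_config man page: below `start` unauthenticated connections nothing is refused, at `start` the refusal probability is `rate`/100, and it rises linearly to 100% at `full`. The sketch below only models that description; the function name `drop_probability` is made up for illustration. Note that the refused connection in the error is to port 45364, and the quoted sshd_config sets no `Port`, so it is not established that this setting is involved at all.

```python
def drop_probability(unauthenticated, start, rate, full):
    """Probability that sshd refuses a new connection, following the
    MaxStartups start:rate:full description in the sshd_config man page.

    unauthenticated: current number of unauthenticated connections
    start, rate, full: the three colon-separated MaxStartups values
    """
    if unauthenticated < start:
        return 0.0          # below the threshold nothing is dropped
    if unauthenticated >= full:
        return 1.0          # at or above 'full' everything is refused
    # Linear rise from rate% at 'start' to 100% at 'full'.
    span = full - start
    percent = rate + (100 - rate) * (unauthenticated - start) / span
    return percent / 100.0


# Man page example "10:30:60".
print(drop_probability(10, 10, 30, 60))
print(drop_probability(35, 10, 30, 60))

# Setting quoted in this thread: no early drop below 10000 connections.
print(drop_probability(9999, 10000, 1, 10000))
print(drop_probability(10000, 10000, 1, 10000))
```

With `start` equal to `full`, as in the quoted setting, there is no random early drop: sshd accepts everything until 10000 unauthenticated connections are pending, which makes it an unlikely cause of a steady 5-9% failure rate.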

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=250055
