[GE users] ssh_exchange_identification

Mathias Goldau Mathias.Goldau at gmx.de
Thu Aug 9 08:24:50 BST 2007

Reuti wrote:
> On 08.08.2007 at 16:31, Mathias Goldau wrote:
>> * As an ordinary user I can't do a "qrsh hostname", but as root all is
>> well. The error message, which I provide again, is short:
>> -sh-3.00$ qrsh -l hostname=node23 -verbose hostname
>> local configuration frontend not defined - using global configuration
>> Your job 160386 ("hostname") has been submitted
>> waiting for interactive job to be scheduled ...
>> Your interactive job 160386 has been successfully scheduled.
>> Establishing /usr/bin/ssh session to host node23 ...
>> qrsh_starter: executing child process (null) failed: No such file or
>> directory
>> /usr/bin/ssh exited with exit code 0
>> reading exit code from shepherd ... 1
> You mean:
> qrsh /bin/hostname
> is giving the same "No such file or directory" error message from
> qrsh_starter? At least your ssh setup seems to be working, as otherwise
> qrsh_starter wouldn't come up.

Yes, it comes up with the same error message.

> OTOH: If you ssh by hand to this node, does the "hostname" command work
> for these normal/system users?

-sh-3.00$ ssh node23
Last login: Wed Aug  8 15:58:03 2007 from frontend
-sh-3.00$ /bin/hostname
-sh-3.00$ which hostname

> qrsh /bin/echo \$PATH
> Does that print anything?

-sh-3.00$ qrsh /bin/echo \$PATH
qrsh_starter: executing child process (null) failed: No such file or

and "qrsh env" produces the same.

>>> - Can you please check in the messages file of the node, whether sshd
>>> was really set up in the last change of the configuration?
>> I did a "tail -f /var/log/messages" if you mean that and got the
>> following:
>> [...]
> No, I meant $SGE_ROOT/default/spool/node23/messages

Whoa, something is showing up there: every time I invoke a "qrsh -l
hostname=node23 hostname", something like this is produced:

08/09/2007 08:56:14|execd|node23|W|reaping job "160397" ptf complains:
Job does not exist
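
To pull only the entries for one job out of that file, something like the following might do. It is a minimal sketch, assuming each entry sits on a single line with the fields date/time, daemon, host, level and message separated by "|", as in the line above. The function name job_messages is made up for illustration and is not part of SGE.

```shell
#!/bin/sh
# Hypothetical helper (not an SGE tool): print the entries of a
# messages file whose message text mentions the quoted job id.
# Assumes the layout  date time|daemon|host|level|message
job_messages() {
    # $1 = job id, $2 = path to the messages file
    awk -F'|' -v id="$1" '
        # match "<id>" including the double quotes, as in: reaping job "160397"
        index($5, "\"" id "\"") { print $1 " [" $4 "] " $5 }
    ' "$2"
}
```

It would be called as: job_messages 160397 $SGE_ROOT/default/spool/node23/messages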

> After you change the SGE configuration with qconf -mconf, it will update
> to reflect the actual settings.

Yes, the messages file for node23 updates itself to reflect the
global configuration, if that is what you meant.

>> Could it be that my sge_shepherd is configured to work only with ldap?
>> My ordinary user isn't an ldap user. It is just a system user in
> What is in /etc/nsswitch.conf? It is possible to rely on NIS and then
> skip the local files altogether.

[root@frontend ~]# cat /etc/nsswitch.conf
passwd:     files ldap
shadow:     files ldap
group:      files ldap

hosts:      files dns

bootparams: files
ethers:     files
netmasks:   files
networks:   files
protocols:  files ldap
rpc:        files
services:   files ldap
netgroup:   files ldap
publickey:  files
automount:  files ldap
aliases:    files
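
The same check could be repeated on node23 itself. Below is a minimal sketch, with the function names nss_sources and user_resolves invented for illustration: the first prints the lookup order configured for one database in an nsswitch.conf-style file, the second asks the name service switch via getent whether an account is visible at all.

```shell
#!/bin/sh
# Hypothetical helpers, not part of SGE or glibc.

# Print the sources configured for a database (e.g. passwd), one per line.
nss_sources() {
    # $1 = database name, $2 = file to read (defaults to /etc/nsswitch.conf)
    awk -v db="$1:" '$1 == db { for (i = 2; i <= NF; i++) print $i }' \
        "${2:-/etc/nsswitch.conf}"
}

# Succeed if the account can be resolved through the name service switch.
user_resolves() {
    # $1 = account name
    getent passwd "$1" >/dev/null 2>&1
}
```

On the node, "nss_sources passwd" should list "files" if local accounts are consulted, and "user_resolves someuser && echo found" shows whether the ordinary user is known there (someuser is a placeholder for the real account name).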

>>> - While waiting for the return of "qrsh hostname": can you login to the
>>> node and check with "ps -e f" whether there was anything started by the
>>> shepherd?
>> headnode: qrsh -l hostname=node23 -verbose hostname
>> node23:   watch -n 0 "ps -e f | grep shepherd"
> Yes, thanks, but this way we don't see any children of the shepherd. Just
> copy the relevant lines from a "ps f -eo pid,ppid,pgrp,user,ruser,command"

 3303     1  3303 root     root     /opt/sge/bin/lx24-amd64/sge_execd
11907  3303 11907 root     root      \_ sge_shepherd-160403 -bg
11908 11907 11908 root     root          \_ sshd: me [priv]
11918 11908 11908 sshd     sshd              \_ sshd: me [net]

Thanks for your help.

