[GE users] qrsh fails

Jean-Paul Minet minet at cism.ucl.ac.be
Fri Jan 13 16:49:04 GMT 2006

I am trying to get tight integration to work (MPICH 1.2.6 and SGE 6.0u6) and am 
running into a problem with qrsh.  Debugging it separately from the integration, 
I get "poll: protocol failure in circuit setup" on the host initiating the 
qrsh (see below).  On the target host, I get the following weird messages:

Message from syslogd@lmexec-92 at Fri Jan 13 10:47:21 2006 ...
lmexec-92 kernel: Oops: 0000 [2] SMP

Message from syslogd@lmexec-92 at Fri Jan 13 10:47:21 2006 ...
lmexec-92 kernel: CR2: 0000000000000108

We run SUSE 9.0 (kernel 2.6.5-7.97-smp) on Sun V20z machines (dual Opteron).

Does anyone have an idea how to debug this further?  I have tried running 
tcpdump between the submit host and the target host, and between the qmaster 
host and the target host, to look at the communication, but it is getting 
complicated.

Thanks for any help.


---- qrsh command and output ----
lemaitre /gridware/sge/bin/lx24-amd64 # qrsh -verbose -l mem_free=10M -l num_proc=2 -q all.q@lmexec-92 date
local configuration lemaitre.cism.ucl.ac.be not defined - using global configuration
your job 1788 ("date") has been submitted
waiting for interactive job to be scheduled ...
Your interactive job 1788 has been successfully scheduled.
Establishing /gridware/sge/utilbin/lx24-amd64/rsh session to host lmexec-92 ...
poll: protocol failure in circuit setup
/gridware/sge/utilbin/lx24-amd64/rsh exited with exit code 1
reading exit code from shepherd ... 129

Jean-Paul Minet
Gestionnaire CISM - Institut de Calcul Intensif et de Stockage de Masse
Université Catholique de Louvain
Tel: (32) (0) - Fax: (32) (0)
