[GE users] qlogin wrapper using ssh in qstat dying

Patrice Seyed apseyed at bu.edu
Tue Nov 30 17:23:50 GMT 2004


Hello,

I have a qlogin_wrapper script now that looks like the following:
#!/bin/sh
exec /usr/bin/ssh -o ConnectionAttempts=8 $1

My corresponding configuration on SGE (5.3p5; note that the spool directory
is local to each node) is:

qlogin_command            /opt/gridengine/bin/glinux/qlogin_wrapper
qlogin_daemon             /usr/sbin/sshd
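
For comparison, here is a minimal sketch of a wrapper variant that forwards
a port as well as the host. It assumes that SGE invokes qlogin_command with
the target host as the first argument and a port number as the second; that
is an assumption to check against sge_conf(5), not something confirmed by
the output in this post. The function name build_ssh_cmd is hypothetical and
only prints the command line, so it can be inspected without opening a
connection.

```shell
#!/bin/sh
# Hypothetical variant of qlogin_wrapper (a sketch, not the script in use).
# Assumption: $1 is the target host and $2, if present, is the port
# that the interactive job's daemon listens on.

# Print the ssh command line that would be run for the given arguments.
build_ssh_cmd() {
    host=$1
    port=$2
    if [ -n "$port" ]; then
        # Forward the port with ssh's -p option when one was supplied.
        printf '%s\n' "/usr/bin/ssh -o ConnectionAttempts=8 -p $port $host"
    else
        # No port supplied: same command as the original wrapper.
        printf '%s\n' "/usr/bin/ssh -o ConnectionAttempts=8 $host"
    fi
}

# In a real wrapper the final line would be something like:
#   exec $(build_ssh_cmd "$1" "$2")
```

Running build_ssh_cmd with the arguments the wrapper actually receives
shows whether a second argument is being dropped by the one-argument
version above.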

When I run "qlogin" here is what I see as the user:

$ qlogin
waiting for interactive job to be scheduled ...
Your interactive job 283950 has been successfully scheduled.
Establishing /opt/gridengine/bin/glinux/qlogin_wrapper session to host
compute-3-11.local ...

I am then dropped into the prompt. "qstat" shows the job, first in state
"t" for transferring, and then "r" for running. Everything looks fine until,
after about 20-30 seconds, the QLOGIN job is no longer listed in the qstat
output, even though the session is still active.

The only error message logged when the qlogin job disappears from qstat is:
Tue Nov 30 11:38:59 2004|qmaster|<master node>|W|job 283948.1 failed on host
compute-9-5.local  assumedly after job because: job 283948.1 died through
signal KILL (9)

Any ideas on why this is occurring? And any known solutions for this
problem?

Cheers,

Patrice Seyed
Linux System Administrator - LinGA
RHCE, SCSA
Boston University Medical Campus


