[GE users] trying tight ssh integration

rayson rayrayson at gmail.com
Wed Nov 19 02:03:31 GMT 2008


Great to hear that it works for you!!

I downloaded openssh-4.3p1.tar.gz a few days ago but I didn't have
time to look into the issue :-(

Also, 5.1 came out, and I have not tried it with the tight integration
either. BTW, can you provide more detail about the difference
between sgessh_do_setusercontext() and do_setusercontext()?

Thanks,
Rayson



On 11/18/08, Gerald Ragghianti <geri at utk.edu> wrote:
> I have solved the problem that I was having with tight ssh integration.
> The issue was that the function sgessh_do_setusercontext() is not a
> drop-in replacement for do_setusercontext() in openssh version 4.3p1
> (Redhat EL4.6).  The apparent differences between these two functions
> caused sshd to fail somewhere around where it tries to execute
> qrsh_starter.
>
> I was able to solve the problem by using openssh-3.9p1.  Everything works
> fine now.  For future reference, here is my procedure for building sshd:
> =====================================
> #!/bin/sh
> rm -rf gridengine
> tar -zxf ge-V61u5_TAG-src.tar.gz
> cp -f aimk.site gridengine/source/
> cp -f aimk gridengine/source/
>
> rm -rf openssh-3.9p1 gridengine/source/3rdparty/openssh
> tar -zxf openssh-3.9p1.tar.gz
> mv openssh-3.9p1 gridengine/source/3rdparty/openssh
> cp sshd.c.3.9p1 gridengine/source/3rdparty/openssh/sshd.c
>
> cd gridengine/source
> ./aimk -no-java -no-secure -spool-classic -no-jni -only-depend && \
> scripts/zerodepend && \
> ./aimk -no-java -no-secure -spool-classic -no-jni depend && \
> ./aimk -no-java -no-secure -spool-classic -no-jni && \
> ./aimk -no-java -no-secure -spool-classic -no-jni -tight-ssh
> =====================================
> Here is the diff that I used to patch sshd.c:
> 105a106,111
>  > #define SGESSH_INTEGRATION
>  > #ifdef SGESSH_INTEGRATION
>  > extern int sgessh_readconfig(void);
>  > extern int sgessh_do_setusercontext(struct passwd *);
>  > #endif
>  >
> 675c681,686
> <       do_setusercontext(authctxt->pw);
> ---
>  >       /* do_setusercontext(authctxt->pw); */
>  >  #ifdef SGESSH_INTEGRATION
>  >    sgessh_do_setusercontext(authctxt->pw);
>  >  #else
>  >    do_setusercontext(authctxt->pw);
>  >  #endif
> 899a911,914
>  >  #ifdef SGESSH_INTEGRATION
>  >    sgessh_readconfig();
>  >  #endif
>  >
>
> Once you have an sshd binary, you simply need to copy it somewhere on the
> execd machines and reference it as
>
> rsh_daemon                   /opt/n1ge/utilbin/lx24_amd64/sshd -i
>
> - Gerald
>
> rayson wrote:
> > There are only 2 things added in the tight sshd. One is for reading of
> > the environment (the job file), and the other is for switching of the
> > user account from root to the actual user using the SGE way.
> >
> > Can you add a few debug fprintf()s around sgessh_readconfig() and
> > sgessh_do_setusercontext() and record which one is causing the
> > failure? You may need to log to a file as you may not have
> > stdout/stderr access. If you don't know C, I may be able to write some
> > code for you when I have time later this month...
> >
> > There is also a presentation on the tight integration:
> > http://gridengine.sunsource.net/download/workshop10-12_09_07/SGE-WS2007-openSSHTightIntegration_RonChen.pdf
> >
> > Rayson
> >
> >
> >
> > On 11/15/08, Gerald Ragghianti <geri at utk.edu> wrote:
> >
> >> Yes, that's a good point that I left off.  We have been using "loose"
> >> ssh integration for a while now.  I can easily switch between using the
> >> stock distribution sshd (which works) and the sgesshd (which doesn't).
> >>
> >> - Gerald
> >>
> >>
> >>> Before trying the tight-integration sshd, please make sure that
> >>> non-tight integration
> >>> (http://gridengine.sunsource.net/howto/qrsh_qlogin_ssh.html) works.
> >>>
> >>> Rayson
> >>>
> >>>
> >>>
> >>> On 11/15/08, Gerald Ragghianti <geri at utk.edu> wrote:
> >>>
> >>>
> >>>> I am trying to get tight ssh integration working on my 6.1u5 system
> >>>> using openssh-4.3p1.  After successfully compiling with "aimk -no-java
> >>>> -no-secure -spool-classic -no-jni" I then compiled openssh with "aimk
> >>>> -no-java -no-secure -spool-classic -no-jni -tight-ssh".  This resulted
> >>>> in an sshd binary that I moved to $SGE_ROOT/utilbin/lx24-amd64/sshd.  I
> >>>> then updated rsh_daemon to point to this binary.  When I execute "qrsh
> >>>> -verbose id", the command returns:
> >>>>
> >>>> Your job 946 ("id") has been submitted
> >>>> waiting for interactive job to be scheduled ...
> >>>> Your interactive job 946 has been successfully scheduled.
> >>>> Establishing /usr/bin/ssh -X  session to host sun15.local ...
> >>>> /usr/bin/ssh -X  exited with exit code 254
> >>>> reading exit code from shepherd ... 129
> >>>>
> >>>> Log files:
> >>>>
> >>>> qmaster: job 946.1 failed on host sun15.local assumedly after job
> >>>> because: job 946.1 died through signal HUP (1)
> >>>>
> >>>> On the exec host: reaping job "946" ptf complains: Job does not exist
> >>>>
> >>>> When I change rsh_command to "/usr/bin/ssh -vX" I get the following from
> >>>> qrsh:
> >>>> ...
> >>>> debug1: Offering public key: /home/user/.ssh/id_rsa
> >>>> debug1: Server accepts key: pkalg ssh-rsa blen 149
> >>>> debug1: read PEM private key done: type RSA
> >>>> debug1: Authentication succeeded (publickey).
> >>>> debug1: channel 0: new [client-session]
> >>>> debug1: Entering interactive session.
> >>>> debug1: Requesting X11 forwarding with authentication spoofing.
> >>>> debug1: Sending command: exec '/opt/sge/utilbin/lx24-amd64/qrsh_starter'
> >>>> '/opt/sge/default/spool/sun15/active_jobs/946.1'
> >>>> debug1: client_input_channel_req: channel 0 rtype exit-status reply 0
> >>>> debug1: channel 0: free: client-session, nchannels 1
> >>>> debug1: Transferred: stdin 0, stdout 0, stderr 0 bytes in 0.1 seconds
> >>>> debug1: Bytes per second: stdin 0.0, stdout 0.0, stderr 0.0
> >>>> debug1: Exit status 254
> >>>> /usr/bin/ssh -vX  exited with exit code 254
> >>>> reading exit code from shepherd ... 129
> >>>>
> >>>> This seems to indicate that the ssh authentication succeeds, but that
> >>>> the qrsh_starter fails to execute.  I have an strace of the execd that
> >>>> shows sshd being executed and subsequently rummaging around the
> >>>> $SGE_ROOT and correctly setting the groupid before exiting.
> >>>>
> >>>> Any ideas?
> >>>>
> >>>> --
> >>>> Gerald Ragghianti
> >>>> IT Administrator - High Performance Computing
> >>>> http://hpc.usg.utk.edu/
> >>>> Office of Information Technology
> >>>> University of Tennessee
> >>>> Phone: 865-974-2448
> >>>> E-mail: geri at utk.edu
> >>>>
> >>>> ------------------------------------------------------
> >>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=88823
> >>>>
> >>>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
> >>>>
> >>>>
> >>>>
> >
>
>
>
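
For anyone following the suggestion earlier in the thread to add debug fprintf()s around sgessh_readconfig() and sgessh_do_setusercontext(), here is a minimal sketch of a file logger in C. The helper name sgessh_debug_log() and the log path in the comment are hypothetical and are not part of Grid Engine or OpenSSH. It writes to a file because, as noted above, stdout/stderr may not be available when sshd is started by execd.

```c
#include <stdarg.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical debug helper (not part of SGE or OpenSSH).
 * Appends one timestamped line to `path`, including the pid and the
 * real/effective uid, so the log shows whether the switch away from
 * root has already happened. Returns 0 on success, -1 on failure.
 *
 * Possible call sites in a patched sshd.c (path is an assumption):
 *   sgessh_debug_log("/tmp/sgessh_debug.log", "before sgessh_do_setusercontext");
 *   sgessh_do_setusercontext(authctxt->pw);
 *   sgessh_debug_log("/tmp/sgessh_debug.log", "after sgessh_do_setusercontext");
 */
int sgessh_debug_log(const char *path, const char *fmt, ...)
{
    va_list ap;
    FILE *fp = fopen(path, "a");   /* append, so earlier entries are kept */

    if (fp == NULL)
        return -1;

    fprintf(fp, "[%ld] pid=%ld uid=%ld euid=%ld: ",
            (long)time(NULL), (long)getpid(),
            (long)getuid(), (long)geteuid());

    va_start(ap, fmt);
    vfprintf(fp, fmt, ap);
    va_end(ap);

    fputc('\n', fp);

    /* Open and close on every call so the line reaches disk even if
     * sshd exits right after the call being traced. */
    return fclose(fp) == 0 ? 0 : -1;
}
```

If the "before" line shows up in the log and the "after" line does not, the call in between is the likely culprit. Keep in mind that after the user switch the process may no longer be able to write to a root-owned log file, so a world-writable location is the safer choice for a short debugging session.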

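A note on the exit codes in the original report: the shepherd exit code 129 lines up with the qmaster message "died through signal HUP (1)" if you assume the common 128 + signal number convention. The sketch below does that arithmetic in C. The function names and the upper bound of 64 for a plausible signal number are assumptions made for illustration, and nothing here is taken from the Grid Engine source.

```c
#include <stdio.h>

/* Assumption: signal numbers above 64 are not plausible on Linux. */
#define MAX_PLAUSIBLE_SIGNAL 64

/*
 * Returns the signal number that an exit status would encode under the
 * 128 + signal convention, or 0 if the status is outside that range.
 */
int signal_from_exit_status(int status)
{
    int sig = status - 128;

    if (sig >= 1 && sig <= MAX_PLAUSIBLE_SIGNAL)
        return sig;
    return 0;
}

/* Writes a short human-readable interpretation of `status` into `buf`. */
int describe_exit_status(int status, char *buf, size_t len)
{
    int sig = signal_from_exit_status(status);

    if (sig > 0)
        return snprintf(buf, len,
                        "exit status %d: consistent with death by signal %d",
                        status, sig);
    return snprintf(buf, len,
                    "exit status %d: not in the 128 + signal range", status);
}
```

With these definitions, 129 maps to signal 1, which matches the HUP reported in the qmaster log, while the 254 returned by ssh falls outside the range and should be read as an ordinary exit code rather than a signal.
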
------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=89031

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
