[GE users] qlogin and sshd errors (and JOB_ID qlogin environment)

Andreas Haas Andreas.Haas at Sun.COM
Thu Nov 3 16:29:40 GMT 2005


I realized that using $SGE_JOB_SPOOL_DIR/environment does not help
you, since $HOME/.ssh/environment is probably required to make

   sshd -o 'AcceptEnv

work. Sorry for the confusion.
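
For reference, a minimal sketch of filling that file more selectively than
a plain "env >" dump. sshd reads $HOME/.ssh/environment only when
PermitUserEnvironment is enabled, and expects one NAME=value entry per line.
The function name write_sge_env and the choice of variables to keep are
illustrative assumptions, not part of Grid Engine or of the wrapper quoted
below.

```shell
#!/bin/sh
# Hypothetical helper: write only the Grid Engine related variables of the
# current environment to the file named by $1, one NAME=value per line.
write_sge_env() {
    target=$1
    # Run in a subshell so the restrictive umask does not leak to the caller.
    (
        umask 077
        # Keep JOB_ID, JOB_NAME and everything starting with SGE_.
        # Continuation lines of multi-line values do not match and are dropped.
        env | grep -E '^(JOB_ID|JOB_NAME|SGE_[A-Za-z0-9_]*)=' > "$target"
    )
}
```

In the wrapper it would take the place of the plain env redirect, called as
write_sge_env "$HOME/.ssh/environment". The function returns non-zero when no
matching variable is set, which a caller can use to detect that it was not
started by the shepherd.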

Cheers,
Andreas


On Thu, 3 Nov 2005, Andreas Haas wrote:

> Federico,
>
> you could do essentially the same in a prolog procedure that runs
> as user sgeadmin. Note that in this case you would modify
>
>    $SGE_JOB_SPOOL_DIR/environment
>
> rather than $HOME/.ssh/environment. The Grid Engine environment file
> is read shortly before the job, or the sshd respectively, is started.
>
> See qsub(1) for documentation on SGE_JOB_SPOOL_DIR and sge_conf(5)
> or queue_conf(5) for documentation on prolog.
>
> Regards,
> Andreas
>
>
> On Wed, 2 Nov 2005, Sacerdoti, Federico wrote:
>
> > Actually, I found the problem: I am using qlogin over ssh, and the sshd
> > ignores all environment variables when it starts. Basically sge-shepherd
> > does set the correct environment, but the variables don't make it to the
> > final session.
> >
> > The workaround is to use an sshd wrapper. This strategy hijacks the
> > $HOME/.ssh/environment facility:
> >
> > ---
> > #!/bin/sh
> > # Author: D.E.Shaw R&D LLC, F.D.Sacerdoti 2005
> > #
> > # SSHD will erase the helpful env vars that sge puts in. This forces
> > # them to survive, but we usurp the $HOME/.ssh/environment file.
> > #
> > env > $HOME/.ssh/environment
> > echo "SGE_HOSTLIST=$SGE_O_HOME/$JOB_NAME.po$JOB_ID" >> $HOME/.ssh/environment
> >
> > /usr/sbin/sshd -i -b 512 -o 'AcceptEnv *' -o 'PermitUserEnvironment yes'
> >
> > rm -f $HOME/.ssh/environment
> > ---
> >
> > But it works. A better way would be to instruct sshd to simply absorb
> > the calling environment, but there does not seem to be a flag or option
> > for that.
> >
> > -Federico
> >
> > -----Original Message-----
> > From: Reuti [mailto:reuti at staff.uni-marburg.de]
> > Sent: Tuesday, November 01, 2005 4:47 PM
> > To: users at gridengine.sunsource.net
> > Subject: Re: [GE users] qlogin and sshd errors (and JOB_ID qlogin
> > environment)
> >
> >
> > Hi Federico,
> >
> > Am 01.11.2005 um 21:42 schrieb Sacerdoti, Federico:
> >
> > > Thanks Reuti,
> > >
> > > I found the problem for qlogin/qrsh. The per-host configuration of
> > > 'qlogin-daemon' was set to '/usr/sbin/in.telnetd', so no matter what
> > > the default (cluster-wide) value is, the /usr/sbin/in.telnetd daemon
> > > was started. Once this was fixed things went smoothly.
> > >
> > > I have another question. When I qlogin/qrsh via SSH, I do not get the
> > > JOB_ID environment variable. In fact none of the SGE_O_* variables are
> > > available. I have turned on
> > >
> > > -o 'SendEnv *' and
> > >
> > > -o 'AcceptEnv *'
> >
> > this would send the variables from the login machine to the selected
> > node for your interactive job. But on the login machine, too, only
> > the normal environment is set. You can try to set them by hand or as
> > Roland suggested for interactive qrsh:
> >
> > http://gridengine.sunsource.net/servlets/ReadMsg?list=users&msgNo=13465
> >
> > You can try to use /bin/sh or /bin/bash (the sourced files will be
> > different, although it's the same binary on Linux). Also the options
> > -i and -l (lowercase L) might be interesting. Sometimes you don't get
> > a prompt and just have to type the commands. For qlogin it may be
> > impossible for now.
> >
> > Cheers - Reuti
> >
> >
> > > but to no avail.
> > >
> > > Thanks,
> > > -Federico
> > >
> > > -----Original Message-----
> > > From: Reuti [mailto:reuti at staff.uni-marburg.de]
> > > Sent: Tuesday, October 25, 2005 5:03 PM
> > > To: users at gridengine.sunsource.net
> > > Subject: Re: [GE users] qlogin and sshd errors
> > >
> > >
> > > Correct, SGE is installed so that the daemons run as root, which is
> > > the recommended mode of operation. Is yours running under your user
> > > account? In that case you can submit only serial jobs, and qrsh
> > > for parallel jobs won't work either.
> > >
> > > You can check this e.g. with:
> > >
> > > $ ps -e f -o ruser,euser,rgroup,egroup,command
> > > ...
> > > root     sgeadmin root     gridware /usr/sge/bin/lx24-x86/sge_execd
> > > root     sgeadmin root     gridware  \_ sge_shepherd-374 -bg
> > >
> > >
> > > Cheers  - Reuti
> > >
> > >
> > > Am 25.10.2005 um 20:29 schrieb Sacerdoti, Federico:
> > >
> > >> Thanks, it is good to know it works for you.
> > >>
> > >> Do you run sge as root? I am seeing permissions problems with sshd...
> > >>
> > >> -fds
> > >>
> > >> -----Original Message-----
> > >> From: Reuti [mailto:reuti at staff.uni-marburg.de]
> > >> Sent: Monday, October 24, 2005 4:48 PM
> > >> To: users at gridengine.sunsource.net
> > >> Subject: Re: [GE users] qlogin and sshd errors
> > >>
> > >>
> > >> Hi Federico,
> > >>
> > >> Am 24.10.2005 um 21:39 schrieb Sacerdoti, Federico:
> > >>
> > >>
> > >>> Hi,
> > >>>
> > >>> I apologize if this has already been answered. I would like to use
> > >>> qlogin with ssh, and followed the instructions here
> > >>>
> > >>> http://gridengine.sunsource.net/howto/qrsh_qlogin_ssh.html
> > >>>
> > >>> While qlogin does schedule my job correctly, and the sshd gets
> > >>> started, I cannot connect to it. My qlogin-wrapper shows which port
> > >>> and host to connect to (I have restricted my sge pool to one host to
> > >>> make things easier).
> > >>>
> > >>> I get the following strange error when I try to connect to the port
> > >>> that SGE wants me to. Has anyone seen this?
> > >>>
> > >>> [fds at drdab000 .ssh]$ ssh -vvv drda1054 -p 35072
> > >>> OpenSSH_3.9p1, OpenSSL 0.9.7a Feb 19 2003
> > >>> debug2: ssh_connect: needpriv 0
> > >>> debug1: Connecting to drda1054 [10.255.4.60] port 35072.
> > >>> debug1: Connection established.
> > >>> debug1: identity file /u/fds/.ssh/identity type -1
> > >>> debug3: Not a RSA1 key file /u/fds/.ssh/id_rsa.
> > >>> debug2: key_type_from_name: unknown key type '-----BEGIN'
> > >>> debug3: key_read: missing keytype
> > >>> debug2: key_type_from_name: unknown key type 'Proc-Type:'
> > >>> debug3: key_read: missing keytype
> > >>> debug2: key_type_from_name: unknown key type 'DEK-Info:'
> > >>> debug3: key_read: missing keytype
> > >>> debug3: key_read: missing whitespace
> > >>> debug3: key_read: missing whitespace
> > >>> debug3: key_read: missing whitespace
> > >>> debug3: key_read: missing whitespace
> > >>> debug3: key_read: missing whitespace
> > >>> debug3: key_read: missing whitespace
> > >>> debug3: key_read: missing whitespace
> > >>> debug3: key_read: missing whitespace
> > >>> debug3: key_read: missing whitespace
> > >>> debug3: key_read: missing whitespace
> > >>> debug3: key_read: missing whitespace
> > >>> debug3: key_read: missing whitespace
> > >>> debug3: key_read: missing whitespace
> > >>> debug2: key_type_from_name: unknown key type '-----END'
> > >>> debug3: key_read: missing keytype
> > >>> debug1: identity file /u/fds/.ssh/id_rsa type 1
> > >>> debug1: identity file /u/fds/.ssh/id_dsa type -1
> > >>> ssh_exchange_identification: Connection closed by remote host
> > >>>
> > >>
> > >> for me it's working under 6.0u4 and SuSE 9.3. So it may be an issue
> > >> with your ssh setup. Did you create the keys with ssh-keygen and copy
> > >> the public one to authorized_keys? Can you try to delete the key
> > >> information and generate new ones?
> > >>
> > >> Only difference is the version: "OpenSSH_3.9p1, OpenSSL 0.9.7e 25 Oct
> > >> 2004"  for me. Maybe you are getting the "e" version from the
> > >> included libs in SGE: "ldd /usr/bin/ssh". You can try to change the
> > >> LD_LIBRARY_PATH (same OS on the nodes and your login machine?). -
> > >> Reuti
> > >>
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > >> For additional commands, e-mail: users-help at gridengine.sunsource.net
> > >>
> > >
> >
>




