[GE users] What's the consequence if I removed these lines from sge_conf

igardais igardais at yahoo.fr
Wed Jan 6 06:53:57 GMT 2010

Thanks for this Reuti.

What about rsh interception when using the "builtin" commands?
All my MPI scripts specify "--rsh=/usr/bin/ssh" to use the classic key-based, password-less login, but that gives little control over the job.
I'm considering rsh interception, but my first attempts (a few years back now) were unsuccessful.

Any hints?


From: reuti <reuti at staff.uni-marburg.de>
To: users at gridengine.sunsource.net
Sent: Wed 6 January 2010, 1:56:40
Subject: Re: [GE users] What's the consequence if I removed these lines from sge_conf

On 06.01.2010 at 01:40, kdoman wrote:

> What's the consequence of removing the lines below from sge conf? If I
> don't, we cannot submit any parallel jobs that request "-pe orte"
> greater than 4.
> qrsh_command                /usr/bin/ssh
> rsh_command                  /usr/bin/ssh
> rlogin_command              /usr/bin/ssh

The definition of each *_command must match that of the corresponding
*_daemon. Together they define which mechanism is used to start
interactive jobs or slave tasks. You can have:

Classic rsh startup (e.g. for x86):

qlogin_command              /usr/bin/telnet
qlogin_daemon                /usr/sbin/in.telnetd
rlogin_command              /usr/sge/utilbin/lx24-x86/rlogin
rlogin_daemon                /usr/sbin/in.rlogind
rsh_command                  /usr/sge/utilbin/lx24-x86/rsh
rsh_daemon                  /usr/sge/utilbin/lx24-x86/rshd -l

All builtin:

qlogin_command              builtin
qlogin_daemon                builtin
rlogin_command              builtin
rlogin_daemon                builtin
rsh_command                  builtin
rsh_daemon                  builtin

or ssh according to:


The two entries within each of the three pairs qlogin_*, rlogin_* and
rsh_* must be consistent, but each pair can of course use a different
mechanism from the other pairs.
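
As a rough illustration of the consistency rule above, here is a minimal Python sketch that scans configuration text (for example, saved output of qconf -sconf) and reports pairs where only one side is set to builtin. The function names parse_conf and mismatched_pairs are made up for this example, and "consistent" is read narrowly as "builtin on both sides or on neither", which is a simplification; it does not verify that an external command really matches its daemon.

```python
def parse_conf(text):
    """Parse 'key value' lines into a dict; blank and '#' lines are skipped."""
    conf = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and not parts[0].startswith("#"):
            conf[parts[0]] = parts[1].strip()
    return conf


def mismatched_pairs(conf):
    """Return the pair names where exactly one of command/daemon is 'builtin'.

    Assumption: this is only a necessary condition for a matching pair,
    not a full validation.
    """
    bad = []
    for name in ("qlogin", "rlogin", "rsh"):
        cmd = conf.get(name + "_command")
        daemon = conf.get(name + "_daemon")
        if cmd is None or daemon is None:
            continue  # pair not fully defined in this text; nothing to compare
        if (cmd == "builtin") != (daemon == "builtin"):
            bad.append(name)
    return bad


# Hypothetical sample: ssh as the rsh command, but the daemon left as builtin.
sample = """
rlogin_command   builtin
rlogin_daemon    builtin
rsh_command      /usr/bin/ssh
rsh_daemon       builtin
"""

print(mismatched_pairs(parse_conf(sample)))  # ['rsh']
```

An empty list only means that no builtin/external mix was found in the pairs that are present.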

Also note that these entries can be overridden at the exechost
level, i.e. in a host's local configuration: qconf -mconf <exechost>
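
To make the effect of a local override concrete, here is a small Python sketch that layers a host's local configuration text over the global one and returns the settings that would apply on that host. It assumes, as a simplification, that a local entry simply replaces the global entry with the same name; the helper names parse_conf and effective_conf and both sample texts are hypothetical. The input texts could come from qconf -sconf (global) and qconf -sconf <exechost> (local).

```python
def parse_conf(text):
    """Parse 'key value' lines into a dict; blank and '#' lines are skipped."""
    conf = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and not parts[0].startswith("#"):
            conf[parts[0]] = parts[1].strip()
    return conf


def effective_conf(global_text, local_text):
    """Global settings with the host's local entries layered on top.

    Assumption: local entries win key by key over global ones.
    """
    conf = parse_conf(global_text)
    conf.update(parse_conf(local_text))
    return conf


# Hypothetical global configuration: everything builtin.
global_text = """
rsh_command   builtin
rsh_daemon    builtin
"""

# Hypothetical local configuration of one exechost: only the command is changed.
local_text = """
rsh_command   /usr/bin/ssh
"""

conf = effective_conf(global_text, local_text)
print(conf["rsh_command"], conf["rsh_daemon"])  # /usr/bin/ssh builtin
```

In this made-up case the host ends up with an ssh command paired with a builtin daemon, which is the kind of mismatch worth looking for when only some nodes fail.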

-- Reuti

> Without the above modification, any job submission with -pe orte
> greater than 4 would receive this error:
> error: error: ending connection before all data received
> error:
> error reading job context from "qlogin_starter"
> --------------------------------------------------------------------------
> A daemon (pid 2160) died unexpectedly with status 1 while attempting
> to launch so we are aborting.
> There may be more information reported by the environment (see above).
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
> mpirun: clean termination accomplished
> Thanks.
> K.
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?
> dsForumId=38&dsMessageId=236695
> To unsubscribe from this discussion, e-mail: [users-
> unsubscribe at gridengine.sunsource.net<mailto:unsubscribe at gridengine.sunsource.net>].

