[GE users] rcmd: socket permission denied using openmpi

Lydia Heck lydia.heck at durham.ac.uk
Sun Jul 29 14:58:32 BST 2007


I am using openmpi with tight integration.

The permissions on qrsh, rsh etc. on my sparc systems are set exactly as on
my Solaris x86 systems. On the Solaris x86 systems openmpi and sge work
fine over rsh. On the Sparc systems running Solaris 10 they fail.

The permissions in both cases are listed below, and in both cases they have
not been modified since the installation.

The shared partition is mounted with suid enabled.
On Solaris x86/64:

oberon# cd sol-amd64/
oberon# ls -alrt
total 22318
drwxr-xr-x   3 root     other        512 Jan 11  2006 ..
-rwxr-xr-x   1 root     root       22920 May  3 08:56 uidgid
-rwsr-xr-x   1 root     root       23848 May  3 08:56 testsuidroot
-rwxr-xr-x   1 root     root      385416 May  3 08:56 spooledit
-rwxr-xr-x   1 root     root      339640 May  3 08:56 spooldefaults
-rwxr-xr-x   1 root     root     2207136 May  3 08:56 sge_share_mon
-rwxr-xr-x   1 root     root      293536 May  3 08:56 qrsh_starter
-rwxr-xr-x   1 root     root       21296 May  3 08:56 now
-rwxr-xr-x   1 root     root      211592 May  3 08:56 loadcheck
-rwxr-xr-x   1 root     root      206704 May  3 08:56 infotext
-rwxr-xr-x   1 root     root      188624 May  3 08:56 getservbyname
-rwxr-xr-x   1 root     root      864120 May  3 08:56 gethostname
-rwxr-xr-x   1 root     root      863592 May  3 08:56 gethostbyname
-rwxr-xr-x   1 root     root      863744 May  3 08:56 gethostbyaddr
-rwxr-xr-x   1 root     root      187624 May  3 08:56 filestat
-rwxr-xr-x   1 root     root      187008 May  3 08:56 checkuser
-rwxr-xr-x   1 root     root      312448 May  3 08:56 checkprog
-rwsr-xr-x   1 root     root      185584 May  3 08:56 authuser
-rwxr-xr-x   1 root     root      188056 May  3 08:56 adminrun
-rwxr-xr-x   1 root     root     2199488 May  3 08:56 spoolinit
-rwxr-xr-x   1 root     root      348224 May  3 08:56 rshd
-rwsr-xr-x   1 root     root       18624 May  3 08:56 rsh
-rwsr-xr-x   1 root     root       32168 May  3 08:56 rlogin
-rwxr-xr-x   1 root     root      598840 May  3 08:56 openssl
-rwxr-xr-x   1 root     root       25248 May  3 08:56 fstype
-rwxr-xr-x   1 root     root       16352 May  3 08:56 db_verify
-rwxr-xr-x   1 root     root       13992 May  3 08:56 db_upgrade
-rwxr-xr-x   1 root     root       19992 May  3 08:56 db_stat
-rwxr-xr-x   1 root     root       14992 May  3 08:56 db_recover
-rwxr-xr-x   1 root     root      107176 May  3 08:56 db_printlog
-rwxr-xr-x   1 root     root       38632 May  3 08:56 db_load
-rwxr-xr-x   1 root     root       21688 May  3 08:56 db_dump
-rwxr-xr-x   1 root     root       17496 May  3 08:56 db_deadlock
-rwxr-xr-x   1 root     root       17096 May  3 08:56 db_checkpoint
-rwxr-xr-x   1 root     root       13272 May  3 08:56 db_archive
-rwxr-xr-x   1 root     root      112520 May  3 08:56 berkeley_db_svc
drwxr-xr-x   2 root     root        1024 May  3 08:56 .


On Sparc:

m1033# ls -alrt
total 20784
-rwxr-xr-x   1 root     root       23032 May  3 08:56 uidgid
-rwxr-xr-x   1 root     root       20864 May  3 08:56 now
-rwxr-xr-x   1 root     root      187800 May  3 08:56 loadcheck
-rwxr-xr-x   1 root     root      165408 May  3 08:56 getservbyname
-rwxr-xr-x   1 root     root      813864 May  3 08:56 gethostname
-rwxr-xr-x   1 root     root      813072 May  3 08:56 gethostbyname
-rwxr-xr-x   1 root     root      813072 May  3 08:56 gethostbyaddr
-rwxr-xr-x   1 root     root      164488 May  3 08:56 filestat
-rwxr-xr-x   1 root     root      282472 May  3 08:56 checkprog
-rwsr-xr-x   1 root     root       23928 May  3 08:56 testsuidroot
-rwxr-xr-x   1 root     root     2044008 May  3 08:56 spoolinit
-rwxr-xr-x   1 root     root      358736 May  3 08:56 spooledit
-rwxr-xr-x   1 root     root      323624 May  3 08:56 spooldefaults
-rwxr-xr-x   1 root     root     2050624 May  3 08:56 sge_share_mon
-rwxr-xr-x   1 root     root      311000 May  3 08:56 rshd
-rwsr-xr-x   1 root     root       19000 May  3 08:56 rsh
-rwsr-xr-x   1 root     root       30824 May  3 08:56 rlogin
-rwxr-xr-x   1 root     root      263976 May  3 08:56 qrsh_starter
-rwxr-xr-x   1 root     root      658720 May  3 08:56 openssl
-rwxr-xr-x   1 root     root      186112 May  3 08:56 infotext
-rwxr-xr-x   1 root     root       25136 May  3 08:56 fstype
-rwxr-xr-x   1 root     root       14544 May  3 08:56 db_verify
-rwxr-xr-x   1 root     root       12896 May  3 08:56 db_upgrade
-rwxr-xr-x   1 root     root       18056 May  3 08:56 db_stat
-rwxr-xr-x   1 root     root       13776 May  3 08:56 db_recover
-rwxr-xr-x   1 root     root       93064 May  3 08:56 db_printlog
-rwxr-xr-x   1 root     root       33096 May  3 08:56 db_load
-rwxr-xr-x   1 root     root       18904 May  3 08:56 db_dump
-rwxr-xr-x   1 root     root       15640 May  3 08:56 db_deadlock
-rwxr-xr-x   1 root     root       15296 May  3 08:56 db_checkpoint
-rwxr-xr-x   1 root     root       12240 May  3 08:56 db_archive
-rwxr-xr-x   1 root     root      163624 May  3 08:56 checkuser
-rwxr-xr-x   1 root     root       90200 May  3 08:56 berkeley_db_svc
-rwsr-xr-x   1 root     root      162432 May  3 08:56 authuser
-rwxr-xr-x   1 root     root      164512 May  3 08:56 adminrun
drwxr-xr-x   2 root     root        1024 May  3 08:56 .
drwxr-xr-x   3 root     root         512 Jul 18 11:56 ..
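
In both listings above the setuid entries are the same four: testsuidroot,
authuser, rsh and rlogin. Rather than comparing by eye, the check can be
scripted. This is a minimal sketch, assuming a POSIX find; the function name
list_setuid and the temporary file names are made up for illustration.

```shell
# Print the names of all regular files under directory $1 that have the
# setuid bit set, one per line, sorted. Hypothetical helper.
list_setuid() {
    # -perm -4000 matches any file whose mode includes the setuid bit.
    find "$1" -type f -perm -4000 -exec basename {} \; | sort
}

# Example comparison (assumes the x86 directory is reachable as sol-amd64
# from the current directory, as in the first listing):
#   list_setuid sol-amd64 > /tmp/setuid-x86.txt
#   list_setuid /opt/SUNWsge/utilbin/sol-sparc64 > /tmp/setuid-sparc.txt
#   diff /tmp/setuid-x86.txt /tmp/setuid-sparc.txt
```

An empty diff only confirms that the same binaries carry the setuid bit on
both architectures; it says nothing about ownership or mount options.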


On Fri, 27 Jul 2007, Rayson Ho wrote:

> If you are using loose integration, then you should make sure that the
> system's rsh is setuid root.
>
> If you are using tight integration, then you just need to make sure
> that you follow "Permissions for qrsh are not set properly":
> http://gridengine.sunsource.net/howto/commonproblems.html#interactive
>
> Rayson
>
>
> On 7/27/07, Lydia Heck <lydia.heck at durham.ac.uk> wrote:
> >
> > do you mean the  /opt/SUNWsge/utilbin/sol-sparc64/rsh
> > or the system rsh?
> >
> > Lydia
> >
> > On Fri, 27 Jul 2007, Aaron Knister wrote:
> >
> > > Make sure rsh is setuid root. Not sure how to do that on Solaris, but
> > > on Linux it's "chmod u+s /path/to/rsh".
> > >
> > > -Aaron
> > >
> > > On Jul 27, 2007, at 1:29 PM, Lydia Heck wrote:
> > >
> > > >
> > > > Using openmpi (currently I am still using rsh rather than ssh),
> > > > I can run an openmpi job on Solaris 10 sparc interactively without a
> > > > problem, but as soon as I submit it using sge it fails with
> > > >
> > > >  rcmd: socket permission
> > > >
> > > > rsh and rlogin have the permission
> > > >
> > > > rws--x--x in /opt/SUNWsge/utilbin/sol-sparc64
> > > > I have set
> > > > MPI_MCA_pls_rsh_agent=/usr/bin/rsh
> > > >
> > > > and this setup works without problem on Solaris10 x86, same version
> > > > of sge (6.1)
> > > > and same version of openmpi.
> > > >
> > > > Any suggestions?
> > > >
> > > > Lydia
> > > >
> > > > ------------------------------------------
> > > > Dr E L  Heck
> > > >
> > > > University of Durham
> > > > Institute for Computational Cosmology
> > > > Ogden Centre
> > > > Department of Physics
> > > > South Road
> > > >
> > > > DURHAM, DH1 3LE
> > > > United Kingdom
> > > >
> > > > e-mail: lydia.heck at durham.ac.uk
> > > >
> > > > Tel.: + 44 191 - 334 3628
> > > > Fax.: + 44 191 - 334 3645
> > > > ___________________________________________
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > > > For additional commands, e-mail: users-help at gridengine.sunsource.net
> > > >
> > >
> > > Aaron Knister
> > > Systems Administrator/
> > > Center for Research on Environment and Water
> > >
> > > (301) 595-7001
> > > aaron at iges.org
> > >
> > >
> > >
> > >

------------------------------------------
Dr E L  Heck

University of Durham
Institute for Computational Cosmology
Ogden Centre
Department of Physics
South Road

DURHAM, DH1 3LE
United Kingdom

e-mail: lydia.heck at durham.ac.uk

Tel.: + 44 191 - 334 3628
Fax.: + 44 191 - 334 3645
___________________________________________
