[GE users] Running SGE as user "sge" on AIX

Erik Lönroth erik.lonroth at scania.com
Thu Aug 30 14:06:31 BST 2007



Reuti!

Running SGE as user "sge" works perfectly for us, and all our users
can submit jobs under their own UIDs.

sge_execd runs as "sge" on all our Linux exec hosts. The daemons are
started by "root" and presumably setuid to "sge" after initial
execution, as you can see below.

ps -ef | grep sge
sge      29874     1  0 14:48 ? 00:00:00 sge_execd


However, on AIX (started in exactly the same way):
ps -ef | grep sge
root 561178      1   0 15:00:11      -  0:00 ./sge_execd 
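
To compare hosts quickly, the owner of the running daemon can be pulled out of the ps output. A minimal sketch, assuming `ps -ef` prints the user in the first column as in the two listings above; `execd_owner` is a hypothetical helper name, not part of SGE:

```shell
# execd_owner: read `ps -ef` output on stdin and print the user
# (first column) of every line whose command mentions sge_execd.
# Lines belonging to the grep itself are skipped.
# Hypothetical helper; not part of Grid Engine.
execd_owner() {
    awk '/sge_execd/ && !/grep/ { print $1 }'
}

# Usage: ps -ef | execd_owner
```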

The problem is that the "spool" directory is on NFS (root squashed), so
root has problems writing to that directory.

Trying to start sge_execd on AIX as sge won't work, since the low ports
536 and 537 are restricted for users like sge.
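
The restriction follows the traditional Unix rule that only root may bind ports below 1024. A minimal sketch of that rule, assuming the conventional 1024 boundary applies on the host in question; `can_bind_port` is a hypothetical helper and takes the uid as an argument so it can be tried without switching users:

```shell
# can_bind_port UID PORT
# Succeeds (exit 0) if a process with the given numeric uid would be
# allowed to bind PORT under the traditional rule that ports below
# 1024 are reserved for root (uid 0).
# Hypothetical helper: it only evaluates the rule and opens no socket.
can_bind_port() {
    uid=$1
    port=$2
    [ "$uid" -eq 0 ] || [ "$port" -ge 1024 ]
}

# Usage: can_bind_port "$(id -u)" 537 || echo "port 537 needs root"
```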

This is how it turns out:

sge at ts002.../common >./sgeexecd.test  
+ [ 0 -gt 2 -o  = -h -o  = help ]
+ startup=true
+ execd=true
+ qstd=false
+ softstop=false
+ + GetPathToBinaries
bin_dir=/opt/gridengine/bin/aix53
+ [ /opt/gridengine/bin/aix53 = none ]
+ + GetPathToUtilbin
utilbin_dir=/opt/gridengine/utilbin/aix53
+ [ /opt/gridengine/utilbin/aix53 = none ]
+ + /opt/gridengine/utilbin/aix53/gethostname -aname
HOST=ts002.sss.se.scania.com
+ + cut -f1 -d.
+ /opt/gridengine/utilbin/aix53/gethostname -name
UQHOST=ts002
+ [ true = true ]
+ echo    starting sge_execd
   starting sge_execd
+ /opt/gridengine/bin/aix53/sge_execd
error: communication error for "ts002.sss.se.scania.com/execd/1" running
on port 537: "can't bind socket"
error: commlib error: can't bind socket (no additional information
available)
..........................
critical error: abort execd startup due to communication errors
+ [ 1 -eq 0 -a -d /var/lock/subsys ]
+ exit 0
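
Note that the script ends with `exit 0` even though sge_execd aborted, so the exit status alone does not reveal the failure. A minimal sketch of checking the captured output instead, assuming the failure message keeps the "critical error:" prefix seen in the trace; `startup_failed` is a hypothetical helper, not part of the SGE startup scripts:

```shell
# startup_failed: read the startup script's output on stdin and
# succeed (exit 0) if any line starts with "critical error:".
# Hypothetical helper; the message prefix is taken from the trace above.
startup_failed() {
    grep -q '^critical error:'
}

# Usage (script name as in the trace above):
#   ./sgeexecd.test 2>&1 | startup_failed && echo "execd did not start"
```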

/Erik


On Thu, 2007-08-30 at 14:45 +0200, Reuti wrote:
> On 30.08.2007 at 14:17, Erik Lönroth wrote:
> 
> > I'm trying to work out how to get SGE (execd) running as user "sge" on
> > AIX53.
> >
> > The process starts OK as user root, but I can't get it to start as user
> > "sge". This is most likely because the sge user is prevented from
> > binding to sockets/ports lower than 1024. We use 537, and this causes
> > problems when the process can't bind.
> >
> > The solution is -probably- to allow the sge user to open the necessary
> > ports, but I don't know how, not being an AIX techie.
> 
> If it's not running as root, only the user "sge" will be able to use
> it. Most likely this is not what you want. - Reuti
> 
> > Any help on this would be much appreciated.
> >
> > /Erik
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > For additional commands, e-mail: users-help at gridengine.sunsource.net
> 




