[GE users] Running SGE as user "sge" on AIX

Reuti reuti at staff.uni-marburg.de
Thu Aug 30 14:40:47 BST 2007



On 30.08.2007 at 15:06, Erik Lönroth wrote:

> Reuti!
>
> Running SGE as user "sge" is working perfectly for us, and all our users
> can submit jobs under their own UIDs.
>
> sge_execd is running as "sge" on all our Linux exec hosts. The daemons
> are started by "root" and probably call setuid after initial execution,
> as you see below.
>
> ps -ef | grep sge
> sge      29874     1  0 14:48 ? 00:00:00 sge_execd

What does

ps -eo user,ruser,command

show on Linux? A process has both an effective and a real user.
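
To illustrate the real/effective distinction, here is a minimal Python sketch (not part of SGE) that reports both IDs of the process running it. The function name `describe_ids` is made up for this example. A daemon started by root that has only switched its effective user would report two different names, which is what the `ruser` column in the ps call above is meant to reveal.

```python
import os
import pwd


def describe_ids():
    """Return the real and effective user of the current process."""
    ruid = os.getuid()    # real UID: the user who started the process
    euid = os.geteuid()   # effective UID: the one used for permission checks

    def name(uid):
        # Fall back to the numeric UID if there is no passwd entry.
        try:
            return pwd.getpwuid(uid).pw_name
        except KeyError:
            return str(uid)

    return {"real": name(ruid), "effective": name(euid), "differs": ruid != euid}


if __name__ == "__main__":
    info = describe_ids()
    print(f"real={info['real']} effective={info['effective']} differs={info['differs']}")
```

For a process started from an ordinary login shell, both values are normally the same.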

-- Reuti

> However, on AIX (started in exactly the same way):
> ps -ef | grep sge
> root 561178      1   0 15:00:11      -  0:00 ./sge_execd
>
> The problem is that the "spool" directory is on NFS (root squashed),
> and therefore root will have problems writing to that directory.
>
> Trying to start sge_execd on AIX as "sge" won't work, since the low
> ports 536 and 537 are restricted for users like sge.
>
> This is how it turns out:
>
> sge at ts002.../common >./sgeexecd.test
> + [ 0 -gt 2 -o  = -h -o  = help ]
> + startup=true
> + execd=true
> + qstd=false
> + softstop=false
> + + GetPathToBinaries
> bin_dir=/opt/gridengine/bin/aix53
> + [ /opt/gridengine/bin/aix53 = none ]
> + + GetPathToUtilbin
> utilbin_dir=/opt/gridengine/utilbin/aix53
> + [ /opt/gridengine/utilbin/aix53 = none ]
> + + /opt/gridengine/utilbin/aix53/gethostname -aname
> HOST=ts002.sss.se.scania.com
> + + cut -f1 -d.
> + /opt/gridengine/utilbin/aix53/gethostname -name
> UQHOST=ts002
> + [ true = true ]
> + echo    starting sge_execd
>    starting sge_execd
> + /opt/gridengine/bin/aix53/sge_execd
> error: communication error for "ts002.sss.se.scania.com/execd/1" running on port 537: "can't bind socket"
> error: commlib error: can't bind socket (no additional information available)
> ..........................
> critical error: abort execd startup due to communication errors
> + [ 1 -eq 0 -a -d /var/lock/subsys ]
> + exit 0
>
> /Erik
>
>
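
The bind failure in the trace above can be checked outside SGE. The following is a minimal Python sketch, assuming the ports are TCP and that testing on the loopback address is representative. The helper name `can_bind` is hypothetical. It tries to bind a socket as the current user and reports why it failed, so running it once as "sge" and once as root on the AIX host would show whether port restrictions are the cause.

```python
import errno
import socket


def can_bind(port, host="127.0.0.1"):
    """Try to bind a TCP socket to (host, port); return (ok, reason)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except PermissionError:
            # EACCES: typically an unprivileged user on a port below 1024.
            return False, "permission denied (privileged port?)"
        except OSError as exc:
            # Anything else, e.g. EADDRINUSE if a daemon already holds the port.
            return False, errno.errorcode.get(exc.errno, str(exc))
        return True, "ok"


if __name__ == "__main__":
    # 536 and 537 are the ports mentioned in this thread.
    for port in (536, 537):
        ok, reason = can_bind(port)
        print(f"port {port}: {'bindable' if ok else 'not bindable'} ({reason})")
```

A "permission denied" result for user "sge" together with success for root would be consistent with the "can't bind socket" error reported by sge_execd.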
> On Thu, 2007-08-30 at 14:45 +0200, Reuti wrote:
>> On 30.08.2007 at 14:17, Erik Lönroth wrote:
>>
>>> I'm trying to work out how to get SGE (execd) running as user "sge" on
>>> AIX53.
>>>
>>> The process starts OK as user root, but I can't get it to start as
>>> user "sge". This is most likely because the sge user is prevented from
>>> binding to sockets/ports lower than 1024. We use 537, and this causes
>>> problems when the process can't bind.
>>>
>>> The solution is probably to allow the sge user to open the necessary
>>> ports, but I don't know how, not being an AIX techie.
>>
>> If it's not running as root, only the user "sge" will be able to
>> use it. Most likely this is not what you want. - Reuti
>>
>>> Any help on this would be much appreciated.
>>>
>>> /Erik
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>