[GE users] user permission troubles when submitting with qsub

reuti reuti at staff.uni-marburg.de
Sat Dec 20 10:00:22 GMT 2008


Am 19.12.2008 um 23:29 schrieb Shane Dixon:

> I'm having trouble with an error message regarding permissions.  I have
> an sgeadmin user (uid 4439) that runs the sgemaster on the master

The sgeexecd startup script must be run by root. The daemon will then
switch its effective user to your sgeadmin, while the real user stays
root. I.e. you should see:

$ ps -e f -o user,ruser,command
USER     RUSER    COMMAND
...
sgeadmin root     /usr/sge/bin/lx24-x86/sge_qmaster
sgeadmin root     /usr/sge/bin/lx24-x86/sge_execd
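
A rough way to script the same check is sketched below. The function
name check_execd_ruser is made up for illustration, and it assumes ps
prints exactly the columns user, ruser, command in that order (without
the "f" forest option, so that the command path is the third field):

```shell
# check_execd_ruser (hypothetical helper): read "USER RUSER COMMAND"
# lines on stdin and report whether every sge_execd process has root
# as its real user.
check_execd_ruser() {
    awk '
        $3 ~ /sge_execd$/ {
            found = 1
            if ($2 != "root") bad = 1
        }
        END {
            if (!found) { print "sge_execd not running"; exit 2 }
            if (bad)    { print "sge_execd not started by root"; exit 1 }
            print "sge_execd started by root"
        }
    '
}

# Usage on an execution host:
#   ps -e -o user,ruser,command | check_execd_ruser
```

If it reports that sge_execd was not started by root, stopping the
daemon and starting it again as root is the thing to try first.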

-- Reuti


> server.  Whenever I submit a job to it as anyone, it tries to create the
> simple.sh.o18 file and then it fails because of permissions.  The .o18
> file goes into the user's home directory.  This happens with all users.
> It looks like it's trying to create the file as sgeadmin instead of as
> the user even though the user submitted it.
>
> I'm not quite sure what's the "proper" way to resolve this?  I tried
> running the sgemaster as root, but the shepherd process still gets
> called as sgeadmin.  How do I work around this?
>
> --
> Shane
>
> Job 23 caused action: Job 23 set to ERROR
>  User        = bsmith
>  Queue       = test at server
>  Host        = server
>  Start Time  = <unknown>
>  End Time    = <unknown>
> failed opening input/output file:12/19/2008 14:42:58 [4439:16105]:
> error: can't open output file "/home/bsmith/simple.sh.o23": Permi
> Shepherd trace:
> 12/19/2008 14:42:58 [4439:16103]: shepherd called with uid = 4439, euid = 4439
> 12/19/2008 14:42:58 [4439:16103]: starting up 6.1u5
> 12/19/2008 14:42:58 [4439:16103]: warning: starting not as superuser (uid=4439)
> 12/19/2008 14:42:58 [4439:16103]: setpgid(16103, 16103) returned 0
> 12/19/2008 14:42:58 [4439:16103]: no prolog script to start
> 12/19/2008 14:42:58 [4439:16105]: pid=16105 pgrp=16105 sid=16105 old pgrp=16103 getlogin()=<no login set>
> 12/19/2008 14:42:58 [4439:16105]: reading passwd information for user 'bsmith'
> 12/19/2008 14:42:58 [4439:16103]: forked "job" with pid 16105
> 12/19/2008 14:42:58 [4439:16105]: setosjobid: uid = 4439, euid = 4439
> 12/19/2008 14:42:58 [4439:16103]: child: job - pid: 16105
> 12/19/2008 14:42:58 [4439:16105]: setting limits
> 12/19/2008 14:42:58 [4439:16105]: RLIMIT_CPU setting: (soft 0^HINFINITY hard 0^HINFINITY) resulting: (soft 0^HINFINITY hard 0^HINFINITY)
> 12/19/2008 14:42:58 [4439:16105]: RLIMIT_FSIZE setting: (soft 0^HINFINITY hard 0^HINFINITY) resulting: (soft 0^HINFINITY hard 0^HINFINITY)
> 12/19/2008 14:42:58 [4439:16105]: RLIMIT_DATA setting: (soft 0^HINFINITY hard 0^HINFINITY) resulting: (soft 0^HINFINITY hard 0^HINFINITY)
> 12/19/2008 14:42:58 [4439:16105]: RLIMIT_STACK setting: (soft 0^HINFINITY hard 0^HINFINITY) resulting: (soft 0^HINFINITY hard 0^HINFINITY)
> 12/19/2008 14:42:58 [4439:16105]: RLIMIT_CORE setting: (soft 0^HINFINITY hard 0^HINFINITY) resulting: (soft 0^HINFINITY hard 0^HINFINITY)
> 12/19/2008 14:42:58 [4439:16105]: RLIMIT_VMEM/RLIMIT_AS setting: (soft 0^HINFINITY hard 0^HINFINITY) resulting: (soft 0^HINFINITY hard 0^HINFINITY)
> 12/19/2008 14:42:58 [4439:16105]: RLIMIT_RSS setting: (soft 0^HINFINITY hard 0^HINFINITY) resulting: (soft 0^HINFINITY hard 0^HINFINITY)
> 12/19/2008 14:42:58 [4439:16105]: setting environment
> 12/19/2008 14:42:58 [4439:16105]: Initializing error file
> 12/19/2008 14:42:58 [4439:16105]: switching to intermediate/target user
> 12/19/2008 14:42:58 [4439:16105]: tried to change uid/gid without being root
> 12/19/2008 14:42:58 [4439:16105]: try running further with uid=4439
> 12/19/2008 14:42:58 [4439:16105]: closing all filedescriptors
> 12/19/2008 14:42:58 [4439:16105]: further messages are in "error" and "trace"
> 12/19/2008 14:42:58 [4439:16105]: error: can't open output file "/home/bsmith/simple.sh.o23": Permission denied
> 12/19/2008 14:42:58 [4439:16103]: wait3 returned 16105 (status: 6656; WIFSIGNALED: 0,  WIFEXITED: 1, WEXITSTATUS: 26)
> 12/19/2008 14:42:58 [4439:16103]: job exited with exit status 26
> 12/19/2008 14:42:58 [4439:16103]: reaped "job" with pid 16105
> 12/19/2008 14:42:58 [4439:16103]: job exited not due to signal
> 12/19/2008 14:42:58 [4439:16103]: job exited with status 26
> 12/19/2008 14:42:58 [4439:16103]: now sending signal KILL to pid -16105
> 12/19/2008 14:42:58 [4439:16103]: no tasker to notify
> 12/19/2008 14:42:58 [4439:16103]: failed starting job
> 12/19/2008 14:42:58 [4439:16103]: no epilog script to start
>
> Shepherd error:
> 12/19/2008 14:42:58 [4439:16105]: error: can't open output file "/home/bsmith/simple.sh.o23": Permission denied
>
> Shepherd pe_hostfile:
> server 1 test at server <NULL>
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=93447
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=93490
