[GE users] Plz help with strange shepherd message

Viktor Oudovenko udo at physics.rutgers.edu
Tue May 27 20:46:07 BST 2008


Yes!
Everything is fine with users.
Moreover, in the example I gave below everything runs fine.
I noticed the problematic behavior even under my own account when I was
logged in to the machine and looked at the case.
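
The by-eye comparison of the two ls listings quoted below can be sketched in
Python. This is only a rough sketch under stated assumptions, not anything that
ships with Grid Engine: the helper names owners_in, find_mismatches and report
are made up for illustration, and the only layout it assumes is the one visible
in the listings (a job directory under active_jobs that contains
subdirectories such as 1.sub04n178 holding files with the same names).

```python
import os
import pwd


def owners_in(directory):
    """Map each regular file name in `directory` to its owner's login name."""
    owners = {}
    for entry in os.scandir(directory):
        if entry.is_file(follow_symlinks=False):
            uid = entry.stat(follow_symlinks=False).st_uid
            try:
                owners[entry.name] = pwd.getpwuid(uid).pw_name
            except KeyError:
                # uid has no passwd entry on this host; report the number
                owners[entry.name] = str(uid)
    return owners


def find_mismatches(top, sub):
    """Return (name, top_owner, sub_owner) for every file name present in
    both mappings whose owners differ, sorted by file name."""
    return [
        (name, top[name], sub[name])
        for name in sorted(top.keys() & sub.keys())
        if top[name] != sub[name]
    ]


def report(job_dir):
    """Print ownership differences between job_dir and each subdirectory."""
    top = owners_in(job_dir)
    for entry in os.scandir(job_dir):
        if entry.is_dir(follow_symlinks=False):
            sub = owners_in(entry.path)
            for name, top_owner, sub_owner in find_mismatches(top, sub):
                print(f"{entry.name}/{name}: {sub_owner} (top level: {top_owner})")
```

On the execution host one would call, for example,
report("/opt/SGE/spool/sub04n178/active_jobs/186328.1"). With the owners shown
in the listings below it should flag error, exit_status and trace in
1.sub04n178 as owned by root while the top-level copies belong to the user.
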
v   



> -----Original Message-----
> From: Dan.Templeton at Sun.COM [mailto:Dan.Templeton at Sun.COM] 
> Sent: Tuesday, May 27, 2008 15:40
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Plz help with strange shepherd message
> 
> Does the given user exist on that machine?
> 
> Daniel
> 
> Viktor Oudovenko wrote:
> > Daniel,
> >
> > Root can write in any place. This is for sure.
> > The problem is that in the directory
> > /opt/SGE/spool/sub04n157/active_jobs/186117.1
> > there is a trace file which belongs to the user, but in the subdirectory
> > 1.sub04n157 the trace file (full path
> > /opt/SGE/spool/sub04n157/active_jobs/186117.1/1.sub04n157/trace)
> > belongs to root.
> > And shepherd.XXXX belongs to the user, so it is natural that the user
> > cannot write to a file which belongs to root.
> > The question is: why does the system try to do it?
> >
> > OK. To be clearer, here is an example from another job in which the
> > permissions can be clearly seen:
> >
> > [15:14:39]udo at sub04n178:/opt/SGE/spool/sub04n178/active_jobs/186328.1> ls -al
> > total 32
> > drwxr-xr-x 3 sgeadmin sge  320 2008-05-27 08:58 .
> > drwxr-xr-x 3 sgeadmin sge   72 2008-05-27 08:58 ..
> > drwxr-xr-x 2 sgeadmin sge  256 2008-05-27 08:58 1.sub04n178
> > -rw-r--r-- 1 sgeadmin sge    6 2008-05-27 08:58 addgrpid
> > -rw-r--r-- 1 sgeadmin sge 1793 2008-05-27 08:58 config
> > -rw-r--r-- 1 sgeadmin sge 1577 2008-05-27 08:58 environment
> > -rw-r--r-- 1 camjayi  sge    0 2008-05-27 08:58 error
> > -rw-r--r-- 1 camjayi  sge    0 2008-05-27 08:58 exit_status
> > -rw-r--r-- 1 sgeadmin sge    5 2008-05-27 08:58 job_pid
> > -rw-r--r-- 1 sgeadmin sge 1240 2008-05-27 08:58 pe_hostfile
> > -rw-r--r-- 1 sgeadmin sge    4 2008-05-27 08:58 pid
> > -rw-r--r-- 1 camjayi  sge 4116 2008-05-27 08:58 trace
> >
> > [15:14:43]udo at sub04n178:/opt/SGE/spool/sub04n178/active_jobs/186328.1> ls -l 1.sub04n178/
> > total 24
> > -rw-r--r-- 1 sgeadmin sge    6 2008-05-27 08:58 addgrpid
> > -rw-r--r-- 1 sgeadmin sge 1891 2008-05-27 08:58 config
> > -rw-r--r-- 1 sgeadmin sge 1845 2008-05-27 08:58 environment
> > -rw-r--r-- 1 root     sge    0 2008-05-27 08:58 error
> > -rw-r--r-- 1 root     sge    0 2008-05-27 08:58 exit_status
> > -rw-r--r-- 1 sgeadmin sge    5 2008-05-27 08:58 job_pid
> > -rw-r--r-- 1 sgeadmin sge    5 2008-05-27 08:58 pid
> > -rw-r--r-- 1 root     sge 2665 2008-05-27 08:58 trace
> > [15:14:51]udo at sub04n178:/opt/SGE/spool/sub04n178/active_jobs/186328.1>
> >
> >
> > So, as you can see, in the active_jobs directory trace belongs to the
> > user. That is fine. But in the subdirectory (in this example
> > 1.sub04n178) trace is owned by root.
> >
> > And this is the general behavior in the system.
> >
> > Regards,
> > v
> >
> >
> >   
> >> -----Original Message-----
> >> From: Dan.Templeton at Sun.COM [mailto:Dan.Templeton at Sun.COM]
> >> Sent: Tuesday, May 27, 2008 14:47
> >> To: users at gridengine.sunsource.net
> >> Subject: Re: [GE users] Plz help with strange shepherd message
> >>
> >> Check that the host where the file is generated has permission to
> >> write to the /opt/SGE/spool/sub04n157/active_jobs directory as
> >> root.
> >>
> >> Daniel
> >>
> >> Viktor Oudovenko wrote:
> >>     
> >>> Hi,
> >>>
> >>> Recently I was playing with job suspension and wrote
> >>> suspend/resume scripts, and from time to time (very often it is OK)
> >>> for parallel jobs I see that in the /tmp directory one file
> >>> shepherd.XXXX, where XXXX is a number, is generated every minute.
> >>> Plz see below the usual content of one of those files.
> >>> Plz let me know what might cause this kind of behavior.
> >>>
> >>> shepherd.30448
> >>> ::::::::::::::
> >>> 05/27/2008 02:48:11 [37394:37394 30448]: PANIC: open(/opt/SGE/spool/sub04n157/active_jobs/186117.1/1.sub04n157/trace) failed: Permission denied
> >>> 05/27/2008 02:48:11 [37394:37394 30448]: PANIC: open(/opt/SGE/spool/sub04n157/active_jobs/186117.1/1.sub04n157/trace) failed: Permission denied
> >>>
> >>> Thank you very much for your help,
> >>> Vic
> >>> P.S. shepherd.XXXX is owned by the user who runs the job.
> >>>
> >>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> >>> For additional commands, e-mail: users-help at gridengine.sunsource.net
> >>>
> >>>   
> >>>       





