[GE users] permissions on directories...

Reuti reuti at staff.uni-marburg.de
Wed May 17 22:26:53 BST 2006


Hi,

On 17.05.2006 at 17:45, davide cittaro wrote:

> Hi. Well, actually the script should process <2Mb files... By the
> way, I can launch them from my home directory and everything seems
> to work...
>
> $ mount
> /home type nfs (rw,nosuid,rsize=16384,wsize=16384,addr=85.239.175.2) <= works here
> /data type nfs (rw,nosuid,rsize=16384,wsize=16384,addr=85.239.175.2) <= doesn't work here

Usually the job script is first copied to the spool directory
$SGE_ROOT/spool/qmaster/job_scripts.

Can you submit the job with -h and check, before it runs, whether the
script was copied to this location correctly?
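
A minimal sketch of that check, assuming the job_scripts path named
above applies to your installation; check_job_script is a made-up
helper name, not an SGE command:

```shell
# Report whether a spooled job script exists and is readable.
# $1 = job_scripts directory (assumed location, adjust to your setup)
# $2 = job id as printed by qsub
check_job_script() {
    script="$1/$2"
    if [ ! -e "$script" ]; then
        echo "missing: $script"
        return 1
    fi
    if [ ! -r "$script" ]; then
        echo "unreadable: $script"
        return 1
    fi
    # tr strips the padding some wc implementations add
    size=$(wc -c < "$script" | tr -d ' ')
    echo "ok: $script ($size bytes)"
}
```

With the job held (qsub -h yourscript.sh), you would call
check_job_script "$SGE_ROOT/spool/qmaster/job_scripts" <jobid>
and release the job afterwards with qrls <jobid>.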

When the job starts, SGE copies the script with its own protocol to
the spool directory of the node. Sometimes that spool directory is
itself mounted back via NFS from the master node, which seems to be
the case in your setup as well. I prefer to keep the spool directory
of the exec host on its local disk.
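
To see whether a spool directory sits on NFS, the mount output can be
filtered for its mount point. A rough sketch, assuming the directory
is itself a mount point (for a subdirectory of a mount you would have
to ask df instead); fs_type_of is a hypothetical helper name:

```shell
# Print the filesystem type of a mount point.
# Reads mount-style lines on stdin, e.g.
#   server:/export on /data type nfs (rw)
#   /data type nfs (rw)
# $1 = mount point to look up; prints nothing if it is not listed.
fs_type_of() {
    awk -v mp="$1" '{
        # the mount point is the field right before the word "type",
        # the filesystem type is the field right after it
        for (i = 2; i < NF; i++)
            if ($i == "type" && $(i - 1) == mp) { print $(i + 1); exit }
    }'
}
```

For example: mount | fs_type_of /data
An answer of "nfs" for the exec host's spool location would mean the
scripts travel back over NFS when the job starts.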

Cheers - Reuti


> Also, another GE cluster that mounts the same directories in the
> same way works.
>
> d
>
> On 5/17/06, Reuti <reuti at staff.uni-marburg.de> wrote:
> On 17.05.2006 at 16:18, davide cittaro wrote:
>
> > Hi there, I'm still investigating the permission error... It
> > seems that the permission mode doesn't explain the error... What
> > I can say now is:
> > 1- When I launch the script from a directory X, it goes into Eqw state
> > 2- When I copy that script to my home, it runs
> > 3- When I copy an SGE example script into directory X, it runs
> > 4- When I run the script from directory X but on another cluster
> > grid (that shares directory X), it runs
> >
> > the messages:
> >
> > 05/17/2006 16:13:48|execd|ia32|E|shepherd of job 430.1 exited with exit status = 27
>
> Are you on Linux? Error 27 means file too large. Is it a big file? -
> Reuti
>
> > 05/17/2006 16:13:48|execd|ia32|W|reaping job "430" ptf complains: Job does not exist
> > 05/17/2006 16:13:48|execd|ia32|E|can't open usage file "active_jobs/430.1/usage" for job 430.1: No such file or directory
> > 05/17/2006 16:13:48|execd|ia32|E|05/17/2006 16:13:48 [2486:4613]: execvp(/opt/sge6/bioinfo/spool/ia32/job_scripts/430, "/opt/sge6/bioinfo/spool/ia32/job_scripts/430") failed: No such file or directory
> >
> > Any hint?
> >
> > d
> >
> > On 5/10/06, Reuti <reuti at staff.uni-marburg.de> wrote:
> > On 10.05.2006 at 11:48, davide cittaro wrote:
> >
> > > Okay, now the question: is the directory
> > > /opt/sge6/bioinfo/spool/alpha2 readable/executable by all users
> > > on the execution host(s)? The scripts are run directly from this
> > > location.
> > >
> > > Well, the permissions are 755, and other users (with 775, 771 on
> > > their homes) can run stuff... The users are stored in an LDAP
> > > server that runs remotely... can this be a problem?
> >
> > Are you sure that this is the only difference? Can you try to
> > change the permission settings and see the result? My users also
> > have different protection settings (700 or 755) and I don't face
> > any problems with this setup.
> >
> > What Linux are you using? In Debian, AFAIK, the default is that
> > each user is in his own group, in contrast to SuSE, where all are
> > in one default group. So, what are the detailed protection settings
> > of /opt/sge6/bioinfo/spool/alpha2? How is the primary/secondary
> > group setup of your users? I have access to a Debian cluster with
> > an external LDAP and all is working smoothly.
> >
> > -- Reuti
> >
> > > d
> > >
> > >
> > >
> > > -- Reuti
> > >
> > >
> > > > the -cwd option should save him from such errors, shouldn't it?
> > > >
> > > > d
> > > >
> > > > On 5/10/06, Reuti <reuti at staff.uni-marburg.de> wrote: Hi,
> > > >
> > > > On 10.05.2006 at 10:50, davide cittaro wrote:
> > > >
> > > > > Hi all, I've seen that some users cannot run GE and the jobs
> > > > > are in Eqw state, but qstat -j doesn't say anything at all...
> > > > > I don't know which permissions they have on the cwd, where
> > > > > they launch jobs, but they have 700 on their /home... Is it
> > > > > possible that this somehow interferes with GE?
> > > >
> > > >
> > > > no, this should work. Is there any hint in any of the messages
> > > > files $SGE_ROOT/spool/qmaster/messages or the one from the node
> > > > where the job tried to start? Anything in the standard-out/-err
> > > > files?
> > > >
> > > > -- Reuti
> > > >
> > > >
> > > > > Thanks
> > > > >
> > > > > d
> > > >
> > > >

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



