[GE users] SGE6.1 error

Reuti reuti at staff.uni-marburg.de
Mon Aug 13 09:52:52 BST 2007


Hi,

On 13.08.2007 at 09:56, John_Tai wrote:

> I have recently installed 6.1, but every job is terminated after a  
> while.
>
> This is my job from qstat, started as "qrsh -v eda=$cmd -cwd -now n icfb":
>
>     950 0.55500 icfb       johnt        r     08/13/2007 14:48:02 layout.q@dsl46
>
> Here is the message I get from the command line:
>
>     error: error reading returncode of remote command
>
> These are the qmaster messages:
>
>     08/13/2007 15:03:34|qmaster|dsls11|W|job 950.1 failed on host dsl46 general before job because: 08/13/2007 15:03:31 [999:20475]: can't open file /tmp/950.1.layout.q/pid: Permission denied
>
> These are the exec host messages:
>
>     08/13/2007 15:03:31|execd|dsl46|E|shepherd of job 950.1 exited with exit status = 11
>
> Looking at the qmaster messages, it seems that this happens every
> hour to the majority of jobs. It doesn't seem to be tied to a
> particular user or exec host.
>
> I hope somebody can help me. I had been using 6.0u7-1 for a long time
> without problems, but since I changed the qmaster server and
> installed the latest version, I keep getting this problem.

If it happens exactly every hour: is there a cron job running that
cleans /tmp? - Reuti
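
One way to check this on an exec host is to scan the cron files for commands that purge /tmp. The following is only a rough sketch, not an SGE tool: the function name find_tmp_cleaners, the list of cleaner commands (tmpwatch, tmpreaper, find-based cleanup), and the cron paths are assumptions and may need adjusting for the local setup.

```python
import glob
import re

# Assumed, non-exhaustive patterns for commands that purge /tmp.
CLEANER_PATTERN = re.compile(
    r"\b(?:tmpwatch|tmpreaper)\b"                    # dedicated tmp cleaners
    r"|\bfind\s+/tmp\b.*(?:-delete|-exec\s+rm\b)"    # find-based cleanup
)


def find_tmp_cleaners(cron_text):
    """Return non-comment lines that mention /tmp and look like a cleanup command."""
    hits = []
    for line in cron_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if "/tmp" in stripped and CLEANER_PATTERN.search(stripped):
            hits.append(stripped)
    return hits


if __name__ == "__main__":
    # Typical system-wide cron locations on Linux (assumed; adjust per host).
    paths = (
        ["/etc/crontab"]
        + glob.glob("/etc/cron.d/*")
        + glob.glob("/etc/cron.hourly/*")
    )
    for path in paths:
        try:
            with open(path, errors="replace") as handle:
                text = handle.read()
        except OSError:
            continue  # unreadable file or directory: skip it
        for hit in find_tmp_cleaners(text):
            print(f"{path}: {hit}")
```

Per-user crontabs are not covered by these paths; their contents (for example the output of "crontab -l") can be passed to find_tmp_cleaners as a string in the same way.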
