[GE users] can't open output file error

apseyed at bu.edu apseyed at bu.edu
Thu Jul 29 10:04:56 BST 2004


Hi,

I'm using SGE 5.3p5 as part of Rocks 3.1.

Sometimes when I submit a large batch of jobs (say 200-300 jobs with 250 or 
so slots available), a number of the jobs end up in "Eqw" and the 
following type of message appears in the messages file on the qmaster: 

Thu Jul 29 04:18:04 2004|qmaster|linga|W|job 19347.1 failed on host compute-1-13.local general  opening output file because: 07/29/2004 04:18:02 [501:5855]: error: can't open output file "/home/apseyed/scripts/dd": Is a directory

Note that the parameter "#$ -cwd" is set in the scripts and the jobs 
are being launched from "/home/apseyed/scripts/dd".
I find this message strange, especially since it only occurs for some of the 
jobs although the scripts are identical in behavior. It could be a result of 
network throttling, but I'm not so sure. Any ideas on the error message 
specifically, or otherwise?
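
For context, the setup can be sketched roughly as below. This is only an
illustration: the file name dd_job.sh, the job body, the temporary working
directory, and the count of 3 are placeholders, not the actual scripts or
job counts; only the "#$ -cwd" directive is taken from the real scripts.

```shell
#!/bin/sh
# Illustrative sketch only: dd_job.sh, the job body, and the count of 3
# are placeholders, not the actual files from this report.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Write a minimal job script that uses the "#$ -cwd" directive, so the
# job runs in (and writes its output to) the submission directory.
cat > dd_job.sh <<'EOF'
#!/bin/sh
#$ -cwd
echo "running on $(hostname)"
EOF

# Submit several copies from the current directory. If qsub is not
# installed, just print what would have been submitted.
n=3
i=1
while [ "$i" -le "$n" ]; do
    if command -v qsub >/dev/null 2>&1; then
        qsub dd_job.sh
    else
        echo "qsub not found; would submit dd_job.sh (copy $i)"
    fi
    i=$((i + 1))
done
```

In the real case the loop runs 200-300 times from /home/apseyed/scripts/dd
rather than from a temporary directory.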

Cheers,
Patrice

P.S. While I'm asking, what does this type of message indicate? (I see it from 
time to time in the SGE logs; it is non-fatal, but I'm curious as to why it 
appears):

Thu Jul 29 04:51:53 2004|qmaster|<hostname>|E|can not remove job spool file: zombies/00/0001/9349
Thu Jul 29 04:51:53 2004|qmaster|<hostname>|E|ERROR: unlinking "zombies/00/0001/9350": No such file or directory

