[GE users] Stale finished jobs

Reuti reuti at staff.uni-marburg.de
Wed Dec 5 10:57:05 GMT 2007



Quoting Norbert Crettol <norbert.crettol at idiap.ch>:

> Reuti wrote:
>> So the job never ran? I have seen this when the user has no rights to
>> read the spooled jobscript on the node, or when it is not created there
>> at all, i.e. the "exec" after the fork, which should replace the child
>> with the actual jobscript, fails. Is the spool directory for the nodes
>> also in $SGE_ROOT/default/spool/<node>/... or somewhere in
>> /var/spool/sge, local on the node?
> The spool is $SGE_ROOT/default/spool/<node>. There is nothing in

For the qmaster it may be merely cosmetic whether the spool directory is
in the default location or in /var/spool/sge. For the nodes, however, a
local spool directory avoids network traffic, because SGE first transfers
the job to the node and the node then writes it back to the NFS volume.
Can you change your setup so that the nodes' spool directories are local?
Spooling over NFS might explain the errors you see from time to time.

Also see:

http://gridengine.sunsource.net/howto/nfsreduce.html

-- Reuti

> jobs/
> job_scripts/
> active_jobs/

Is there also nothing in there while a job is running?
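
One way to answer this is to list the three subdirectories on the node
while a job is running. The following is a minimal sketch; the function
name is hypothetical, and only the subdirectory names are taken from the
listing above.

```python
import os

# Subdirectory names as quoted in the message above.
SPOOL_SUBDIRS = ("jobs", "job_scripts", "active_jobs")


def spool_contents(node_spool_dir):
    """Map each spool subdirectory to its sorted entries.

    A value of None means the subdirectory is missing or not readable
    by the user running the check.
    """
    result = {}
    for name in SPOOL_SUBDIRS:
        subdir = os.path.join(node_spool_dir, name)
        try:
            result[name] = sorted(os.listdir(subdir))
        except OSError:
            result[name] = None
    return result
```

Running it as the job owner rather than as root also shows whether that
user can read the directories at all.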
>
> No message in "messages".
>
> Norbert
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net






