[GE issues] 6_2U3 issue/question

gtatachar gopinath.tatachar at gs.com
Fri Jan 15 14:30:43 GMT 2010



Hi,

I am running Grid Engine 6_2U3 and occasionally see the following error:

Shepherd error:
01/13/2010 19:22:09 [3012:13075]: error: can't open output file "/home/qeoptuat/grid/20100113_191437_589963000_EST_nyqspla126v.ny.gsam.gs.com_932/10.out.txt": No such file or directory

The messages file in the spool directory contains the following entries:

01/13/2010 19:22:10|  main|nyqspla116v|E|shepherd of job 35971.10 exited with exit status = 26
01/13/2010 19:22:10|  main|nyqspla116v|E|can't open usage file "active_jobs/35971.10/usage" for job 35971.10: No such file or directory
01/13/2010 19:22:10|  main|nyqspla116v|E|01/13/2010 19:22:09 [3012:13075]: error: can't open output file "/home/qeoptuat/grid/20100113_191437_589963000_EST_nyqspla126v.ny.gsam.gs.com_932/10.out.txt": No such file or directory

Can anyone shed more light on exit status = 26? Is this somehow related to NFS? (I am using NFS spooling.)

Two further observations:

 1.  The error seems to stem from a long delay between when a child job terminates on the grid and when the parent job is able to locate the child job's output file on NFS.
 2.  We never experienced this on the n1ge6 grid using Berkeley DB.
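
To illustrate the delay described in item 1, here is a rough sketch of how a parent job could poll for a path until it becomes visible on the NFS mount, rather than failing on the first attempt. This is only an illustration under my own assumptions: wait_for_path, timeout and interval are hypothetical names, nothing here is part of Grid Engine, and it does not address exit status 26 itself.

```python
import os
import time


def wait_for_path(path, timeout=60.0, interval=1.0):
    """Poll until `path` is visible or `timeout` seconds have elapsed.

    Hypothetical helper, not part of Grid Engine. Returns True if the
    path showed up in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Check first so an already-visible path returns immediately.
        if os.path.exists(path):
            return True
        # Give up once the deadline has passed.
        if time.monotonic() >= deadline:
            return False
        # Back off briefly before asking the NFS client again.
        time.sleep(interval)
```

The parent would call something like wait_for_path(child_output_file, timeout=120) before reading the file, where child_output_file is a placeholder for the child job's output path. Polling only hides the latency; it does not explain why the file takes so long to appear.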


Any information is greatly appreciated.

Thanks
Gopi





