[GE users] SGE unable to find my binary program

laikwong laikwong at comp.nus.edu.sg
Thu Mar 23 06:31:24 GMT 2006


Hi,
 
I am trying to place a binary program on my cluster so that users can submit
it as a job to SGE.
 
However, I cannot get SGE to find the path to the binary program or the path
to the shared library that the binary program needs.
 
I am using ROCKS as my cluster management software, which comes pre-installed
with SGE.
 
The environment variables that I have set up are as follows.
 
$PATH =
/share/apps/bin:/opt/gridengine/bin/lx26-x86:/usr/kerberos/bin:/usr/java/jdk1.5.0_05/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/opt/chromium/bin/Linux:/opt/ganglia/bin:/opt/lam/gnu/bin:/usr/share/pvm3/lib:/usr/share/pvm3/lib/LINUX:/usr/share/pvm3/bin/LINUX:/opt/rocks/bin:/opt/rocks/sbin:/home/trumper/bin
 
$LD_LIBRARY_PATH =
/share/apps/bin:/opt/gridengine/lib/lx26-x86

I have placed the binary program and the shared library in /share/apps/bin,
which is listed in both $PATH and $LD_LIBRARY_PATH.
 
ROCKS has a native command called cluster-fork, which lets you run a command
on all nodes easily.
 
I am able to run the following:
cluster-fork oc myinput
 
I can tell that my nodes are actually running my binary program with the
specified input. So far so good: my nodes are able to pick up the path to the
binary program and the shared library.
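
For reference, the kind of check I mean by "pick up the path" is roughly the
sketch below. find_in_path is only a hypothetical helper written for this
post, not part of ROCKS or SGE: it walks a colon-separated path list and
prints the first executable with the given name, which is approximately what
the shell does when it resolves a command.

```shell
#!/bin/sh
# find_in_path NAME [PATHLIST]
# Hypothetical helper: print the first executable called NAME found in the
# colon-separated PATHLIST (defaults to the current $PATH).
# Returns 1 if no entry in PATHLIST contains such a file.
find_in_path() {
    name=$1
    pathlist=${2-$PATH}
    old_ifs=$IFS
    IFS=:
    for dir in $pathlist; do
        # An empty entry in a path list means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}
```

On a node, "find_in_path oc" should print /share/apps/bin/oc if the PATH shown
above is in effect, and running ldd on that file should show whether the
shared library resolves. I am only assuming that the same check run from
inside a submitted job would reveal a different PATH; I have not confirmed it.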
 
However, when I try this:
qsub -b y -i myinput oc

Sun Grid Engine accepts my job, but the job later fails and returns
"oc: Command not found." on stderr.

Any ideas why Sun Grid Engine is unable to find my binary program?

Thanks,

Daniel




