[GE users] New grid problems

Chris Dagdigian dag at sonsorol.org
Sat Jan 26 03:46:01 GMT 2008


Hi Robert,

Is Grid Engine running your job in the same shell that you use when you
run the job from the command line? You can explicitly request a
particular shell by adding "-S /bin/sh" or similar to your submission
command. Depending on how SGE is configured, it may be ignoring the first
line of your job script (the "#!/bin/something" part ...).
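
As a minimal sketch of the shell request (not taken from Robert's actual
job script): SGE reads lines beginning with "#$" in a job script as
submission options, so the request can also live in the script itself.
The helper name current_shell is made up for illustration, and the ps
call assumes a POSIX-style ps on the execution host.

```shell
#!/bin/sh
#$ -S /bin/sh
# The "#$" line above asks SGE to run this script with /bin/sh,
# the same as passing "-S /bin/sh" on the qsub command line.

# current_shell (hypothetical helper): print the name of the process
# that is interpreting this script, or "unknown" if ps is unavailable.
current_shell() {
    ps -p $$ -o comm= 2>/dev/null || echo "unknown"
}

echo "interpreter: $(current_shell)"
```

Running this once by hand and once through qsub should show whether both
runs end up in the same shell.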

One technique I occasionally use is to create a simple shell script that
does nothing but print the current path and environment variables to
standard output. I then submit that to Grid Engine and compare the job
output to my local path and environment.
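
A rough sketch of such a script follows. The function name dump_env and
the file names in the usage note are hypothetical. The "ulimit -a" line
is an extra that goes beyond path and environment; it is included only
because the strace output quoted below ends in ENOMEM, so resource limits
may be worth comparing too.

```shell
#!/bin/sh
# Print the pieces of the runtime environment that most often differ
# between an interactive shell and an SGE job.

dump_env() {
    echo "script: $0"
    echo "PATH=$PATH"
    echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-}"
    echo "--- environment ---"
    env | sort                # sorted so that two dumps diff cleanly
    echo "--- limits ---"
    ulimit -a                 # assumption: limits may differ under SGE
}

dump_env
```

Saved as, say, envdump.sh, it could be run locally and through SGE, and
the two outputs compared once the job has finished:

    sh envdump.sh > local.txt
    qsub -S /bin/sh -cwd -j y -o sge.txt envdump.sh
    diff local.txt sge.txt

Some differences are expected (job and queue variables that SGE sets);
the interesting lines are the ones the failing application depends on.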

Regardless, Rayson has you on the right path. These sorts of "jobs run  
manually but not via SGE" problems are almost always due to shell,  
environment, path or permission issues.  When you figure out what is  
"different" about the two environments you'll have the answer to your  
problem.


-Chris


On Jan 25, 2008, at 8:18 PM, Robert White wrote:

> Hi Rayson,
>
> This is my library path after sshing directly into sisko.
> ##
> [robertw@sisko robertw]echo $LD_LIBRARY_PATH
> /tools/se/NEC/CSR_be90/ARM_DSM/32bit/latest/ 
> DSM_NBARM926C1616T00P9V10_lic_CB90M_ncverilog_Linux-32_1.0/ 
> simulation_models//ModelManager/MMAPI_5.0.1/Linux/MM/ 
> cadence_nc_verilog:/tools/novas/verdi/latest/share/PLI/systemc/ 
> ncsc53/lib-linux_gcc3_23:/tools/novas/verdi/latest/share/FsdbWriter/ 
> LINUX:/tools/novas/verdi/latest/share/PLI/nc51/LINUX/nc_loadpli1:/ 
> tools/cadence/ius58s3/tools/tbsc/lib/gnu:/tools/cadence/ius58s3/ 
> tools/systemc/gcc/3.2.3/install/lib:.:/tools/cadence/ius58s3//tools/ 
> inca/lib:/tools/cadence/ius58s3//tools/lib:/tools/cadence/ius58s3// 
> tools/ict/Linux/pli/ncv1_21:/usr/lib:/usr/local/lib:/tools/ActiveTcl/ 
> lib:/tools/denali/denali_3.2.050/verilog:/tools/denali/ 
> denali_3.2.050/ddvapi:/tools/vera/vera-6.3.10-linux2.4.7/lib:/tools/ 
> cadence/ius58s3/tools/systemc/gcc/3.2.3/install/lib
> ###
>
> This is my library path after using qlogin to log into sisko.
> ###
> [robertw@sisko robertw]echo $LD_LIBRARY_PATH
> /tools/se/NEC/CSR_be90/ARM_DSM/32bit/latest/ 
> DSM_NBARM926C1616T00P9V10_lic_CB90M_ncverilog_Linux-32_1.0/ 
> simulation_models//ModelManager/MMAPI_5.0.1/Linux/MM/ 
> cadence_nc_verilog:/tools/novas/verdi/latest/share/PLI/systemc/ 
> ncsc53/lib-linux_gcc3_23:/tools/novas/verdi/latest/share/FsdbWriter/ 
> LINUX:/tools/novas/verdi/latest/share/PLI/nc51/LINUX/nc_loadpli1:/ 
> tools/cadence/ius58s3/tools/tbsc/lib/gnu:/tools/cadence/ius58s3/ 
> tools/systemc/gcc/3.2.3/install/lib:.:/tools/cadence/ius58s3//tools/ 
> inca/lib:/tools/cadence/ius58s3//tools/lib:/tools/cadence/ius58s3// 
> tools/ict/Linux/pli/ncv1_21:/usr/lib:/usr/local/lib:/tools/ActiveTcl/ 
> lib:/tools/denali/denali_3.2.050/verilog:/tools/denali/ 
> denali_3.2.050/ddvapi:/tools/vera/vera-6.3.10-linux2.4.7/lib:/tools/ 
> cadence/ius58s3/tools/systemc/gcc/3.2.3/install/lib
> ###
>
> A diff of the two doesn't show any differences.
>
> Bob
>
>
> On Jan 25, 2008 6:47 PM, Rayson Ho <rayrayson at gmail.com> wrote:
> Can you check the LD_LIBRARY_PATH difference between your shell and
> the SGE job environment??
>
> Rayson
>
>
>
> On Jan 25, 2008 7:29 PM, Robert White <alphamonk at gmail.com> wrote:
> >
> > Hi All,
> >
> > I have a job script that runs correctly when I run the application from the
> > command line on an execution host without sending the app through SGE. If I
> > run the job through the grid using qrsh, qsub or qlogin, I receive the
> > same error. The error complains about a library file. This is the error
> > message: "libdenpli.so: failed cannot open shared object file: No such file
> > or directory or file is not valid ELFCLASS32 library"
> >
> > When I run the command on an execution host and use strace to see whether it
> > finds this library, this is what I see:
> > open("/tools/denali/denali_3.2.050/verilog/libdenpli.so", O_RDONLY) = 10
> > read(10, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\300\275"..., 512) = 512
> > fstat64(10, {st_mode=S_IFREG|0755, st_size=19806385, ...}) = 0
> > old_mmap(NULL, 27237300, PROT_READ|PROT_EXEC, MAP_PRIVATE, 10, 0) = 0x32fae000
> > old_mmap(0x33fc4000, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 10, 0x1015000) = 0x33fc4000
> > old_mmap(0x340c5000, 9317300, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x340c5000
> > close(10)                               = 0
> >
> >
> >
> > When I did a qlogin and ran the command by hand under strace, I found this
> > at the point where the library file is accessed:
> > open("/tools/denali/denali_3.2.050/verilog/libdenpli.so", O_RDONLY) = 10
> > read(10, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\300\275"..., 512) = 512
> > fstat64(10, {st_mode=S_IFREG|0755, st_size=19806385, ...}) = 0
> > old_mmap(NULL, 27237300, PROT_READ|PROT_EXEC, MAP_PRIVATE, 10, 0) = -1 ENOMEM (Cannot allocate memory)
> > close(10)                               = 0
> >
> > Does anyone know what this problem could be? Are there any memory
> > limitations somewhere that I am not aware of? This is a new
> > install of SGE 6.1u2 running on RHEL3.9 computers.
> >
> > Thanks Bob
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>

