[GE users] Problems with LAM tight integration

slaton slaton at berkeley.edu
Wed Aug 2 23:55:13 BST 2006


> > > > -catch_rsh /opt/sge/default/spool/qcn13/active_jobs/50.1/pe_hostfile
> > > > qcn13
> > > > qcn14
> > > > qcn15
> > > > qcn16
> > > > /opt/sge/bin/lx24-amd64/qrsh -V -inherit -n -p 32795 qcn13 exec
> 
> I still wonder where this line is printed. The option -n is wrong 
> there. In your original post there were two additional lines: are these 
> from your script?

i added some debug statements to the script, which indicated that 
startlam.sh generated this statement:

 -catch_rsh /opt/sge/default/spool/qcn11/active_jobs/59.1/pe_hostfile 

whereas the rsh wrapper generated this (note the additional lines that 
were not included above):

 /opt/sge/bin/lx24-amd64/qrsh -V -inherit -n -p 32796 qcn11 exec 
 '/opt/sge/utilbin/lx24-amd64/qrsh_starter' 
 '/opt/sge/default/spool/qcn11/active_jobs/59.1/1.qcn11'

the former (startlam.sh) comes from the script line:

 # useful to control parameters passed to us  
 echo $*

that seems fine.
the latter line (rsh wrapper) comes from the line:

 echo $SGE_ROOT/bin/$ARC/qrsh -V -inherit $rhost $cmd
 exec $SGE_ROOT/bin/$ARC/qrsh -V -inherit $rhost $cmd

this is triggered by the condition $minus_n != 1.
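
a small sketch of a debug line that shows argument boundaries, which 
echo $* hides because it joins everything with spaces. the helper name 
show_args and the sample path are made up for illustration; nothing 
here is taken from the real startlam.sh or rsh wrapper.

```shell
#!/bin/sh
# Hypothetical debug helper (not part of the SGE scripts):
# print the argument count, then each argument on its own line in brackets,
# so that options, host names and quoted strings can be told apart.
show_args() {
    echo "argc=$#"
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Example call with a made-up path containing a space.
show_args -catch_rsh "/path/with space/pe_hostfile"
```

calling show_args "$@" at the top of the wrapper, before any shift, would 
show whether -n and -p are already in the argument list at that point.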

i'm perplexed as to how..

 $rhost $cmd

expands to..

 -n -p 32795 qcn15 exec '/opt/sge/utilbin/lx24-amd64/qrsh_starter' 
 '/opt/sge/default/spool/qcn15/active_jobs/61.1/1.qcn15'
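
one way an expansion like this could arise, sketched under assumptions: 
if the wrapper is invoked with arguments that already begin with -n, and 
nothing consumes that flag before the host is read, then -n ends up in 
$rhost and everything after it ends up in $cmd. the function 
parse_like_wrapper below is hypothetical and only imitates "first 
argument is the host, the rest is the command"; it is not the actual 
wrapper code, and "starter" stands in for the real command.

```shell
#!/bin/sh
# Hypothetical stand-in for the wrapper's argument handling (assumption,
# not the real SGE script): first argument becomes the remote host,
# all remaining arguments become the command.
parse_like_wrapper() {
    rhost=$1
    shift
    cmd=$*
    echo "rhost=[$rhost] cmd=[$cmd]"
}

# No leading option: host and command are split as intended.
parse_like_wrapper qcn15 exec starter

# Leading -n -p <port>: -n lands in rhost and the rest in cmd,
# so "$rhost $cmd" reproduces the whole original argument list.
parse_like_wrapper -n -p 32795 qcn15 exec starter
```

if that is what happens here, the open question is who calls the wrapper 
with -n -p <port> in front of the host name.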

thanks
slaton
