[GE users] Weird issue with loose LAM/rsh integration

Reuti reuti at staff.uni-marburg.de
Mon Sep 22 09:37:46 BST 2008


Hi,

On 22.09.2008 at 06:23, Joshua Baker-LePain wrote:

> I'm running SGE 6.1u5 on top of CentOS 4.6 and the included lam-7.1.2.

Below you mention 7.0.3. Could it be that you have both versions
installed and are mixing them (maybe the older one was already
installed with CentOS)?
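
One way to check for a mixed install is to list every copy of the LAM tools that the job's PATH can reach. The script below is only a rough sketch: the tool names are taken from this thread, the helper name find_on_path is made up for illustration, and it assumes Python is available on the execution node. It is meant to be run from inside the SGE job script, since the job's environment may differ from an interactive login.

```python
import os


def find_on_path(name, path=None):
    """Return every executable called `name` on the search path, in search order.

    `find_on_path` is a hypothetical helper for this example, not part of
    LAM or SGE. `path` defaults to the PATH of the current environment.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    seen = set()  # resolved paths, so symlinked duplicates are reported once
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        real = os.path.realpath(candidate)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK) and real not in seen:
            seen.add(real)
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # Tool names as they appear in this thread.
    for tool in ("lamboot", "lamnodes", "mpirun"):
        copies = find_on_path(tool)
        print(tool, "->", copies if copies else "not found")
        if len(copies) > 1:
            print("  more than one copy found; the first one listed is the one that runs")
    # Variables named in the help-file search list of the error output.
    for var in ("LAMHOME", "LAMHELPDIR"):
        print(var, "=", os.environ.get(var, "(unset)"))
```

If lamboot and mpirun resolve to different directories inside the job, or a second copy shows up that does not appear in an interactive shell, that would fit the 7.1.2 versus 7.0.3 mismatch.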

-- Reuti

> I'm attempting to set up loose integration using RSH as detailed
> here <http://gridengine.sunsource.net/howto/lam-integration/lam-integration.html>.
> Everything seems to work *except* actually running the MPI program.
> The lamboot works (I can see lamd on the node, and 'lamnodes' within
> the job script returns the proper output), but upon attempting to
> 'lamrun' my sample program I get the output below.  Note that this
> program runs just fine outside of SGE.  In fact, I can even log in to
> the node with the SGE-started lamd and successfully run this program.
> Any ideas?  Thanks!
>
> MPI program output:
> *** Oops -- I cannot open the LAM help file.
> *** I tried looking for it in the following places:
> ***
> *** Oops -- I cannot open the LAM help file.
> *** I tried looking for it in the following places:
> ***
> ***   $HOME/lam-helpfile
> ***   $HOME/lam-helpfile
> ***   $HOME/lam-7.0.3-helpfile
> ***   $HOME/lam-7.0.3-helpfile
> ***   $HOME/etc/lam-helpfile
> ***   $HOME/etc/lam-helpfile
> ***   $HOME/etc/lam-7.0.3-helpfile
> ***   $LAMHELPDIR/lam-helpfile
> ***   $HOME/etc/lam-7.0.3-helpfile
> ***   $LAMHELPDIR/lam-7.0.3-helpfile
> ***   $LAMHOME/etc/lam-helpfile
> ***   $LAMHELPDIR/lam-helpfile
> ***   $LAMHOME/etc/lam-7.0.3-helpfile
> ***   $SYSCONFDIR/lam-helpfile
> ***   $LAMHELPDIR/lam-7.0.3-helpfile
> ***   $SYSCONFDIR/lam-7.0.3-helpfile
> ***
> *** You were supposed to get help on the program "MPI"
> ***   $LAMHOME/etc/lam-helpfile
> *** about the topic "no-lamd"
> ***
> *** Sorry!
> -----------------------------------------------------------------------------
> ***   $LAMHOME/etc/lam-7.0.3-helpfile
> ***   $SYSCONFDIR/lam-helpfile
> ***   $SYSCONFDIR/lam-7.0.3-helpfile
> ***
> *** You were supposed to get help on the program "MPI"
> *** about the topic "no-lamd"
> ***
> *** Sorry!
> -----------------------------------------------------------------------------
> -----------------------------------------------------------------------------
> It seems that [at least] one of the processes that was started with
> mpirun did not invoke MPI_INIT before quitting (it is possible that
> more than one process did not invoke MPI_INIT -- mpirun was only
> notified of the first one, which was on node n0).
>
> mpirun can *only* be used with MPI programs (i.e., programs that
> invoke MPI_INIT and MPI_FINALIZE).  You can use the "lamexec" program
> to run non-MPI programs over the lambooted nodes.
> -----------------------------------------------------------------------------
>
> -- 
> Joshua Baker-LePain
> QB3 Shared Cluster Sysadmin
> UCSF
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net


