[GE users] Strange LAM integration problem and qrsh shell question

Dale Harris rodmur at maybe.org
Wed Nov 10 00:13:22 GMT 2004

On Thu, Sep 23, 2004 at 01:20:53PM -0400, Tim Mueller elucidated:
> LAM works on this machine, as does Grid Engine.  However, when I use
> the integration script, I get the debug output shown at the end of
> this message.  Lamboot apparently never hears back from the remote
> lamd agent.  I end up with the following process consuming a CPU:

I don't have exactly the same problem, but I'm also seeing lamboot
fail under Grid Engine, apparently because of communication problems.
LAM works without any trouble when I run it from the command line.

I'm running SGEEE 5.3p6, but I'm running it over bproc (using Vaclav
Hanzl's method, http://noel.feld.cvut.cz/magi/sge+bproc.html) and I'm
trying to get parallel environments running.  They seem to work for the
most part; I was able to get PVM and mpich running after a fashion.
However, LAM doesn't want to work at all.

My PE setup

pe_name           lammpi
queue_list        all
slots             438
user_lists        NONE
xuser_lists       NONE
start_proc_args   /usr/local/lam-7.1.1/bin/sge-lam start
stop_proc_args    /usr/local/lam-7.1.1/bin/sge-lam stop
allocation_rule   $fill_up
control_slaves    FALSE
job_is_first_task TRUE
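
For anyone comparing this PE against one that works, a rough sketch
of reading the settings into a dictionary may help.  It assumes the
text is in the key/value layout shown above (as printed by
"qconf -sp lammpi"); the function name parse_pe_config is made up for
illustration and is not part of SGE.

```python
def parse_pe_config(text):
    """Parse a PE definition (key, whitespace, value per line) into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # The key is the first token; the value is the rest and may contain spaces.
        parts = line.split(None, 1)
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


# The PE definition exactly as posted above.
PE_TEXT = """\
pe_name           lammpi
queue_list        all
slots             438
user_lists        NONE
xuser_lists       NONE
start_proc_args   /usr/local/lam-7.1.1/bin/sge-lam start
stop_proc_args    /usr/local/lam-7.1.1/bin/sge-lam stop
allocation_rule   $fill_up
control_slaves    FALSE
job_is_first_task TRUE
"""

pe = parse_pe_config(PE_TEXT)

# Print the fields that govern how the PE starts and stops its daemons.
for key in ("start_proc_args", "stop_proc_args",
            "control_slaves", "job_is_first_task"):
    print(f"{key:18s}{pe[key]}")
```

The same function could be run on the PVM or mpich PE definitions to
see which of these fields differ from the LAM one.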

I was using a modified version of the startup script, which is
attached.  It really isn't much different from the Perl script for
the LAM integration.
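
The script itself is not reproduced here, so the following is only a
minimal sketch of one step such a start procedure usually performs:
turning the host list SGE hands to the PE into a LAM boot schema for
lamboot.  It assumes the usual PE hostfile layout (host, slot count,
queue, processor range on each line) and the "hostname cpu=N" boot
schema syntax.  The host names n0 and n1 and the function name
pe_hostfile_to_bhost are hypothetical, and this is not the code from
sge-lam.

```python
def pe_hostfile_to_bhost(hostfile_text):
    """Convert SGE PE hostfile text into LAM boot schema text."""
    out = []
    for raw in hostfile_text.splitlines():
        fields = raw.split()
        if len(fields) < 2:
            continue  # ignore blank or malformed lines
        host = fields[0]
        slots = int(fields[1])  # number of slots granted on this host
        out.append(f"{host} cpu={slots}\n")
    return "".join(out)


# Hypothetical hostfile contents; a real job would read the file
# that SGE passes to the start procedure instead.
sample = "n0 2 n0.q UNDEFINED\nn1 4 n1.q UNDEFINED\n"
print(pe_hostfile_to_bhost(sample), end="")
```

Comparing the schema generated inside a job with the one used for a
successful command-line lamboot is one way to check whether the host
names differ between the two cases.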

It just doesn't seem like running under SGE should be all that
different from running from the command line...

Any insights on how to keep lamboot from failing?

Dale Harris   
rodmur at maybe.org

    [ Attachment: Text/PLAIN, 185 lines, not included in the archive. ]
