[GE users] Disappearing hosts/queues with PE's - lam integration

Orion Poplawski orion at cora.nwra.com
Mon Feb 28 21:43:50 GMT 2005


jeroen.m.kleijer at philips.com wrote:
> 
> Hi all,
> 
> I'm trying to get a tight LAM integration going according to post:
> http://gridengine.sunsource.net/servlets/ReadMsg?msgId=21121&listName=users
> 
> I used the sge-lam perl script provided in post:
> http://gridengine.sunsource.net/servlets/ReadMsg?msgId=19278&listName=users
> 
> and modified the qrsh-local sub to have an open file descriptor before 
> doing the exec($qrsh,@myargs) so my CPU doesn't go to 100% while doing 
> nothing.
> 
> This however, doesn't seem to be enough.
> 
[snip]
> 
> Try invoking the following command at the unix command line:
> 
>         /cadappl/lam/7.1.1-32/bin/sge-lam qrsh-remote nlcftcs13 -n 'echo $SHELL'
> 

I think the problem is that with lam-7.1.1 the -nn argument to lamboot no 
longer works when an ssi configuration is also specified, or at least the 
rsh ssi config overrides the -nn argument.  I've fixed it in the sge-lam 
script with the following (note that the first part, with the hostname, is 
specific to my network, but it does affect ARGV):


sub qrsh_remote()
{
   #Fixup hostname - use private GigE on coop machines
   $hostname = shift @ARGV;
   $hostname =~ s/(coop\d\d)[^.]+/$1/;

   #Check for "-n" and remove
   shift @ARGV if $ARGV[0] eq "-n";

   #Put back hostname
   unshift @ARGV,$hostname;
   @myargs=("-inherit","-nostdin","-V",@ARGV);

   debug_print("QRSH REMOTE CONFIG: @myargs");
   #if($debug){ close(SGEDEBUG); }
   exec($qrsh,@myargs);
}
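
For anyone who wants to check the argument rewriting without actually
exec'ing qrsh, here is a minimal sketch of the same logic as a standalone
function. The name build_qrsh_args is hypothetical (it is not part of
sge-lam), and the sample arguments are taken from the test command quoted
above; treat it as an illustration, not as the script itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of sge-lam: given the arguments the boot
# agent receives (hostname first), return the list that would be handed
# to qrsh. Nothing is exec'ed here.
sub build_qrsh_args {
    my @args = @_;

    # The first argument is the target host.
    my $hostname = shift @args;

    # Site-specific hostname fixup from the post; names that do not
    # match the pattern pass through unchanged.
    $hostname =~ s/(coop\d\d)[^.]+/$1/;

    # Drop a leading "-n" if one was passed; qrsh does not accept it.
    shift @args if @args && $args[0] eq "-n";

    return ("-inherit", "-nostdin", "-V", $hostname, @args);
}

# Same arguments as the quoted test command.
print join(" ", build_qrsh_args("nlcftcs13", "-n", 'echo $SHELL')), "\n";
```

Run on its own, this prints "-inherit -nostdin -V nlcftcs13 echo $SHELL",
i.e. the "-n" is gone and the hostname is back at the front of the list.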

sge-lam calls lamboot with:

-nn -ssi boot rsh -ssi boot_rsh_agent /usr/libexec/lam/sge-lam 
qrsh-remote -c sge-lam-conf.lamd -d /tmp/38188.1.coop.q/lamhostfile

But debug shows:

n-1<27171> ssi:boot:rsh:no_n: 0

i.e., the -nn option has been overridden, so the boot agent is still 
handed -n.  Then qrsh complains about the -n option and the house of 
cards collapses.

-- 
Orion Poplawski
System Administrator                   303-415-9701 x222
Colorado Research Associates/NWRA      FAX: 303-415-9702
3380 Mitchell Lane, Boulder CO 80301   http://www.co-ra.com
