[GE users] Loose Integration LAM using ssh Sun Grid engine

Reuti reuti at staff.uni-marburg.de
Wed Apr 5 10:48:51 BST 2006


Joerg,

this looks fine, so I have no clue what the problem is in your case.
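
For reading a long listing like the one quoted below, a small filter can pull out just the LAM daemon rows. This is only a sketch: the function name `lamd_rows` is made up here, and it assumes the daemon's command path ends in `/lamd`, as it does in the quoted output.

```shell
# Hypothetical helper: reads `ps` output (pid, ppid, pgrp, command columns)
# on stdin and prints "PID PPID PGRP" for every row whose command is .../lamd.
lamd_rows() {
    awk '/\/lamd( |$)/ { print $1, $2, $3 }'
}
```

It could be used in the job script as `ps -e f -o pid,ppid,pgrp,command | lamd_rows`. A row with PPID 1 means the daemon has detached from the process that started it.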

Your last sentence might be a misunderstanding: don't use
ENABLE_ADDGRP_KILL with a tight LAM integration.
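
To see whether the parameter is currently set, the configuration can be listed and searched. A minimal sketch, assuming `qconf` is in the PATH and that `execd_params` fits on a single line of the listing; the function name `has_addgrp_kill` is made up for this example.

```shell
# Hypothetical helper: reads a configuration listing on stdin and succeeds
# if the execd_params line mentions ENABLE_ADDGRP_KILL.
# It checks only that the name appears, not any value attached to it.
has_addgrp_kill() {
    grep -E '^execd_params[[:space:]]' | grep -q 'ENABLE_ADDGRP_KILL'
}
```

For example: `qconf -sconf | has_addgrp_kill && echo "ENABLE_ADDGRP_KILL is listed"`. If a host has its own local configuration, that would have to be checked separately.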

-- Reuti


On 05.04.2006 at 08:11, Joerg Reichel wrote:

>
> Hi Reuti,
> thanks for your answer:
>
>> lamhalt isn't working because the (local) daemon is no longer
>> running. Since you stated that lamnodes in the job script already
>> fails, can you just put a:
>>
>> ps -e f -o pid,ppid,pgrp,command
>
> I did, and this is what I found in the output file:
>
> LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
>
>   PID  PPID  PGRP COMMAND
>     1     0     0 init [3]
>     2     1     0 [ksoftirqd/0]
>     3     1     0 [events/0]
>     4     1     0 [khelper]
>     5     1     0 [kthread]
>     7     5     0  _ [kblockd/0]
>     8     5     0  _ [kacpid]
>    83     5     0  _ [pdflush]
>    84     5     0  _ [pdflush]
>    86     5     0  _ [aio/0]
>   669     5     0  _ [kseriod]
>   699     5     0  _ [scsi_eh_0]
>   744     5     0  _ [kpsmoused]
>  1377     5     0  _ [reiserfs/0]
>  1664     5     0  _ [khubd]
>  2567     5     0  _ [rpciod/0]
>    85     1     1 [kswapd0]
>   757     1     1 [kjournald]
>   768     1     1 [afs_rxlistener]
>   770     1     1 [afs_callback]
>   772     1     1 [afs_rxevent]
>   774     1     1 [afsd]
>   776     1     1 [afs_checkserver]
>   778     1     1 [afs_background]
>   780     1     1 [afs_background]
>   782     1     1 [afs_background]
>   784     1     1 [afs_cachetrim]
>   878     1   878 udevd
>  1387     1     1 [kjournald]
>  2526     1  2526 /sbin/portmap
>  2573     1     1 [lockd]
>  2795     1  2795 /sbin/syslog-ng -p /var/run/syslog-ng.pid
>  2829     1  2828 /usr/sbin/ypbind
>  2843     1  2843 /usr/sbin/inetd
>  2855     1  2855 lpd Waiting
>  2863     1  2863 /usr/sbin/rwhod -b
>  2864  2863  2863  _ /usr/sbin/rwhod -b
>  2912     1  2912 /usr/local/grid/sge6.0/bin/lx24-x86/sge_execd
>  3347  2912  3347  _ sge_shepherd-89 -bg
>  3387  3347  3387      _ -csh /usr/local/grid/sge6.0/frz/spool/ppc207/job_scripts/89
>  3392  3387  3387          _ ps -e f -o pid,ppid,pgrp,command
>  2923     1  2923 /usr/sbin/sshd
>  2932     1  2932 /usr/bin/X11/xfs -daemon
>  3034     1  2704 /bin/bash /etc/rc3.d/S20xprint start
>  3035  3034  2704  _ /bin/bash /etc/rc3.d/S20xprint start
>  3039  3035  2704  |   _ /usr/X11R6/bin/Xprt -ac -pn -nolisten tcp -audit 4 -fp /usr/X11R6/lib/X11/fonts/Type1,/usr/lib/X11/fonts/Type1,/usr/lib/X11/fonts/Type1/,/var/lib/defoma/x-ttcidfont-conf.d/dirs/TrueType,/usr/X11R6/lib/X11/fonts/100dpi,/usr/X11R6/lib/X11/fonts/75dpi,/usr/X11R6/lib/X11/fonts/misc,/usr/lib/X11/fonts/100dpi,/usr/lib/X11/fonts/100dpi/,/usr/lib/X11/fonts/75dpi,/usr/lib/X11/fonts/75dpi/,/usr/lib/X11/fonts/misc,/usr/lib/X11/fonts/misc/ :64
>  3040  3034  2704  _ /bin/bash /etc/rc3.d/S20xprint start
>  3048     1  3048 /usr/sbin/ntpd -p /var/run/ntpd.pid
>  3056     1  3056 /usr/sbin/atd
>  3060     1  3060 /usr/sbin/cron
>  3146     1  3146 /usr/bin/kdm
>  3150  3146  3150  _ /usr/X11R6/bin/X -nolisten tcp -auth /var/run/xauth/A:0-aFNR5J vt7
>  3257  3146  3146  _ -:0
>  3291  3257  3146      _ /usr/bin/kdm_greet
>  3159     1  3159 /sbin/getty 38400 tty1
>  3161     1  3161 /sbin/getty 38400 tty2
>  3162     1  3162 /sbin/getty 38400 tty3
>  3163     1  3163 /sbin/getty 38400 tty4
>  3164     1  3164 /sbin/getty 38400 tty5
>  3173     1  3173 /sbin/getty 38400 tty6
>  3382     1  3380 /usr/bin/lamd -H 141.35.13.107 -P 39867 -n 0 -o 0 -d -sessionsuffix sge-89-undefined
>  3391  3382  3380  _ /usr/bin/lamd -H 141.35.13.107 -P 39867 -n 0 -o 0 -d -sessionsuffix sge-89-undefined
>
>> Can you see anything in the output?
>>
>> Did you accidentally activate this option in qconf -mconf:
>>
>> execd_params ENABLE_ADDGRP_KILL
>>
>> which could break the forking of the processes into daemon-land?
>
> I have now activated it.
>
> Regards Joerg
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
