[GE users] jobs never die on nodes with mpich

Michel Cuendet michel.cuendet at epfl.ch
Thu Aug 12 15:47:32 BST 2004


Hi,

It's a bit of everything...

On the master node:

root      2432  0.0  0.0  3904  636 ?        S    Jul19   4:06 /opt/sge/bin/glinux/sge_commd
sgeadmin  2434  0.2  0.1  5860 2328 ?        S<   Jul19  70:55 /opt/sge/bin/glinux/sge_execd
sgeadmin 26839  0.0  0.0  1852  668 ?        S<   12:48   0:00  \_ sge_shepherd-5107 -bg
mitch    27006  0.0  0.0  2056  960 ?        S    12:48   0:00  |   \_ bash /opt/sge/default/spool/node27/job_scripts/5107
mitch    27009  0.0  0.1  4300 2728 ?        S    12:48   0:00  |       \_ perl -S -w /opt/mpich/1.2.5..10/ia32/ic71/bin/mpirun.ch_gm
mitch    27038  0.0  0.1  4372 2796 ?        S    12:48   0:00  |           \_ perl -S -w /opt/mpich/1.2.5..10/ia32/ic71/bin/mpirun.c
mitch    27039  0.0  0.0  2472 1000 ?        S    12:48   0:00  |           \_ /opt/sge/bin/glinux/qrsh -inherit node27 cd /home/mitc
mitch    27188  0.0  0.0  1548  584 ?        S    12:48   0:00  |           |   \_ /opt/sge/utilbin/glinux/rsh -p 42579 node27.cluste
mitch    27203  0.0  0.0     0    0 ?        Z    12:48   0:00  |           |       \_ [rsh <defunct>]
mitch    27040  0.0  0.0  2464 1000 ?        S    12:48   0:00  |           \_ /opt/sge/bin/glinux/qrsh -inherit -nostdin node26 cd /
mitch    27200  0.0  0.0  1544  580 ?        S    12:48   0:00  |           |   \_ /opt/sge/utilbin/glinux/rsh -n -p 41869 node26.clu
mitch    27041  0.0  0.0  2464 1000 ?        S    12:48   0:00  |           \_ /opt/sge/bin/glinux/qrsh -inherit -nostdin node18 cd /
mitch    27201  0.0  0.0  1544  576 ?        S    12:48   0:00  |           |   \_ /opt/sge/utilbin/glinux/rsh -n -p 41927 node18.clu
sgeadmin 27179  0.0  0.0  1856  664 ?        S<   12:48   0:00  \_ sge_shepherd-5107 -bg
root     27180  0.0  0.0  1864  596 ?        S    12:48   0:00      \_ /opt/sge/utilbin/glinux/rshd -l
mitch    27202  0.0  0.0  1640  428 ?        S    12:48   0:00          \_ /opt/sge/utilbin/glinux/qrsh_starter /imports/sge/default/
mitch    27257 93.6 10.8 345140 224516 ?     R    12:48 221:57              \_ /home/mitch/QMMM_38/cpmd.x input /home/mitch/PP
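
The `[rsh <defunct>]` entry in the listing above is a zombie (state `Z`). A minimal Python sketch for picking such entries out of `ps` output, assuming the columns come from `ps -eo pid,ppid,stat,comm` on Linux; the function name `find_zombies` is made up for illustration:

```python
import subprocess


def find_zombies(ps_text):
    """Return (pid, ppid, command) for each line whose STAT starts with 'Z'."""
    zombies = []
    for line in ps_text.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 3)       # pid, ppid, stat, rest of the line
        if len(fields) < 4:
            continue
        pid, ppid, stat, comm = fields
        if stat.startswith("Z"):
            zombies.append((int(pid), int(ppid), comm.strip()))
    return zombies


if __name__ == "__main__":
    # Assumption: a procps-style ps that accepts -eo with these column names.
    out = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid, ppid, comm in find_zombies(out):
        print(f"zombie {pid} (parent {ppid}): {comm}")
```

The parent PID is reported because a zombie only goes away once its parent reaps it or exits.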


Bogdan Costescu wrote:

>On Thu, 12 Aug 2004, Reuti wrote:
>
>>All is working fine, as long as the process started by qrsh on the
>>slave is just one command/program only. ... (I read the hint on the
>>list, that the created bash has a new process group).
>
>Does SGE run as a user or as root?
>When running as root, SGE should use the additional group feature to
>identify processes started by itself, and this cannot be circumvented
>by a user program, as only root can set such additional groups.
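
To check which processes carry a given additional group ID, the `Groups:` line of `/proc/<pid>/status` can be scanned. A rough Python sketch, assuming a Linux `/proc` layout; the function name `pids_with_group`, the `proc_root` parameter and the GID in the usage line are illustrative and not taken from SGE, whose own bookkeeping may work differently:

```python
import os


def pids_with_group(gid, proc_root="/proc"):
    """Return the sorted PIDs whose supplementary groups include gid."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():  # only numeric entries are processes
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                for line in fh:
                    if line.startswith("Groups:"):
                        groups = {int(g) for g in line.split()[1:]}
                        if gid in groups:
                            found.append(int(entry))
                        break
        except OSError:
            # The process exited while we were reading, or is not readable.
            continue
    return sorted(found)


if __name__ == "__main__":
    print(pids_with_group(20000))  # 20000 is a made-up GID for illustration
```

Because a new process group or session does not change the supplementary groups, a lookup like this would still find children that detached from the original process group.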


