[GE users] lam tight and host names

Davide Cittaro davide.cittaro at ifom-ieo-campus.it
Wed Feb 28 16:27:04 GMT 2007


Live running:


$ cat mpitest.sh
#!/bin/sh

#$ -S /bin/bash
#$ -cwd
#$ -q all.q@@ia32
#$ -N MPIHELLO

mpirun C ./mpihello


$ qsub -pe lam_mpi 10 ./mpitest.sh
Your job 3179 ("MPIHELLO") has been submitted

$ qstat -f -ne -g t
queuename                      qtype used/tot. load_avg arch          states
----------------------------------------------------------------------------
all.q@ia32.bioinfo.ifom-ieo-ca BIP   1/2       0.00     lx26-x86
    3182 1.05500 MPIHELLO   dcittaro     r     02/28/2007 17:18:46 SLAVE
----------------------------------------------------------------------------
all.q@node1.xtal.ifom-ieo-camp BIP   1/4       0.00     lx26-x86
    3182 1.05500 MPIHELLO   dcittaro     r     02/28/2007 17:18:46 SLAVE
----------------------------------------------------------------------------
all.q@node2.xtal.ifom-ieo-camp BIP   2/4       0.00     lx26-x86
    3182 1.05500 MPIHELLO   dcittaro     r     02/28/2007 17:18:46 MASTER
                                                                   SLAVE
                                                                   SLAVE
----------------------------------------------------------------------------
all.q@node3.xtal.ifom-ieo-camp BIP   2/4       0.00     lx26-x86
    3182 1.05500 MPIHELLO   dcittaro     r     02/28/2007 17:18:46 SLAVE
                                                                   SLAVE
----------------------------------------------------------------------------
all.q@node4.xtal.ifom-ieo-camp BIP   1/4       0.00     lx26-x86
    3182 1.05500 MPIHELLO   dcittaro     r     02/28/2007 17:18:46 SLAVE
----------------------------------------------------------------------------
all.q@node5.xtal.ifom-ieo-camp BIP   2/4       0.00     lx26-x86
    3182 1.05500 MPIHELLO   dcittaro     r     02/28/2007 17:18:46 SLAVE
                                                                   SLAVE
----------------------------------------------------------------------------
all.q@special.xtal.ifom-ieo-ca BIP   1/4       0.00     lx26-x86
    3182 1.05500 MPIHELLO   dcittaro     r     02/28/2007 17:18:46 SLAVE


$ for i in ia32 xnode1 xnode2 xnode3 xnode4 xnode5 xspecial
 > do
 > rsh $i ps -e f -o pid,ppid,pgrp,command --cols=80
 > done

[ia32.bioinfo]
   PID  PPID  PGRP COMMAND
<snip>
4575     1  4575 /opt/sge/bin/lx26-x86/sge_execd
4684     1  4684 /usr/sbin/sshd
5502  4684  5502  \_ sshd: dcittaro [priv]
5508  5502  5502      \_ sshd: dcittaro@pts/0
5509  5508  5509          \_ -localshell
5870  5509  5870              \_ rsh ia32 ps -e f -o pid,ppid,pgrp,command --co
5873  5870  5870                  \_ rsh ia32 ps -e f -o pid,ppid,pgrp,command
4756     1  4756 /usr/sbin/cron
4826     1  4826 /usr/sbin/xinetd -pidfile /var/run/xinetd.pid -stayalive -reus
5871  4826  5871  \_ in.rshd
5872  5871  5872      \_ ps -e f -o pid,ppid,pgrp,command --cols=80
<snip>
5839     1  5839 lamd_binary -H 85.239.175.22 -P 51184 -n 5 -o 0 -sessionsuffix

[node1.xtal]
   PID  PPID  PGRP COMMAND
<snip>
5305     1  5305 /opt/sge/bin/lx26-x86/sge_execd
6582     1  6582 lamd_binary -H 85.239.175.37 -P 45425 -n 9 -o 0 -sessionsuffix
6593     1  6593 lamd_binary -H 85.239.175.22 -P 51184 -n 6 -o 0 -sessionsuffix

[node2.xtal]
   PID  PPID  PGRP COMMAND
<snip>
4752     1  4752 /opt/sge/bin/lx26-x86/sge_execd
5753  4752  5753  \_ sge_shepherd-3182 -bg
5912  5753  5912      \_ bash /opt/sge/bioinfo/spool/node2/job_scripts/3182
5913  5912  5912          \_ mpirun C ./mpihello
5815     1  5815 lamd_binary -H 85.239.175.22 -P 51184 -n 0 -o 0 -sessionsuffix
5914  5815  5815  \_ ./mpihello
5915  5815  5815  \_ ./mpihello

[node3.xtal]
   PID  PPID  PGRP COMMAND
  <snip>
4764     1  4764 /opt/sge/bin/lx26-x86/sge_execd
5778     1  5778 lamd_binary -H 85.239.175.25 -P 32906 -n 1 -o 0 -sessionsuffix

[node4.xtal]
   PID  PPID  PGRP COMMAND
<snip>
4754     1  4754 /opt/sge/bin/lx26-x86/sge_execd
5828     1  5828 lamd_binary -H 85.239.175.37 -P 45425 -n 7 -o 0 -sessionsuffix

[node5.xtal]
   PID  PPID  PGRP COMMAND
  <snip>
5184     1  5184 /opt/sge/bin/lx26-x86/sge_execd

[special.xtal]
   PID  PPID  PGRP COMMAND
<snip>
4816     1  4816 /opt/sge/bin/lx26-x86/sge_execd
6029     1  6029 lamd_binary -H 85.239.175.37 -P 45425 -n 8 -o 0 -sessionsuffix
6042     1  6042 lamd_binary -H 85.239.175.22 -P 51184 -n 4 -o 0 -sessionsuffix


Among other things (for instance, no mpihello is running on the SLAVE
nodes; both processes sit on node2.xtal), there is no lamd at all on
node5.xtal, although qstat says two slots should be filled there...
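
To make the mismatch easier to spot, here is a minimal sketch that tallies
lamd processes per host. It assumes the ps listings are concatenated into one
stream with a "[hostname]" line in front of each, as in the output above. The
function name count_lamd is hypothetical, not part of SGE or LAM.

```shell
#!/bin/sh
# count_lamd (hypothetical helper): read concatenated ps listings on stdin,
# where each host's section starts with a "[hostname]" line, and print
# "hostname count" for the number of lines mentioning lamd in that section.
count_lamd() {
    awk '
        # A "[hostname]" line opens a new section.
        /^\[.*\]$/ {
            host = substr($0, 2, length($0) - 2)
            seen[host] = 0
            order[++n] = host
            next
        }
        # Any later line containing "lamd" counts for the current host.
        host != "" && /lamd/ { seen[host]++ }
        END {
            for (i = 1; i <= n; i++)
                printf "%s %d\n", order[i], seen[order[i]]
        }
    '
}
```

It could be fed from the same rsh loop used above, printing a header before
each host, e.g. `for i in ia32 xnode1 xnode2; do echo "[$i]"; rsh $i ps -e -o pid,command; done | count_lamd`.
A host that holds slots in qstat but shows a count of 0 here is one where
lamboot never started a daemon.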

d


/*
Davide Cittaro
HPC and Bioinformatics Systems @ Informatics Core

IFOM - Istituto FIRC di Oncologia Molecolare
via adamello, 16
20139 Milano
Italy

tel.: +39(02)574303007
e-mail: davide.cittaro at ifom-ieo-campus.it
*/




