[GE users] Mvapich processes not killed on qdel

Brian R. Smith brs at usf.edu
Thu May 10 15:40:35 BST 2007



Reuti & Mike,

I dealt with mvapich and SGE tight integration (and eventually abandoned 
mvapich in favor of OpenMPI, which has wonderful SGE integration support 
and a much cooler plugin-based framework, IMHO).  The mpirun_rsh command 
is actually a C program in which the paths to rsh and ssh are 
hard-coded.  Because this code seems to blow away at least the PATH 
variable during execution, an exec() call to just "rsh" will fail since 
no search path will be defined (and hence attempts to set PATH in 
$SGE_ROOT/mpi/startmpi.sh will have no effect).  There was a patch 
floating around for a previous beta release, but it will not apply 
cleanly to the current release.  The file in question in version 0.9.8 is

mpid/ch_gen2/process/mpirun_rsh.c

Beginning on line 130, I believe, you will see

#define RSH_CMD "/usr/bin/rsh"
#define SSH_CMD "/usr/bin/ssh"

I looked around for fixes (as I said, you cannot just change these to 
"rsh" or "ssh"; it will fail), but as of a couple of weeks ago no one 
seems to have resolved this.  I hope this helps.
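
If you want to experiment, the idea that usually comes up is to make the
remote-shell command overridable at run time instead of compiled in.
Below is only a rough sketch of that idea - it is not the old patch, and
the MPIRUN_RSH_CMD variable and remote_shell() helper are names I am
making up for illustration.  Since mpirun_rsh seems to wipe PATH before
its exec() calls, any override would have to be an absolute path, e.g.
pointing at the rsh wrapper that SGE drops into $TMPDIR:

#include <stdio.h>
#include <stdlib.h>

#define RSH_CMD "/usr/bin/rsh"
#define SSH_CMD "/usr/bin/ssh"

/* Pick the remote-shell command: honour the (hypothetical) MPIRUN_RSH_CMD
 * override when it is set, otherwise fall back to the compiled-in
 * defaults that mpirun_rsh.c uses today. */
static const char *remote_shell(int use_ssh)
{
    const char *override = getenv("MPIRUN_RSH_CMD");
    if (override && *override)
        return override;    /* must be an absolute path; PATH is gone */
    return use_ssh ? SSH_CMD : RSH_CMD;
}

int main(void)
{
    /* Quick check of which command would be exec()'d. */
    printf("remote shell: %s\n", remote_shell(0));
    return 0;
}

Again, this only shows the shape of a possible fix; in the real
mpirun_rsh.c the override would still have to be threaded through to
wherever the exec() argument list is built.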

-Brian



Reuti wrote:
> Hi,
>
> Am 09.05.2007 um 21:53 schrieb Mike Hanby:
>
>> I created a simple helloworld job that prints a message and then sleeps
>> for 5 minutes. If I qdel the job after 1 minute, the job is removed from
>> the queue but remains running on the nodes for 4 more minutes. I'm using
>> rsh in this example; the ps info is below:
>
> but the processes are still not children of 
> sge_execd/sge_shepherd, so the rsh-wrapper isn't being used. Is the 
> path to the rsh binary hardcoded somewhere in your MPI scripts? 
> /usr/bin/rsh is mentioned there - can you change it somewhere to read 
> just rsh, so that the rsh-wrapper is used instead of the binary?
>
> -- Reuti
>
>
>> I submitted the job using the following job script:
>> #!/bin/bash
>> #$ -S /bin/bash
>> #$ -cwd
>> #$ -N TestMVAPICH
>> #$ -pe mvapich 4
>> #$ -v MPIR_HOME=/usr/local/topspin/mpi/mpich
>> #$ -v MPICH_PROCESS_GROUP=no
>> #$ -V
>> export MPI_HOME=/usr/local/topspin/mpi/mpich
>> export LD_LIBRARY_PATH=/usr/local/topspin/lib64:$MPI_HOME/lib64:$LD_LIBRARY_PATH
>> export PATH=$TMPDIR:$MPI_HOME/bin:$PATH
>> MPIRUN=${MPI_HOME}/bin/mpirun_rsh
>> $MPIRUN -rsh -np $NSLOTS -machinefile $TMPDIR/machines ./hello-mvapich
>>
>> This is the ps output on the node while the job is running in the queue:
>> $ ssh compute-0-7 "ps -e f -o pid,ppid,pgrp,command|grep myuser|grep -v
>> grep"
>>  1460  3611  1460  \_ sshd: myuser [priv]
>>  1464  1460  1460      \_ sshd: myuser at notty
>>   951   947   951  |   \_ bash -c cd /home/myuser/pmemdTest-mvapich;
>> /usr/bin/env MPIRUN_MPD=0 MPIRUN_HOST=compute-0-7.local
>> MPIRUN_PORT=32826
>> MPIRUN_PROCESSES='compute-0-7:compute-0-7:compute-0-7:compute-0-7:'
>> MPIRUN_RANK=0 MPIRUN_NPROCS=4 MPIRUN_ID=942      ./hello-mvapich
>>   954   948   954  |   \_ bash -c cd /home/myuser/pmemdTest-mvapich;
>> /usr/bin/env MPIRUN_MPD=0 MPIRUN_HOST=compute-0-7.local
>> MPIRUN_PORT=32826
>> MPIRUN_PROCESSES='compute-0-7:compute-0-7:compute-0-7:compute-0-7:'
>> MPIRUN_RANK=1 MPIRUN_NPROCS=4 MPIRUN_ID=942      ./hello-mvapich
>>   955   949   955  |   \_ bash -c cd /home/myuser/pmemdTest-mvapich;
>> /usr/bin/env MPIRUN_MPD=0 MPIRUN_HOST=compute-0-7.local
>> MPIRUN_PORT=32826
>> MPIRUN_PROCESSES='compute-0-7:compute-0-7:compute-0-7:compute-0-7:'
>> MPIRUN_RANK=2 MPIRUN_NPROCS=4 MPIRUN_ID=942      ./hello-mvapich
>>   966   950   966      \_ bash -c cd /home/myuser/pmemdTest-mvapich;
>> /usr/bin/env MPIRUN_MPD=0 MPIRUN_HOST=compute-0-7.local
>> MPIRUN_PORT=32826
>> MPIRUN_PROCESSES='compute-0-7:compute-0-7:compute-0-7:compute-0-7:'
>> MPIRUN_RANK=3 MPIRUN_NPROCS=4 MPIRUN_ID=942      ./hello-mvapich
>>   943   942   938              \_ /usr/bin/rsh compute-0-7 cd
>> /home/myuser/pmemdTest-mvapich; /usr/bin/env MPIRUN_MPD=0
>> MPIRUN_HOST=compute-0-7.local MPIRUN_PORT=32826
>> MPIRUN_PROCESSES='compute-0-7:compute-0-7:compute-0-7:compute-0-7:'
>> MPIRUN_RANK=0 MPIRUN_NPROCS=4 MPIRUN_ID=942      ./hello-mvapich
>>   944   942   938              \_ /usr/bin/rsh compute-0-7 cd
>> /home/myuser/pmemdTest-mvapich; /usr/bin/env MPIRUN_MPD=0
>> MPIRUN_HOST=compute-0-7.local MPIRUN_PORT=32826
>> MPIRUN_PROCESSES='compute-0-7:compute-0-7:compute-0-7:compute-0-7:'
>> MPIRUN_RANK=1 MPIRUN_NPROCS=4 MPIRUN_ID=942      ./hello-mvapich
>>   945   942   938              \_ /usr/bin/rsh compute-0-7 cd
>> /home/myuser/pmemdTest-mvapich; /usr/bin/env MPIRUN_MPD=0
>> MPIRUN_HOST=compute-0-7.local MPIRUN_PORT=32826
>> MPIRUN_PROCESSES='compute-0-7:compute-0-7:compute-0-7:compute-0-7:'
>> MPIRUN_RANK=2 MPIRUN_NPROCS=4 MPIRUN_ID=942      ./hello-mvapich
>>   946   942   938              \_ /usr/bin/rsh compute-0-7 cd
>> /home/myuser/pmemdTest-mvapich; /usr/bin/env MPIRUN_MPD=0
>> MPIRUN_HOST=compute-0-7.local MPIRUN_PORT=32826
>> MPIRUN_PROCESSES='compute-0-7:compute-0-7:compute-0-7:compute-0-7:'
>> MPIRUN_RANK=3 MPIRUN_NPROCS=4 MPIRUN_ID=942      ./hello-mvapich
>>
>> And the ps after I qdel the job
>> $ ssh compute-0-7 "ps -e f -o pid,ppid,pgrp,command|grep myuser|grep -v
>> grep"
>>  1735  3611  1735  \_ sshd: myuser [priv]
>>  1739  1735  1735      \_ sshd: myuser at notty
>>   951   947   951  |   \_ bash -c cd /home/myuser/pmemdTest-mvapich;
>> /usr/bin/env MPIRUN_MPD=0 MPIRUN_HOST=compute-0-7.local
>> MPIRUN_PORT=32826
>> MPIRUN_PROCESSES='compute-0-7:compute-0-7:compute-0-7:compute-0-7:'
>> MPIRUN_RANK=0 MPIRUN_NPROCS=4 MPIRUN_ID=942      ./hello-mvapich
>>   954   948   954  |   \_ bash -c cd /home/myuser/pmemdTest-mvapich;
>> /usr/bin/env MPIRUN_MPD=0 MPIRUN_HOST=compute-0-7.local
>> MPIRUN_PORT=32826
>> MPIRUN_PROCESSES='compute-0-7:compute-0-7:compute-0-7:compute-0-7:'
>> MPIRUN_RANK=1 MPIRUN_NPROCS=4 MPIRUN_ID=942      ./hello-mvapich
>>   955   949   955  |   \_ bash -c cd /home/myuser/pmemdTest-mvapich;
>> /usr/bin/env MPIRUN_MPD=0 MPIRUN_HOST=compute-0-7.local
>> MPIRUN_PORT=32826
>> MPIRUN_PROCESSES='compute-0-7:compute-0-7:compute-0-7:compute-0-7:'
>> MPIRUN_RANK=2 MPIRUN_NPROCS=4 MPIRUN_ID=942      ./hello-mvapich
>>   966   950   966      \_ bash -c cd /home/myuser/pmemdTest-mvapich;
>> /usr/bin/env MPIRUN_MPD=0 MPIRUN_HOST=compute-0-7.local
>> MPIRUN_PORT=32826
>> MPIRUN_PROCESSES='compute-0-7:compute-0-7:compute-0-7:compute-0-7:'
>> MPIRUN_RANK=3 MPIRUN_NPROCS=4 MPIRUN_ID=942      ./hello-mvapich
>>
>> -----Original Message-----
>> From: Mike Hanby [mailto:mhanby at uab.edu]
>> Sent: Wednesday, May 09, 2007 11:59
>> To: users at gridengine.sunsource.net
>> Subject: RE: [GE users] Mvapich processes not killed on qdel
>>
>> Hmm, I changed the mpirun command to mpirun_rsh -rsh and submitted the
>> job; it started and then failed with a bunch of connection-refused
>> errors. By default Rocks disables rsh.
>>
>> Does tight integration only work with rsh? If so, I'll see if I can get
>> that enabled and try again.
>>
>> -----Original Message-----
>> From: Reuti [mailto:reuti at staff.uni-marburg.de]
>> Sent: Wednesday, May 09, 2007 11:27
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] Mvapich processes not killed on qdel
>>
>> Hi,
>>
>> can you please post the process tree (master and slave) of a running
>> job on a node, using the ps command:
>>
>> ps -e f -o pid,ppid,pgrp,command
>>
>> Are you sure that the SGE rsh-wrapper is used, as you mentioned
>> mpirun_ssh?
>>
>> -- Reuti
>>
>>
>> Am 09.05.2007 um 17:43 schrieb Mike Hanby:
>>
>>> Howdy,
>>>
>>> I have GE 6.0u8 on a Rocks 4.2.1 cluster with Infiniband and the
>>> Topspin roll (which includes mvapich).
>>>
>>>
>>>
>>> When I qdel an mvapich job, the job is immediately removed from the
>>> queue; however, most of the processes on the nodes do not get
>>> killed. It appears that the mpirun_ssh process does get killed, but
>>> the actual job executables (sander.MPI) don't.
>>>
>>>
>>>
>>> I followed the directions for tight integration of Mvapich
>>>
>>> http://gridengine.sunsource.net/project/gridengine/howto/mvapich/MVAPICH_Integration.html
>>>
>>>
>>>
>>> The job runs fine, but again it doesn't kill off processes when
>>> qdel'd.
>>>
>>>
>>>
>>> Here's the pe:
>>>
>>> $ qconf -sp mvapich
>>> pe_name           mvapich
>>> slots             9999
>>> user_lists        NONE
>>> xuser_lists       NONE
>>> start_proc_args   /share/apps/gridengine/mvapich/startmpi.sh -catch_rsh \
>>>                   $pe_hostfile
>>> stop_proc_args    /share/apps/gridengine/mvapich/stopmpi.sh
>>> allocation_rule   $round_robin
>>> control_slaves    TRUE
>>> job_is_first_task FALSE
>>> urgency_slots     min
>>>
>>>
>>>
>>> The only modification made to the startmpi.sh script was to change
>>> the location of the hostname and rsh scripts from $SGE_ROOT to
>>> /share/apps/gridengine/mvapich
>>>
>>>
>>>
>>> Any suggestions on what I should look for?
>>>
>>>
>>>
>>> Thanks, Mike
>>>
>>>
>>>
>>>
>>
>


-- 
--------------------------------------------------------
+ Brian R. Smith                                       +
+ HPC Systems Analyst & Programmer                     +
+ Research Computing, University of South Florida      +
+ 4202 E. Fowler Ave. LIB618                           +
+ Office Phone: 1 (813) 974-1467                       +
+ Mobile Phone: 1 (813) 230-3441                       +
+ Organization URL: http://rc.usf.edu                  +
--------------------------------------------------------

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net




More information about the gridengine-users mailing list