[GE users] lam/mpi sge and msc nastran integration

Reuti reuti at staff.uni-marburg.de
Tue Oct 21 22:33:49 BST 2008


On 21.10.2008 at 18:14, Olesen, Mark wrote:

>> The problem might be: which version of LAM/MPI was used to compile
>> Nastran? It has only been SGE-aware since 7.1.1. Maybe MSC used an
>> older version.
>
> According to your original post, you're using nastran2007 on Linux.
> From the MSC release notes, HP MPI 2.2.5 is the default mpi.
> I would strongly recommend that you use this.

Yes, it's working fine, and since 2.2.5 it also accepts a hostfile.
Before that, I always had to assemble the application file myself.
Just to note: we are using it with rsh and don't redirect anything in
SGE to use ssh.
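
For the older releases, the application file can be generated from the
hostfile that SGE hands to the job. A minimal sketch, assuming the usual
$PE_HOSTFILE layout (host name, slot count, queue, processor range on
each line) and HP-MPI appfile lines of the form "-h host -np N program";
the function name and the program arguments are only placeholders:

```shell
#!/bin/sh
# Sketch: turn an SGE PE hostfile into an HP-MPI style appfile.
# Usage: pe_hostfile_to_appfile <pe_hostfile> <program> [args...]
# (hypothetical helper, not part of SGE or HP-MPI)
pe_hostfile_to_appfile() {
    hostfile=$1
    shift
    # Each PE hostfile line starts with: <host> <slots> ...
    while read -r host slots _rest; do
        # skip empty lines
        [ -n "$host" ] || continue
        # one appfile line per host, with as many ranks as granted slots
        printf '%s\n' "-h $host -np $slots $*"
    done < "$hostfile"
}
```

Inside a job this would be called roughly as
pe_hostfile_to_appfile "$PE_HOSTFILE" ./solver > appfile
(./solver being a stand-in for the real binary), and the result handed
to mpirun -f appfile.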

-- Reuti


> Despite Reuti's valiant efforts, getting LAM/MPI working nicely is
> incredibly annoying - even if you don't have multiple types of software
> that use slightly different LAM/MPI installations, and even if your
> LAM/MPI version is ostensibly GridEngine-aware.
>
> We have both HP-MPICH (STAR-CD) and openmpi (OpenFOAM) in use and both
> work fine with the GridEngine. LAM/MPI was never anything but annoying
> to work with, and I'm glad not to touch it anymore.
>
> For HP-MPICH, our pe looks like this:
> pe_name           mpich
> slots             999
> user_lists        NONE
> xuser_lists       NONE
> start_proc_args   /opt/n1ge6/mpi/startmpi.sh -catch_rsh $pe_hostfile
> stop_proc_args    /opt/n1ge6/mpi/stopmpi.sh
> allocation_rule   $fill_up
> control_slaves    TRUE
> job_is_first_task FALSE
> urgency_slots     min
>
> I prefer not to rely on the rsh wrapper that is linked in by the
> -catch_rsh mechanism actually being seen first in the PATH. The STAR-CD
> scripts, for example, have their own rsh wrapper that is also supposed
> to be seen first.
>
> To be certain that qrsh is actually used by the MPICH transport, we
> export these explicitly in the job-scripts:
> # hp-mpi
> MPI_REMSH=$SGE_ROOT/mpi/rsh; export MPI_REMSH
> # mpich
> P4_RSHCOMMAND=$SGE_ROOT/mpi/rsh; export P4_RSHCOMMAND
>
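
Put together, a job script along these lines might look as follows. This
is only a sketch: the PE name mpich comes from the configuration quoted
above, while the slot count, the fallback value for SGE_ROOT and the
warning check are assumptions added for illustration.

```shell
#!/bin/sh
# Request 8 slots in the "mpich" PE; the slot count is an arbitrary example.
#$ -S /bin/sh
#$ -cwd
#$ -pe mpich 8

# SGE sets SGE_ROOT inside a job; the fallback is an assumed install path.
SGE_ROOT=${SGE_ROOT:-/opt/n1ge6}

# hp-mpi: remote shell used to start ranks on the other nodes
MPI_REMSH=$SGE_ROOT/mpi/rsh; export MPI_REMSH
# mpich (ch_p4 device): same wrapper
P4_RSHCOMMAND=$SGE_ROOT/mpi/rsh; export P4_RSHCOMMAND

# Warn early if the wrapper is missing, since the remote start would
# otherwise not go through qrsh.
if [ ! -x "$MPI_REMSH" ]; then
    echo "warning: $MPI_REMSH not found or not executable" >&2
fi

# ... the actual mpirun / solver call would follow here ...
```

Setting both variables costs nothing, so the same header can be reused
for HP-MPI and MPICH jobs.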
>
>
> And in the GridEngine configuration we have
>   rsh_daemon                   /usr/sbin/sshd -i
>   rsh_command                  /usr/bin/ssh
>   rlogin_command               /usr/bin/ssh
>
>
> Even if you don't care about correct accounting, the tight integration
> via qrsh is important if you want to avoid leaving zombie processes
> when you use qdel to kill parallel jobs.
>
> /mark
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>
