No subject


Wed Jan 12 20:38:46 GMT 2011


I would strongly recommend that you use this.

Despite Reuti's valiant efforts, getting LAM/MPI working nicely is
incredibly annoying - even if you don't have multiple types of software
that use slightly different LAM/MPI installations, and even if your
LAM/MPI version is ostensibly GridEngine-aware.

We have both HP-MPICH (STAR-CD) and openmpi (OpenFOAM) in use and both
work fine with the GridEngine. LAM/MPI was never anything but annoying
to work with, and I'm glad not to touch it anymore.

For HP-MPICH, our pe looks like this:
pe_name           mpich
slots             999
user_lists        NONE
xuser_lists       NONE
start_proc_args   /opt/n1ge6/mpi/startmpi.sh -catch_rsh $pe_hostfile
stop_proc_args    /opt/n1ge6/mpi/stopmpi.sh
allocation_rule   $fill_up
control_slaves    TRUE
job_is_first_task FALSE
urgency_slots     min
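
To illustrate what the start script receives through $pe_hostfile, here is a minimal sketch (not the actual startmpi.sh) that expands such a file into a one-hostname-per-slot machines list. It assumes the usual pe_hostfile layout of "host slots queue processor-range" per line; the function name pe_hostfile_to_machines is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of GridEngine: print each host from a
# pe_hostfile once per granted slot.
# Assumed input format, one line per host:
#   <hostname> <slots> <queue> <processor-range>
pe_hostfile_to_machines() {
    # $1 = path to the pe_hostfile; field 1 is the host, field 2 the slot count
    awk '{ for (i = 0; i < $2; i++) print $1 }' "$1"
}
```

Inside a parallel job this could be called as: pe_hostfile_to_machines "$PE_HOSTFILE" > "$TMPDIR/machines"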

I prefer not to rely on the rsh wrapper that is linked in by the
-catch_rsh mechanism actually being seen first in the PATH. The STAR-CD
scripts, for example, have their own rsh wrapper that is also supposed
to be seen first.

To be certain that qrsh is actually used by the MPICH transport, we
export these explicitly in the job-scripts:
# hp-mpi
MPI_REMSH=$SGE_ROOT/mpi/rsh; export MPI_REMSH
# mpich
P4_RSHCOMMAND=$SGE_ROOT/mpi/rsh; export P4_RSHCOMMAND
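
The same exports can be wrapped in a small function near the top of a job script, so that a missing SGE_ROOT or wrapper is noticed early. This is only a sketch; the function name use_sge_rsh and the warning text are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: point both MPI transports at the GridEngine rsh wrapper.
use_sge_rsh() {
    if [ -z "${SGE_ROOT:-}" ]; then
        echo "SGE_ROOT is not set" >&2
        return 1
    fi
    wrapper=$SGE_ROOT/mpi/rsh
    # Warn, but carry on, if the wrapper is missing on this host.
    [ -x "$wrapper" ] || echo "warning: $wrapper not found or not executable" >&2
    # hp-mpi
    MPI_REMSH=$wrapper; export MPI_REMSH
    # mpich
    P4_RSHCOMMAND=$wrapper; export P4_RSHCOMMAND
}
```

In the job script, call it before starting the solver: use_sge_rsh || exit 1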



And in the GridEngine configuration we have
  rsh_daemon                   /usr/sbin/sshd -i
  rsh_command                  /usr/bin/ssh
  rlogin_command               /usr/bin/ssh


Even if you don't care about correct accounting, the tight integration
via qrsh is important if you want to avoid leaving zombie processes when
you use qdel to kill parallel jobs.

/mark

