[GE users] Intel MPI 3.1 / MPICH2 tight integration

reuti reuti at staff.uni-marburg.de
Mon Nov 10 14:11:43 GMT 2008


Hi all,

Please find the archive with the mpd integration for MPICH(2) now at
http://gridengine.sunsource.net/howto/mpich2-integration/mpich2-60.tgz

The Howto page still needs to be updated; it doesn't yet mention that
the mpd startup works.

As I said, I don't know whether it will fit 1:1 for Intel MPI, as I
don't use it.

Technical background: mpdboot can't be used, as it wouldn't allow
forking the qrsh to the slave nodes. It needs the communication back
via stdout before it can start up. So I launch the mpds directly,
which is similar to the daemon-based smpd startup. All mpds are still
attached to the shepherd. I tested it in 6.2 with both the new builtin
and the traditional qrsh. With the traditional qrsh, the MPI program
sometimes isn't killed when you issue a qdel (although all the Python
scripts are), even though the additional group id is clearly attached.
I don't know why, as a simple kill afterwards works instantly.
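
To illustrate the direct launch described above, here is a minimal
sketch in Python. It is not the code from the archive. It only builds
the command lines that a start script might issue, one "qrsh -inherit"
per slave node taken from $PE_HOSTFILE. The function names, the port
number and the mpd options (-h, -p, -n) are assumptions for
illustration; check them against your mpd version.

```python
def parse_pe_hostfile(text):
    """Return the host names from $PE_HOSTFILE content (first column)."""
    hosts = []
    for line in text.splitlines():
        fields = line.split()
        if fields:
            hosts.append(fields[0])
    return hosts


def mpd_commands(hosts, master, port, mpd="mpd"):
    """Build one command per slave host; the master is skipped here."""
    cmds = []
    for host in hosts:
        if host == master:
            continue
        # qrsh -inherit starts the task inside the already running job,
        # so the mpd remains a child of the shepherd on that node.
        # ASSUMPTION: mpd accepts -h (host), -p (port), -n (no console).
        cmds.append(["qrsh", "-inherit", host, mpd,
                     "-h", master, "-p", str(port), "-n"])
    return cmds


# Hypothetical two-node hostfile content, for illustration only.
sample = ("node01 4 all.q@node01 UNDEFINED\n"
          "node02 4 all.q@node02 UNDEFINED\n")
hosts = parse_pe_hostfile(sample)
cmds = mpd_commands(hosts, master="node01", port=12345)
```

Each entry in cmds could then be handed to subprocess.Popen. The host
names must match exactly, so a hostfile with fully qualified names needs
a fully qualified master name as well.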

Be sure to set "execd_params     ENABLE_ADDGRP_KILL=TRUE", as the
processes still jump out of the process tree and can only be
killed via the additional group id.
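
A quick way to verify the setting is to look at the output of
"qconf -sconf". The following sketch checks such output for the
parameter. The function name is hypothetical, and it assumes that
execd_params sits on a single line (no backslash continuation).

```python
def addgrp_kill_enabled(sconf_text):
    """Return True if execd_params contains ENABLE_ADDGRP_KILL=TRUE."""
    for line in sconf_text.splitlines():
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[0] == "execd_params":
            # Several parameters may be listed; accept comma or
            # whitespace as separator (an assumption of this sketch).
            params = fields[1].replace(",", " ").split()
            return any(p.upper() == "ENABLE_ADDGRP_KILL=TRUE"
                       for p in params)
    return False
```

The text can come from subprocess.run(["qconf", "-sconf"],
capture_output=True, text=True).stdout on a submit or admin host.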

-- Reuti

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=88384
