[GE users] suspension under MPICH2 tight integration

Jason Crane jasonc at mrsc.ucsf.edu
Tue May 16 21:45:45 BST 2006


Hi,

I'm writing in regard to the "Tight Integration of the MPICH2 library
into SGE" doc
(http://gridengine.sunsource.net/howto/mpich2-integration/mpich2-integration.html).
I have 2 questions.

1. The MPICH2 user's guide documentation indicates that it is possible
to suspend and continue MPICH2 jobs, at least under mpd process
management (nothing explicit about smpd).  However, in a previous post
it was mentioned that MPI suspend isn't supported for slave tasks under
SGE because of timing problems:
(http://gridengine.sunsource.net/servlets/ReadMsg?listName=users&msgNo=15354)
If standalone MPICH2 suspension is supported, is the "timing problem"
introduced by the integration with SGE, or is it perhaps related
to using smpd?  Is there anything I need to worry about if I attempt to
implement a custom suspend/resume method for suspending slave tasks
under tight integration with the MPICH2 smpd daemonless parallel
environment?
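
As a rough illustration of the kind of custom suspend/resume method meant
here (a sketch under assumptions, not something taken from the howto): a
small script could be configured as the queue's suspend_method and
resume_method and deliver SIGSTOP or SIGCONT to a whole process group.
The names ACTIONS, signal_job and main are hypothetical, and the sketch
assumes the id handed to the script is a process group id. Whether slave
tasks started on other hosts are reached this way is exactly the open
question.

```python
import os
import signal

# Hypothetical mapping from the action name to the signal delivered.
ACTIONS = {"suspend": signal.SIGSTOP, "resume": signal.SIGCONT}


def signal_job(pgid, action, send=os.killpg):
    """Send the stop/continue signal for `action` to process group `pgid`.

    `send` defaults to os.killpg and can be replaced for a dry run.
    Returns the signal that was sent.
    """
    if action not in ACTIONS:
        raise ValueError("unknown action: %r" % (action,))
    sig = ACTIONS[action]
    send(pgid, sig)
    return sig


def main(argv):
    """Entry point sketch: argv is [script, 'suspend'|'resume', pgid]."""
    if len(argv) != 3:
        raise SystemExit("usage: %s suspend|resume <pgid>" % argv[0])
    signal_job(int(argv[2]), argv[1])
```

Such a script would be invoked with the action and the id supplied by the
queue configuration, for example by calling main(sys.argv) from a wrapper.
It only signals processes on the host where it runs, so each slave host
would need its own invocation.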

2. The MPICH2 tight integration document indicates that job accounting
is handled accurately under the daemon-based smpd startup method.
Is it also handled correctly under the daemonless method?

Thanks,
Jason

