[mpi3-ft] FW: Starting slides for 2/1/2008 telecon

Greg Bronevetsky bronevetsky1 at llnl.gov
Wed Jan 30 02:17:11 CST 2008

Two comments:

We may want to add the capability to spawn new processes and give 
them the ranks of the failed processes. This is more efficient than 
pre-allocating enough spare processes as part of the original job 
allocation, so it might be a good idea to include it in the spec.

I disagree with the comments about MPI quieting the communication 
system because this presumes that the application will use the 
trivial sync-and-stop CPR protocol. That may be the case, but we 
shouldn't write this assumption into the spec. We should probably 
restrict ourselves to only saying that no message may get partially 
delivered since such messages would be very hard to deal with above 
the MPI library.

Greg Bronevetsky
Post-Doctoral Researcher
1028 Building 451
Lawrence Livermore National Lab
(925) 424-5756
bronevetsky1 at llnl.gov

At 10:12 AM 1/29/2008, Richard Graham wrote:
>This did not seem to make it through the first time, so let me try again.
>------ Forwarded Message
>From: Richard Graham <rlgraham at ornl.gov>
>Date: Tue, 29 Jan 2008 10:55:11 -0500
>To: Discussion of MPI 3 Fault Tolerance Support <mpi3-ft at cs.uiuc.edu>
>Conversation: Starting slides for 2/1/2008 telecon
>Subject: Starting slides for 2/1/2008 telecon
>Attached is a set of slides I intend to use as a starting point for the
>telecon this coming Friday.  If you are planning on attending, please take a
>look at these, and see what is missing.  The main goal for this call is to
>help set the scope of the problem for which we intend to propose changes to
>the MPI standard.
>------ End of Forwarded Message
>mpi3-ft mailing list
>mpi3-ft at cs.uiuc.edu
