[Mpi3-ft] MPI Fault Tolerance scenarios

Greg Bronevetsky bronevetsky1 at llnl.gov
Thu Feb 26 13:55:20 CST 2009


At 09:42 AM 2/26/2009, Erez Haba wrote:
>I might have missed it, but I don't recall saying that we'll use 
>MPI_Comm_spawn to restart a process. As I recall, there are two 
>flavors of MPI_Comm_repair: one that restarts a process and the other 
>that fixes the communicator by leaving a hole. Rich, for now I'd 
>like us to be explicit about restart vs. fixing (rather than using a 
>policy). So I suggest that we call the APIs MPI_Comm_restart_rank 
>and MPI_Comm_remove_rank (and MPI_Comm_Irestart_rank; no need for 
>Iremove, as it's a local operation). I'll change the sample code to 
>reflect that. Thoughts? Thanks, .Erez

Since right now we're being very explicit about the API, having two 
calls makes sense.
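
To make the restart vs. remove distinction concrete, here is a toy sketch in 
plain C. It is not MPI code: neither call exists in the MPI standard, no 
signatures have been agreed on in this thread, and every sketch_* name, type, 
and return code below is a hypothetical stand-in. It only models the intended 
semantics: restart brings a failed rank back to life, while remove leaves a 
hole and keeps the communicator size unchanged.

```c
#include <assert.h>

#define SKETCH_MAX_RANKS 8

/* Hypothetical state of one rank slot in a toy communicator. */
typedef enum { RANK_ALIVE, RANK_FAILED, RANK_HOLE } rank_state;

/* Toy stand-in for a communicator (an assumption, not an MPI_Comm). */
typedef struct {
    int size;                            /* number of rank slots; never shrinks */
    rank_state state[SKETCH_MAX_RANKS];
} sketch_comm;

/* Build a toy communicator in which every rank is alive. */
sketch_comm sketch_comm_create(int size) {
    assert(size > 0 && size <= SKETCH_MAX_RANKS);
    sketch_comm c;
    c.size = size;
    for (int i = 0; i < SKETCH_MAX_RANKS; i++) {
        c.state[i] = (i < size) ? RANK_ALIVE : RANK_HOLE;
    }
    return c;
}

/* Simulate the failure of one rank. */
void sketch_mark_failed(sketch_comm *c, int rank) {
    assert(rank >= 0 && rank < c->size);
    c->state[rank] = RANK_FAILED;
}

/* "Restart" flavor: the failed rank becomes a live rank again.
 * Returns 0 on success, -1 if the rank was not in the failed state. */
int sketch_comm_restart_rank(sketch_comm *c, int rank) {
    assert(rank >= 0 && rank < c->size);
    if (c->state[rank] != RANK_FAILED) return -1;
    c->state[rank] = RANK_ALIVE;
    return 0;
}

/* "Remove" flavor: the failed rank becomes a permanent hole.
 * Only local bookkeeping changes; the other ranks keep their numbers.
 * Returns 0 on success, -1 if the rank was not in the failed state. */
int sketch_comm_remove_rank(sketch_comm *c, int rank) {
    assert(rank >= 0 && rank < c->size);
    if (c->state[rank] != RANK_FAILED) return -1;
    c->state[rank] = RANK_HOLE;
    return 0;
}

/* Count the ranks that can still communicate. */
int sketch_live_count(const sketch_comm *c) {
    int n = 0;
    for (int i = 0; i < c->size; i++) {
        if (c->state[i] == RANK_ALIVE) n++;
    }
    return n;
}
```

In this model, restarting one failed rank and removing another in a four-rank 
communicator leaves the size at four with three live ranks, so surviving ranks 
never get renumbered. A nonblocking Irestart variant is not modeled here.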

Greg Bronevetsky
Post-Doctoral Researcher
1028 Building 451
Lawrence Livermore National Lab
(925) 424-5756
bronevetsky1 at llnl.gov
http://greg.bronevetsky.com 