[Mpi3-ft] Communicator Virtualization as a step forward
Greg Bronevetsky
bronevetsky1 at llnl.gov
Wed Feb 18 10:29:04 CST 2009
>Thanks. How will you let the MPI know the checkpoint is coming, to
>give it a fair chance to prepare for this and then recover after the
>checkpoint? This is akin to the MPI_Finalize/MPI_Init in some sense,
>midway through the job, hence the analogy.
Just use the checkpointer-specific call. The call is going to have
checkpointer-specific semantics, so why not give it a
checkpointer-specific name? I understand that there is some value in
letting applications use the same name across all checkpointers,
but the bar for adding something to the standard should be higher
than that. Also, right now the whole approach inherently supports
only one checkpointing protocol: synch-and-stop. If we can work out a
more generic API that supports other protocols, I think it may have
enough value to be included in the spec. As it stands, the proposal
still hasn't cleared that bar.
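A minimal sketch of what "just use the checkpointer-specific call"
could look like inside an application's main loop. The name
fakeckpt_checkpoint is hypothetical: it stands in for whatever call a
particular checkpointing library provides, and no real library's API
is implied. MPI calls are omitted so the sketch stays self-contained.

```c
#include <assert.h>

/* HYPOTHETICAL stand-in for a checkpointer-specific call. A real
 * checkpointing library would supply its own function, with its own
 * name and semantics; this stub only counts how often it is invoked. */
static int fakeckpt_calls = 0;

static int fakeckpt_checkpoint(void)
{
    fakeckpt_calls++;
    return 0; /* 0 means success in this sketch */
}

/* Run `steps` iterations and request a checkpoint every `interval`
 * iterations. Returns the number of checkpoints that succeeded. */
int run_simulation(int steps, int interval)
{
    assert(interval > 0);

    int taken = 0;
    for (int step = 1; step <= steps; step++) {
        /* ... application work and MPI communication would go here ... */

        if (step % interval == 0) {
            /* The application names the checkpointer directly, so the
             * call carries that checkpointer's semantics and nothing
             * has to be added to the MPI standard. */
            if (fakeckpt_checkpoint() == 0)
                taken++;
        }
    }
    return taken;
}
```

The cost of this choice is the one conceded above: switching to a
different checkpointer means changing the call site, since the name
is not shared across checkpointers.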
Greg Bronevetsky
Post-Doctoral Researcher
1028 Building 451
Lawrence Livermore National Lab
(925) 424-5756
bronevetsky1 at llnl.gov
http://greg.bronevetsky.com