[Mpi3-ft] Piggybacking API
Greg Bronevetsky
bronevetsky1 at llnl.gov
Mon Apr 21 11:33:56 CDT 2008
The following papers should clarify the motivation and the usage mechanism:
http://greg.bronevetsky.com/papers/2003PPoPP.pdf
http://sc07.supercomputing.org/schedule/pdf/pap224.pdf
The first describes a checkpointing protocol that was implemented on
top of MPI. The library intercepted all application MPI calls and
added appropriate piggybacking and message logging logic to ensure
that regardless of when each application process decided to take a
checkpoint, it would still be possible to put together a consistent
global state on restart. The user did not need to add any code.
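
To make the interception idea concrete, here is a minimal sketch in C of
what such a layer might do around a send and a receive: prepend a small
header to the outgoing payload and strip it again on arrival. Everything
named here (pb_header, pb_pack, pb_unpack, pb_classify, and the choice of
a checkpoint epoch as the piggybacked value) is a hypothetical
illustration. It is not the proposed piggybacking API and not the code
from either paper. No MPI calls are made; a byte buffer stands in for the
message on the wire.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical piggyback header: the sender's checkpoint epoch. */
typedef struct {
    uint32_t epoch;
} pb_header;

/* Illustrative classification of a message relative to the receiver. */
typedef enum { PB_LATE, PB_SAME_EPOCH, PB_EARLY } pb_class;

/* Frame a message as [header | payload] in `out`.
 * Returns the number of bytes written, or 0 if `out` is too small. */
size_t pb_pack(const pb_header *hdr, const void *payload, size_t len,
               void *out, size_t out_cap)
{
    assert(hdr != NULL && out != NULL);
    if (out_cap < sizeof *hdr || out_cap - sizeof *hdr < len)
        return 0;
    memcpy(out, hdr, sizeof *hdr);
    if (len > 0)
        memcpy((unsigned char *)out + sizeof *hdr, payload, len);
    return sizeof *hdr + len;
}

/* Split a framed message back into header and payload.
 * Returns 0 on success, -1 if the input is malformed or the
 * payload does not fit into `payload_cap` bytes. */
int pb_unpack(const void *in, size_t in_len, pb_header *hdr,
              void *payload, size_t payload_cap, size_t *payload_len)
{
    assert(in != NULL && hdr != NULL && payload_len != NULL);
    if (in_len < sizeof *hdr)
        return -1;
    size_t len = in_len - sizeof *hdr;
    if (len > payload_cap)
        return -1;
    memcpy(hdr, in, sizeof *hdr);
    if (len > 0)
        memcpy(payload, (const unsigned char *)in + sizeof *hdr, len);
    *payload_len = len;
    return 0;
}

/* Assumption for illustration: a message whose sender epoch is older
 * than the receiver's is "late", a newer one is "early". */
pb_class pb_classify(uint32_t sender_epoch, uint32_t receiver_epoch)
{
    if (sender_epoch < receiver_epoch) return PB_LATE;
    if (sender_epoch > receiver_epoch) return PB_EARLY;
    return PB_SAME_EPOCH;
}
```

In a PMPI-style layer, the send wrapper would call something like pb_pack
before handing the buffer to the underlying PMPI routine, and the receive
wrapper would call pb_unpack afterwards. The extra copy in this sketch is
arguably the kind of overhead that a piggybacking interface inside the
MPI implementation could avoid.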
The second paper describes a library called PnMPI that uses the PMPI
interface to make it possible to insert as many PMPI interception
layers as one would like, with the possibility of having them
interact and use each other's functionality. We expect that the
piggybacking interface will primarily be used by tools written at
this level, either through PnMPI or through plain PMPI. Our focus
here is on tool builders, not application programmers. We believe
that APIs like this benefit all application developers indirectly,
because they matter to tool builders whose tools any developer may
use.
In addition to checkpointing, piggybacking is useful for tracing
libraries, performance analysis tools and generic libraries that need
to lazily propagate information and would prefer not to introduce any
new communication for performance reasons.
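
As one example of lazy propagation, a tracing layer could keep a logical
clock per process and let it travel only on messages the application
already sends. The sketch below uses the standard Lamport clock update
rule. The names tracer, tracer_on_send and tracer_on_recv are
hypothetical, and how the value is actually attached to a message is left
out.

```c
#include <stdint.h>

/* Hypothetical per-process state of a tracing layer. */
typedef struct {
    uint64_t clock;
} tracer;

/* Called by the send wrapper: advance the local clock and return
 * the value to piggyback on the outgoing message. */
uint64_t tracer_on_send(tracer *t)
{
    t->clock += 1;
    return t->clock;
}

/* Called by the receive wrapper with the piggybacked value:
 * Lamport rule, clock = max(local, received) + 1. */
void tracer_on_recv(tracer *t, uint64_t piggybacked)
{
    uint64_t base = t->clock > piggybacked ? t->clock : piggybacked;
    t->clock = base + 1;
}
```

No message is sent on behalf of the tracer itself; a process that never
receives anything from a peer simply learns nothing about that peer's
clock until application traffic arrives.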
Greg Bronevetsky
Post-Doctoral Researcher
1028 Building 451
Lawrence Livermore National Lab
(925) 424-5756
bronevetsky1 at llnl.gov
At 06:36 AM 4/21/2008, Terry Dontje wrote:
>So I reread the piggybacking document on wiki. I am not thrilled with
>the number of new APIs this would be adding to the standard but can also
>see the point of the paper. I am curious how the new API is expected to
>be used. The proposal says this API is needed for user-level fault
>tolerance solutions. So do we expect a user to change all application
>calls to the MPI library to use the PB calls? I wonder if a more
>general solution that doesn't require a direct change to the API would work.
>
>I wonder if there might be a way one could register piggybacking with a
>communicator and somehow have the actual piggybacking occur as a
>callback from an implementation's messaging layer?
>
>Just a thought,
>
>--td
>_______________________________________________
>mpi3-ft mailing list
>mpi3-ft at lists.mpi-forum.org
>http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft