[Mpi3-ft] Piggybacking API

Greg Bronevetsky bronevetsky1 at llnl.gov
Mon Apr 21 11:33:56 CDT 2008


The following papers should clarify the motivation and the usage mechanism:
http://greg.bronevetsky.com/papers/2003PPoPP.pdf
http://sc07.supercomputing.org/schedule/pdf/pap224.pdf

The first describes a checkpointing protocol that was implemented on 
top of MPI. The library intercepted all application MPI calls and 
added appropriate piggybacking and message logging logic to ensure 
that regardless of when each application process decided to take a 
checkpoint, it would still be possible to put together a consistent 
global state on restart. The user did not need to add any code.
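
As a rough illustration of the kind of logic such an interception layer
adds, the sketch below simulates wrapped send and receive calls in plain
Python (no MPI involved). It assumes, purely for illustration, that the
piggybacked datum is an epoch number counting the checkpoints a process
has taken. The names Process, Envelope and late_log are hypothetical, and
the classification shown is a simplification, not necessarily the exact
protocol from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Envelope:
    """A message plus the data piggybacked on it (hypothetical layout)."""
    payload: object
    sender_epoch: int  # assumption: the piggybacked datum is the sender's epoch


@dataclass
class Process:
    """Stand-in for one application process behind the interception layer."""
    rank: int
    epoch: int = 0                      # number of local checkpoints taken so far
    late_log: list = field(default_factory=list)

    def take_checkpoint(self):
        # Each process checkpoints whenever it likes; only the epoch advances.
        self.epoch += 1

    def send(self, payload):
        # Wrapper around the application's send: attach the current epoch.
        return Envelope(payload, self.epoch)

    def recv(self, env):
        # Wrapper around the application's receive: compare epochs to decide
        # how the message relates to the two processes' checkpoints.
        if env.sender_epoch < self.epoch:
            kind = "late"               # sent before sender's checkpoint,
            self.late_log.append(env.payload)  # received after ours: log it
        elif env.sender_epoch > self.epoch:
            kind = "early"              # sender has checkpointed, we have not
        else:
            kind = "intra-epoch"
        return env.payload, kind
```

The application code only ever sees the payload. The epoch travels with
the message, so no extra communication is needed to detect messages that
cross a checkpoint.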

The second paper describes a library called PnMPI that uses the PMPI 
interface to make it possible to insert as many PMPI interception 
layers as one would like, with the possibility of having them 
interact and use each other's functionality. We expect that the 
piggybacking interface will primarily be used by tools written at 
this level, using tools like PnMPI or just plain PMPI. In particular, 
our focus here is on tool builders, not application programmers. We 
believe that APIs like this will nonetheless benefit all application 
developers, simply because they are important to tool builders whose 
tools may be used by any developer.
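
To make the layering idea concrete, here is a minimal sketch in plain
Python (no MPI or PnMPI involved) of several interception layers stacked
on one send call, each adding its own piggybacked datum. The names
base_send and make_layer, and the dictionary used to carry piggybacked
data, are assumptions made for illustration. They are not the PnMPI or
PMPI interfaces.

```python
def base_send(dest, payload, piggyback):
    """Stand-in for the underlying transport (hypothetical, not a real MPI call)."""
    return {"dest": dest, "payload": payload, "piggyback": dict(piggyback)}


def make_layer(name, value_fn, next_send):
    """Build an interception layer that adds one piggybacked entry under `name`
    and then forwards the call to the next layer down."""
    def layer_send(dest, payload, piggyback=None):
        pb = dict(piggyback or {})   # copy so layers never share mutable state
        pb[name] = value_fn()        # this layer's contribution
        return next_send(dest, payload, pb)
    return layer_send


# Two independent tools stacked on the same send path.
checkpoint_send = make_layer("ckpt_epoch", lambda: 3, base_send)
traced_send = make_layer("trace_id", lambda: "t-17", checkpoint_send)

message = traced_send(1, "hello")
```

Here message carries both tools' data alongside the unchanged payload,
and neither tool needs to know the other exists.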

In addition to checkpointing, piggybacking is useful for tracing 
libraries, performance analysis tools, and generic libraries that need 
to lazily propagate information and, for performance reasons, would 
prefer not to introduce any new communication.

Greg Bronevetsky
Post-Doctoral Researcher
1028 Building 451
Lawrence Livermore National Lab
(925) 424-5756
bronevetsky1 at llnl.gov


At 06:36 AM 4/21/2008, Terry Dontje wrote:
>So I reread the piggybacking document on the wiki.  I am not thrilled with
>the number of new APIs this would add to the standard but can also
>see the point of the paper.  I am curious how the new API is expected to
>be used.  The proposal says this API is needed for user-level fault
>tolerance solutions.  So do we expect a user to change all application
>calls to the MPI library to use the PB calls?  I wonder if a more
>general solution that doesn't require a direct change to the API would work.
>
>I wonder if there might be a way one could register piggybacking with a
>communicator and somehow have the actual piggybacking occur as a
>callback from an implementation's messaging layer?
>
>Just a thought,
>
>--td



