[Mpi3-ft] Summary of today's meeting

Thu Oct 23 15:41:42 CDT 2008

>Even in the application-directed case, most MPI stacks would still 
>need an API call to inform them to "park" their state in a manner 
>that it can be correctly restarted. This might mean recording cached 
>memory registrations (not the actual memory regions, just the 
>handles associated with them), recording the open communicators and 
>recording the communication channels open amongst processes. Some 
>MPI stacks may find it more efficient to collate this state 
>information only at checkpoint time vs. maintaining it throughout 
>job execution.
>
>I would agree though that the message quiescence is more powerful 
>for the asynchronous and/or transparent checkpointing cases.

One thing that bothers me is that such functionality can be easily 
abused. I can imagine MPI providing an API for quiescence at an 
MPI_Barrier. Given that API, though, I can just as easily see a user 
expecting it to work when processes are more loosely synchronized and 
not understanding why it fails in weird and creative ways. At the same 
time, support for such more elaborate strategies would be fairly hard 
to put into MPI itself. The same, it seems, is true of the low-level 
quiescence scenario: it only works with the sync-and-stop 
checkpointing protocol and doesn't work with anything else. This is 
fine, but it needs to be very clearly explained to potential users.
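
To make the failure mode concrete, here is a minimal toy model of the 
"no messages in flight" condition that quiescence relies on. It is a 
sketch, not MPI code: ChannelLog and its methods are hypothetical names 
invented for illustration, and the model assumes quiescence simply means 
that every message sent on a channel has also been received.

```python
from collections import Counter


class ChannelLog:
    """Hypothetical record of point-to-point traffic (not an MPI API)."""

    def __init__(self):
        self.sent = Counter()      # (src, dst) -> messages sent
        self.received = Counter()  # (src, dst) -> messages received

    def send(self, src, dst):
        self.sent[(src, dst)] += 1

    def recv(self, src, dst):
        self.received[(src, dst)] += 1

    def in_flight(self):
        """Return channels whose sent count exceeds the received count."""
        return {ch: n - self.received[ch]
                for ch, n in self.sent.items()
                if n > self.received[ch]}

    def is_quiescent(self):
        """True only when no channel has an unmatched send."""
        return not self.in_flight()


# Case 1: every send is matched before the checkpoint is requested,
# as in a sync-and-stop protocol.
synced = ChannelLog()
synced.send(0, 1)
synced.recv(0, 1)

# Case 2: loosely synchronized processes. Rank 0 has sent, but rank 1
# has not yet received when the checkpoint is requested.
loose = ChannelLog()
loose.send(0, 1)

print(synced.is_quiescent())  # no unmatched sends
print(loose.is_quiescent())   # one message still in flight
print(loose.in_flight())
```

In the second case a checkpoint taken at that moment would miss the 
message still in flight on channel (0, 1), which is the kind of 
situation a barrier-only quiescence API would not cover.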

Also, for application-level quiescence we'll need the application to 
give MPI a handle to which it can save its quiesced state. This means 
that the FT API is now trending towards an actual checkpointing 
infrastructure built into MPI. We should tread carefully here, since 
this is exactly what we said we didn't want to do when the working 
group got together.

Greg Bronevetsky
Post-Doctoral Researcher
1028 Building 451
Lawrence Livermore National Lab
(925) 424-5756
bronevetsky1 at llnl.gov