[Mpi3-ft] system-level C/R requirements

Mike Heffner mike.heffner at librato.com
Fri Oct 24 16:24:58 CDT 2008

My original thought on this was not to put it at the
application-visible level of the MPI specification. There are going to
be several different approaches to checkpointing an MPI stack, and I
don't think they can all be represented in a single flat API. There will
be solutions that sit at the application MPI level and do message
logging or provide the ability for application quiescence; there will be
solutions that sit lower in the stack and do not worry about
communicators, MPI message types, etc., only the final sends/receives
(e.g., the MPID layer of MPICH); and there will probably be solutions
that are provided as part of the interconnect implementations themselves
(e.g., as part of libibverbs for IB).

Therefore, it may be appropriate to discuss these solutions as tiered
approaches to C/R rather than all as MPI-level API requirements. I do
think, though, that this group should be influential in designing these
requirements even if they don't exist at the application's MPI level.
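
To make the tiering idea concrete, here is a minimal sketch, written in
Python purely for illustration. Nothing in it is an MPI or MPICH API: the
names Tier, Coordinator, quiesce, resume and the three tier labels are
hypothetical, and the ordering (quiesce from the top tier down, resume
from the bottom tier up) is an assumption, not something the group has
agreed on.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Tier:
    """One layer that can take part in C/R (all names are hypothetical)."""
    name: str
    quiesce: Callable[[], None]  # stop or drain this layer's traffic
    resume: Callable[[], None]   # undo quiesce once the checkpoint is taken


class Coordinator:
    """Holds tiers ordered from the application level down to the interconnect."""

    def __init__(self) -> None:
        self._tiers: List[Tier] = []

    def register(self, tier: Tier) -> None:
        self._tiers.append(tier)

    def checkpoint(self, save_state: Callable[[], None]) -> None:
        quiesced: List[Tier] = []
        try:
            # Assumed order: upper tiers stop producing traffic first.
            for tier in self._tiers:
                tier.quiesce()
                quiesced.append(tier)
            save_state()
        finally:
            # Resume only the tiers that were quiesced, lowest tier first.
            for tier in reversed(quiesced):
                tier.resume()


def demo() -> List[str]:
    """Register three illustrative tiers and record the order of calls."""
    log: List[str] = []
    coord = Coordinator()
    for name in ("application", "device", "interconnect"):
        coord.register(Tier(
            name=name,
            quiesce=lambda n=name: log.append(f"quiesce {n}"),
            resume=lambda n=name: log.append(f"resume {n}"),
        ))
    coord.checkpoint(lambda: log.append("save"))
    return log


if __name__ == "__main__":
    print("\n".join(demo()))
```

The point of the sketch is only that each tier exposes its own hooks and
a solution at one tier need not know about communicators or message
types handled at another; a real design might need richer state than a
pair of callbacks.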

Narasimhan, Kannan wrote:
> [Changing the title to track discussion]
> Agreed -- specifying an explicit list of platforms or OS or even
> resource specifics is not the way to go in a standard.
> My suggestion would be to explore if we can define abstract,
> higher-level resources to define a "state", and specify high-level
> actions. For instance, pinning/unpinning memory is very specific to
> RDMA, but maybe a "disconnect virtual connection" operation may
> abstract it. But this puts us into the realm of virtualizing MPI
> internal components/concepts ..
> Maybe there is a more elegant way ...
> -Kannan-
> -----Original Message-----
> From: mpi3-ft-bounces at lists.mpi-forum.org
> [mailto:mpi3-ft-bounces at lists.mpi-forum.org] On Behalf Of Greg
> Bronevetsky
> Sent: Friday, October 24, 2008 12:46 PM
> To: MPI 3.0 Fault Tolerance and Dynamic Process Control working Group
> Subject: Re: [Mpi3-ft] Summary of today's meeting
>> When I look at the restore requirements on MPI as described below,
>> they seem quite extensive, including re-pinning memory and reopening
>> any previously opened communication handles.
> I agree. Furthermore, it's not just the length of the list but also
> the fact that it is very sensitive to platform-specific details. If
> we have any hope of providing this API, we'll need a good survey of
> what would be required on the full range of possible target
> platforms, ranging from BGL CNK/Catamount, to Windows/Unix to
> Symbian, even if there currently are no MPI implementations on a
> given possible platform. With the regular MPI spec we don't need to
> do anything so thorough because MPI is high-level enough that it is
> reasonable to assume that an implementation of some sort can be
> written for any platform. However, here we're talking about such
> low-level details that getting them wrong in the spec would mean that
> implementations on some platforms would actually be impossible. We
> could put in an explicit list of OSs that the quiescence API applies
> to but I don't think that'd fly with the forum.
> Greg Bronevetsky Post-Doctoral Researcher 1028 Building 451 Lawrence
> Livermore National Lab (925) 424-5756 bronevetsky1 at llnl.gov
> _______________________________________________ mpi3-ft mailing list 
> mpi3-ft at lists.mpi-forum.org 
> http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft


   Mike Heffner <mike.heffner at evergrid.com>
   Librato, Inc.
   Blacksburg, VA USA

   Voice: (540) 443-3500 #603
