[Mpi3-ft] docs on CP/R with RDMA fabrics
jjhursey at open-mpi.org
Fri Jun 6 09:21:45 CDT 2008
I was not on the last teleconf, so this might have been covered there,
but I want to make sure I understand the latter two proposals you sent.
You are not proposing any MPI interfaces for checkpoint/restart, but
just describing how you implemented a transparent solution inside or
below an MPI implementation. Is that correct?
On Jun 5, 2008, at 5:44 PM, Mike Heffner wrote:
> [3rd try around, mailing list bounce]
> Attached are the documents I promised during the previous conference
> call.
> The IPDPS 2007 paper gives a technical description of the
> Availability Services product (known academically as "DejaVu")
> including the online logging algorithm used for BSD socket
> applications. It also includes some performance numbers from
> previous experiments with our RDMA MVAPICH implementation.
> The avs_mpi_integration.pdf document provides a brief description of
> the interfaces that were required to integrate a user-level CP/R
> framework with MVAPICH/RDMA. It gives insight into what we required
> to integrate with a real-world RDMA MPI stack and provide CP/R with
> very little overhead.
> The third document is one I wrote this afternoon to propose an
> asynchronous quiescence interface. It is similar to Joshua's
> proposal on the wiki, but provides an asynchronous, driver-level
> version that our solution requires for application transparency.
> I will try to get some of these onto the wiki as well.
> Mike Heffner <mike.heffner at evergrid.com>
> EverGrid Software
> Blacksburg, VA USA
> Voice: (540) 443-3500 x603
> mpi3-ft mailing list
> mpi3-ft at lists.mpi-forum.org