[Mpi3-ft] Mission statement and assumptions
Graham, Richard L.
rlgraham at ornl.gov
Mon Aug 31 10:13:39 CDT 2009
Here is the summary of what we agreed to on the last call:
Mission Statement:
Ensure the survivability of the functionality of the MPI library in the face of failures, and help the programmer build reliable applications by providing the additional hooks needed.
Assumptions:
- Users can specify if they want the MPI library to support MPI fault-tolerance.
- MPI should restore its internal state to a consistent state and report any problems if there is a fault.
- This allows for fast unreliable MPI on reliable hardware.
- MPI fault-tolerance features do not handle user data.
- MPI 3.0 should work the same as MPI 2.2 if no fault-tolerance features are used (backward compatible).
- MPI should have minimal impact on application performance when no failures occur.
- MPI should handle transient and fail-stop faults. We will not handle Byzantine faults.
Comments? This will be the basis for some of the discussions in Helsinki this week.
Rich