[Mpi3-ft] Mission statement and assumptions

Graham, Richard L. rlgraham at ornl.gov
Mon Aug 31 10:13:39 CDT 2009


Here is the summary of what we agreed to on the last call:

Mission Statement:
  Ensure that the MPI library remains functional in the face of failures, and help programmers build reliable applications by providing the additional hooks they need.

Assumptions:
  - Users can specify whether they want the MPI library to support MPI fault tolerance.
  - MPI should restore its internal state to a consistent state and report any problems if there is a fault.
    - This allows for fast unreliable MPI on reliable hardware.
    - MPI fault-tolerance features do not handle user data.
  - MPI 3.0 should work the same as MPI 2.2 if no fault-tolerance features are used (backwards compatible).
  - Fault-tolerance support in MPI should have minimal impact on application performance when no failures occur.
  - MPI should handle transient and fail-stop faults. We will not handle Byzantine faults.


Comments? This will be the basis for some of the discussions in Helsinki this week.

Rich



