[Mpi3-ft] Communicator Virtualization as a step forward
Greg Bronevetsky
bronevetsky1 at llnl.gov
Fri Feb 13 10:50:55 CST 2009
>FT must be optional to be accepted. One way is to put it into a
>subset, say, thru the MPI_INIT_ASSERTED proposal. Another is to add
>yet one MPI_INIT call that will contain a flag for FT configuration
>(like MPI_FT_SYNCHRONOUS, MPI_FT_ASYNCHRONOUS, etc.). This was
>mentioned in relation to the issue notification in the earlier fault
>handling=error handling proposal of mine, see
>https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/Fault%20Handling.
I agree that FT must be optional but I don't think that we need to
add anything to the proposal to make this happen. The proposal
provides an API that allows the MPI implementation to tell the
application about detected but recoverable failures and help it
perform recovery. It does not say anything about which failures must
be recoverable for MPI. Reliable MPI implementations will be able to
do much more than unreliable MPI implementations. Users who need
reliability will choose the former while others will choose the
latter. The same applies to conditions such as network degradation.
Since the spec will never say which types of physical events
must be reportable by MPI, individual implementations will be able to
trade off efficiency against the usefulness of system monitoring, and
all such choices will comply with the spec.
Greg Bronevetsky
Post-Doctoral Researcher
1028 Building 451
Lawrence Livermore National Lab
(925) 424-5756
bronevetsky1 at llnl.gov