[Mpi3-ft] Ticket 323 - status?

Bronis R. de Supinski bronis at llnl.gov
Thu May 31 11:41:00 CDT 2012


> I have not made myself clear here. The implementation I am talking 
> about supports the entire interface. It just does so without reporting 
> errors and leaves MPI in an undefined state after process failures, as is

I understood that was what you meant. While you may not see
providing that implementation as a "quality of implementation"
issue, others would. So many implementors would feel compelled
to provide an implementation that does more. It is the fault-free
overhead of the one that does more that is critical.


> specifically allowed by the draft of #323. This is not a "quality of 
> implementation" issue, as there are very valid and justifiable reasons 
> for an implementation to choose not to support fault tolerance, such as 
> when the target hardware is reliable. However, in such an 
> implementation, an FT application is still portable and supported, but 
> it will not survive failures (due to lack of support from the MPI 
> layer). The cost to implementors is ridiculously low (mostly 
> implementing empty stubs; the most "complex" function is agree, which 
> is a straight remap to allreduce). The cost in performance is null, zero.
