[Mpi3-ft] MPI Fault Tolerance scenarios
Greg Bronevetsky
bronevetsky1 at llnl.gov
Tue Mar 3 12:52:14 CST 2009
>Right. The low-level interface may be optional.
That would be an interesting choice for the API: provide several
levels of support and, for each level, an optional lower-level
API that controls recovery more finely than the default error
handlers allow. I think we'll need an explicit communicator rejoin
API even with the built-in error handlers, but at least we won't
force users to check error codes manually. Having this double stack
leaves me worried that the low-level API will simply get dropped
because the rest of the forum will not see the need for something
that complicated. However, I think we have a few very persuasive
arguments, such as the fact that checking for failures immediately
after every collective will incur a huge performance hit, whereas
giving control to the user will keep failure checking efficient.
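
To make the two levels concrete, here is a rough toy sketch in Python. It is not MPI and not a proposed interface: SimComm, collective, check_failure, rejoin, and both driver functions are hypothetical names made up for illustration. It only counts how often failure detection runs when a built-in handler checks after every collective versus when the user decides where to check.

```python
class SimComm:
    """Toy stand-in for a communicator (hypothetical, not an MPI object)."""

    def __init__(self, fail_at=None, auto_handler=None):
        self.fail_at = fail_at            # index of the collective that hits a failure
        self.auto_handler = auto_handler  # built-in handler, or None for low-level mode
        self.calls = 0
        self.pending_failure = False
        self.checks = 0                   # number of failure-detection steps performed
        self.rejoins = 0

    def collective(self):
        """Pretend collective operation; it may leave a failure pending."""
        if self.calls == self.fail_at:
            self.pending_failure = True
        self.calls += 1
        if self.auto_handler is not None:
            # High-level mode: detection runs after every collective.
            if self.check_failure():
                self.auto_handler(self)

    def check_failure(self):
        """Explicit failure check (assumed to be the costly step)."""
        self.checks += 1
        return self.pending_failure

    def rejoin(self):
        """Explicit rejoin, needed in both modes."""
        self.pending_failure = False
        self.rejoins += 1


def run_with_builtin_handler(steps, fail_at):
    # The handler still calls the explicit rejoin, but the user checks nothing.
    comm = SimComm(fail_at, auto_handler=lambda c: c.rejoin())
    for _ in range(steps):
        comm.collective()
    return comm.checks, comm.rejoins


def run_with_low_level_control(steps, fail_at, check_every):
    # The user chooses where to check, here once every `check_every` collectives.
    comm = SimComm(fail_at)
    for step in range(1, steps + 1):
        comm.collective()
        if step % check_every == 0 and comm.check_failure():
            comm.rejoin()
    return comm.checks, comm.rejoins
```

In this toy model both modes recover from the same failure, but the low-level run performs one check per batch instead of one per collective. Real costs would depend on how failure detection is actually implemented, which the sketch does not model.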
Greg Bronevetsky
Post-Doctoral Researcher
1028 Building 451
Lawrence Livermore National Lab
(925) 424-5756
bronevetsky1 at llnl.gov
http://greg.bronevetsky.com