[mpiwg-ft] Ticket 324 June 2015 Reading

Schulz Martin schulzm at llnl.gov
Tue Jun 2 00:06:27 CDT 2015

Hi Wesley,

Thanks for the summary - I think this describes it fairly well.

One additional comment: there was some discussion on having to deal with
situations where there is/was an associated communicator, but that
communicator has been freed before the fault happened. The question was
which communicator an Abort should be on in this case. Personally, I think
it should propagate upwards to the next communicator in the hierarchy,
worst case to COMM_WORLD, but other options exist as well. It would be
good, though, to clearly define this case.
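
To make the "propagate upwards" option concrete, here is a rough sketch in
C. It is only a toy model under my own assumptions, not MPI API and not
proposed ticket text: toy_comm, its parent and freed fields, and
abort_target() are all hypothetical names, and MPI itself does not record
a parent for a communicator. The idea is to walk up from the communicator
associated with the fault until a live one is found, with the root
(standing in for COMM_WORLD) as the worst case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a communicator; NOT an MPI type. */
typedef struct toy_comm {
    const char *name;
    struct toy_comm *parent; /* NULL only for the root (COMM_WORLD stand-in) */
    bool freed;              /* true once the application has freed it */
} toy_comm;

/*
 * Return the communicator an abort would be raised on in this model:
 * the first communicator that has not been freed, starting at the one
 * associated with the fault and moving up the hierarchy. If every
 * ancestor has been freed, the root is returned as the worst case.
 */
static const toy_comm *abort_target(const toy_comm *c)
{
    assert(c != NULL);
    while (c->freed && c->parent != NULL) {
        c = c->parent;
    }
    return c;
}
```

With a chain world <- row <- sub where only sub has been freed,
abort_target(&sub) returns row; if row has been freed as well, it returns
world.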


Martin Schulz, schulzm at llnl.gov, http://scalability.llnl.gov/
CASC @ Lawrence Livermore National Laboratory, Livermore, USA

On 6/1/15, 10:00 PM, "Bland, Wesley" <wesley.bland at intel.com> wrote:

>Notes from the ticket reading are now posted on the wiki:
>TL;DR - The reading did not "pass", but we got lots of good feedback to
>come back with a new version. We should consider splitting this into two
>or three tickets. One would define new errhandlers: one with the new
>definition (abort communicator) and one with a better-defined version of
>the old definition (abort MPI_COMM_WORLD). Another ticket will deprecate
>MPI_COMM_ERRORS_ARE_FATAL. Another ticket will consolidate the
>definitions of all of the error handling text to a single place.
>The rest of the details can be found in the wiki.
>I'll be working on some drafts over the next few days to try to get new
>versions of this ticket out for discussion. My tentative hope is to get
>this ready for a new plenary in September. There's going to be enough
>changes that this should probably get a plenary before another reading.
>Comments welcome.